## Advanced Methods for Primary Care Research: The Stepped Wedge Design

To briefly review the agenda for this webinar: first I will explain how to submit a question to our presenters. Then Rebecca Roper, director of the Practice-Based Research Network Initiative at the Agency for Healthcare Research and Quality (AHRQ), will introduce today's presenters. We will then hear from our presenters about their experiences with the stepped wedge design. We will have a question-and-answer session with all presenters after the final presentation. At the end of the webinar I will explain how to obtain CME credit for participation in this webinar. Please note that after today's webinar, a copy of the presentation slides will be e-mailed to all webinar participants. If at any point during this webinar you have trouble hearing our presenters, please try hanging up the phone or headset and dialing back into the webinar. Please note that none of today's presenters will discuss off-label use and/or investigational use of medications in their presentations. Dr. Dickinson received some funding from the AHRQ INSTTEPP study but is not the PI. To submit a question, you may use the GoToWebinar control panel. Type a question under the question section and hit "send," as shown in the screen shot on this slide. You may submit a question at any time throughout the presentation. During the Q&A session, as time allows, your questions will be read out loud, and our presenters will respond. I will now turn the presentation over to Ms. Roper. Thank you, Christina. So, in the past year, among the Practice-Based Research Network community, the topic of the methodologies for conducting stepped wedge designs, or even selecting the appropriateness of the design, has been raised to us, and we were on a quest to find some presenters. When I went to the fall North American Primary Care Research Group conference, I believe it was a Sunday afternoon, I had the pleasure of watching a very detailed workshop that was framed and delivered expertly by Dr. Dickinson, Dr.
Bartlett, Chris Meaney, and Dr. Kwan, and they have graciously agreed to provide a slightly modified version of that presentation today, covering the advanced methods but also the basic elements of the stepped wedge design, and considerations both in articulating what your research objective is and in matching that with the analytical plan within the stepped wedge design. For your information, after the 90-minute presentation that we will have here live, which will be recorded and made available for others in a few weeks on the PBRN website, this group of wonderful presenters is going to give a separate presentation that we will record and provide to you, one that provides examples of stepped wedge design in primary care studies. So we will look forward to their presentation today, as well as providing you copies of the PowerPoint after today's presentation and an opportunity, at your leisure, to come back and view today's presentation, as well as the supplemental material that offers more detailed examples. So, as I mentioned, there are four presenters today. We have Dr. Gillian Bartlett, who is associate professor and research graduate program director for the Department of Family Medicine at McGill University. Dr. Bartlett specializes in primary care research and knowledge translation. Her research is focused on health informatics, population health, pharmacoepidemiology, research methods and evaluation methodologies, and complex data in primary care. And I'm going to give an introduction to all four presenters and then hand it over to them. Our second presenter in sequence is Dr. Miriam Dickinson, who is in the Department of Family Medicine and the ACCORDS Center for Health Outcomes Research at the University of Colorado Denver, and senior scientist for the National Research Network of the American Academy of Family Physicians.
She brings expertise in [indiscernible] design and the application of complex analytical methodologies to the challenges associated with practice-based research and cluster randomized pragmatic trials. Our third presenter is also from Canada. Chris Meaney is a biostatistician with the Department of Family and Community Medicine at the University of Toronto. He continues to collaborate with clinicians, epidemiologists, and biostatisticians in the fields of primary care research, neurology, hepatology, and injury prevention. And our fourth presenter is Bethany Kwan, who is with the University of Colorado School of Medicine. She is a social health psychologist with research interests in health behavior change in primary care. Her career objective is to improve the quality and effectiveness of behavior change interventions in primary care, based on a platform of patient-centered outcomes research, health-care informatics, and behavior theories. And having had the pleasure of seeing the specificity and devotion that each of these presenters has, though not articulated in their brief bios, I would say that they excel in their desire and capacity to articulate to researchers such as ourselves the complexities of methodologies in such a way as to inform our thought process in designing research studies and selecting the appropriate analytical method matched to those studies, with an understanding that the findings have to be in a context that is usable and translatable to all. So with that, I will turn the presentation over to our presenters. Thank you very much. My name is Gillian Bartlett. I'll be presenting the first section with my colleague, Miriam Dickinson. I am very pleased to see so many people listening in on this webinar.
This webinar grew out of a workshop that was hosted by the Methods Working Group, one of the working groups under the Committee for the Advancement of the Science of Family Medicine. And this is a really important initiative to promote methodologies in what is undoubtedly a complex area, and that is primary care. We are specifically talking today about a design that has been used extensively in practice-based research, and most of that research within the [indiscernible] context, and likely within the American context, takes place within primary care in different family medicine settings, so I hope that this will be helpful for those of you taking on this complex and advanced type of research. So if I could have the polling question launched, and if we could move to the next slide. Gill, sorry, we're going to launch the poll, and then when the poll concludes, we'll move to the next slide. Perfect. Okay, is the polling done? Yes, so we can move to the next slide now. All right. So the question for the poll was meant to establish the audience's experience with the stepped wedge design. I see it's very nicely divided. We've got a third that have no experience, another third that have read or heard about it, roughly a fifth that are planning to use it, and a very small percentage that have used it in a study or are currently using it in one. So this is an interesting mix, and it fits very well with the educational objectives of this webinar. We are hoping that at the end of this webinar you will have an understanding of the basic design of a cluster randomized stepped wedge trial; a better understanding of how randomization works in the stepped wedge design and how enrollment and measurement are done with these designs; and my co-presenters will be going over the implications for three design variations.
We will then briefly touch on the principles of statistical analysis for the design variations, keeping in mind that one size does not fit all, and we'll go into some of those details. Then we'll be talking about power and sample size in stepped wedge designs; that's always challenging. And finally, in the summary, we will talk about how you might select a stepped wedge design, and we'll give you some of the advantages and disadvantages of the different designs and reasons you may have for considering this. So, with that, I will turn the remaining part of this section over to my colleague, who is co-chair of the Methods Working Group, Miriam Dickinson. Dr. Dickinson, are you on mute? We can't hear you. Can you hear me now? Yes. Okay, good. It took a moment to get that right. Okay. So in this presentation we're focusing on the cluster randomized stepped wedge design, as opposed to an individually randomized stepped wedge design, which also exists. This design is a variation of the crossover design, in which clusters cross over from the control phase of the study to the intervention phase. In PBRN studies, clusters are generally primary care practices, so I'll use the two terms interchangeably from here on. In the pictures below, you can see three common designs. The first is the parallel group cluster randomized trial, which we use a lot in PBRN research. In this approach, practices are randomized to, generally, a control or an intervention group, and either they receive the intervention or they don't. The second is a traditional crossover design, in which practices are randomized to an initial condition, usually control or intervention again, and then they cross over at a specified time period.
Now, in the stepped wedge cluster randomized design, all clusters start out in the control phase, which you can see from all the little zeros at time point one, and at each time point, or step, one or more of the clusters crosses over to the intervention phase, and once they're in the intervention phase, they don't cross back. Okay, next slide, please. All right. In the stepped wedge design, practices are not actually randomized to a study group as they are in most of our designs; they're randomized to an order. So we usually use something like a random number generator, which you can find in Excel or most common statistical packages, to assign practices to an order. The investigator then has to decide on a number of steps, usually based on feasibility and study constraints, and a designated number of practices will cross over to the intervention phase at each step. This order determines when, not if, the practice receives the intervention. That's an important feature of the stepped wedge design, and what distinguishes it from many other designs. By the last time block, all practices will be in the intervention phase. And just one thing to mention: it's difficult to use blinding in this kind of approach. Next slide, please. Now we're going to talk about enrollment and measurement. Traditionally all clusters are recruited and enrolled at baseline and followed for the entire duration of the study, but there is an alternative approach possible when retrospective data are available. We'll give examples of both. Outcomes are generally measured for every time block, or cell, for every cluster, or in this context, for every practice. Usually the practices participate in the study for the entire time period, and this can sometimes be a bit longer than a traditional parallel group cluster randomized trial. In the next few slides we'll highlight and discuss distinctions between three key variations on this design. Next slide, please.
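The randomize-to-an-order step just described can be sketched in a few lines of Python. This is an illustrative sketch, not any study's actual code: the practice labels and the one-practice-per-step assumption are mine.

```python
import random

def stepped_wedge_schedule(practices, seed=None):
    """Randomize practices to a crossover ORDER (not a group) and build
    the schedule: one row per practice in randomized order, one column
    per time block; 0 = control phase, 1 = intervention phase."""
    rng = random.Random(seed)
    order = practices[:]
    rng.shuffle(order)                       # the random crossover order
    n_blocks = len(order) + 1                # I clusters -> I + 1 time blocks
    schedule = {}
    for rank, practice in enumerate(order):  # rank 0 crosses over first
        # control (0) through its own step, intervention (1) thereafter
        schedule[practice] = [0] * (rank + 1) + [1] * (n_blocks - rank - 1)
    return order, schedule

order, schedule = stepped_wedge_schedule(["A", "B", "C", "D"], seed=42)
for p in order:
    print(p, schedule[p])
```

Every row starts at 0 and, once it switches to 1, never switches back, which is exactly the "when, not if" property of the design.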
Okay, the first design that we'll look at, which we're arbitrarily calling design A, is usually referred to as a repeated cross-sectional design. In this design, clusters cross over, but individuals are designated as either control or intervention depending on when they're enrolled in the study. Basically, individuals enrolled during the control phase for their cluster are control subjects, and individuals enrolled during the intervention phase are intervention subjects. The key point to keep in mind is that the control and intervention groups consist of different people. And time in study is usually about the same for all individuals, regardless of when they were enrolled. They may participate in the study for only a very short period of time, sometimes just a single observation, sometimes with a very short follow-up period; in fact, in this design, if the follow-up period goes on too long, contamination can be a problem. Okay, next slide, please. Okay, good. In this design, which we arbitrarily call design B, a cohort of individuals is identified at baseline and followed throughout the entire study. Because of — I think we're one slide too far into this. Yes, there we go. Okay, that's perfect. The distinguishing feature here is that the same individuals are in the control and intervention phases: the clusters cross over, and when the clusters cross over, the individuals also cross over. So individuals have both control and intervention conditions. This is usually referred to as a cohort design. One of the features, and one of the difficulties sometimes, is that you have to be able to identify, track, and measure these individuals over a longer period of time, and we often use things like repeated surveys or some sort of direct measurement over time. Sometimes longitudinal data from EHRs can work for that. Now we're ready for the next slide. This next one is really just a variation on design B.
In this situation there's often a larger unit of randomization, such as a geographic region, as in the example we'll use in a little while, and regions cross over from control to intervention, again based on randomization order. But the distinguishing and unique feature here is that, because existing data are available, say from an EHR or some sort of health record, to ascertain outcomes, practices can be recruited just prior to implementation of the intervention within their region rather than at baseline. So that's an interesting twist on this cohort design. And, again, the example we use here is a cohort design. The same individuals are in the control and intervention periods, and individuals, as well as clusters, are followed throughout the entire study period, but they're followed using electronic health records rather than having to actually involve them in the study. Okay, next slide, please. This example is a study called the INSTTEPP study, Implementing Networks' Self-Management Tools Through Engaging Patients and Practices. The overall goal of the study was to implement the AHRQ Self-Management Support Library toolkit, and we had four participating PBRNs. The mechanism was to use something called a boot camp translation in a stepped wedge design, and the boot camp translation process resulted in tailoring the intervention for the different PBRNs. And then at the end, the idea was to evaluate the impact of the intervention on patients and practice staff engaged in chronic care management. The setting was four PBRNs, and each network contributed four practices to the study. At the beginning, at baseline, prior to any work at all, the networks were randomized to an intervention initiation time, or an order. Next slide, please. So here you see something like what we saw earlier, and this is just the typical design, where at time block one all the PBRNs, and all the practices within them, are in the control phase.
At time block two, the first one switches over to intervention and the rest are in control. At time block three, we have two in the intervention phase, and so on, until the last time block, when all practices are in the intervention phase. Next slide, please. Now, the target population here is patients ages 18 to 70 with chronic illness, and these are patients who are being targeted for care management support. So how does this study look for patients? What we did here was, during each time block, recruit and enroll 16 patients from each PBRN, which was about four per practice. Each patient at that time completed a baseline, a one-month, and a two-month assessment. So the patient follow-up is fairly short, only about eight weeks, and that's really important for this repeated cross-sectional design. And once that follow-up is completed, the patients are finished; they're not involved in the study anymore. The primary outcome is the Patient Activation Measure, or PAM, and we were expecting this change to occur fairly quickly, so that's an important condition of this design. The patients are designated either as control or intervention patients depending on whether the practice was in the control phase or in the intervention phase at the time the patient was enrolled. Okay, next slide. This slide shows the recruitment goals: the patients enrolled during the control phase receive usual care, and the patients enrolled during the intervention phase receive the intervention. The intervention patients are in red in the table below, and something that's really important in this design is that if you don't meet recruitment goals, early on or later on, then you won't have the right numbers of control and intervention patients, so it's really important to keep practices on target with their recruitment goals.
What we wanted to know here was whether improvement over that two-month assessment period was greater for the intervention patients than for the control patients. So the intervention effect in this study is a between-patient effect. Next slide, please. Now here's the statistical model that I used for this particular study. I tend to like to write these as multi-level models, because it's clearer exactly where the effects are happening. Within patients we have three observations, and that's just modeled as a function of time. In the level-two model, first we have the intercept as a function of the baseline values, the intervention effect, which is just the difference between intervention and control patients at baseline, and a temporal trend effect; that beta-02j times month term allows us to model temporal trends. Now, what's really of interest here is the slope, and you'll see that in the second line of the level-two model, where beta-10j is the slope for control patients and beta-11j is the slope for intervention patients. And then we carried that on into the practice-level model. If you'll go to the next slide — next slide, please, yes — I have a picture of what that looks like, or at least what the hypothesized relationships look like. The blue line on the left, labeled control zero, would be representative of the control patients during that very first time block. So they may improve a little bit; maybe, maybe not, who knows. But we definitely allow for the possibility that they're going to improve a little bit over time in their PAM scores. If you go over to control one and intervention one, at time block one, there's a green line and a red line. The red line on the bottom represents the trajectory for control patients in time block one. So that's that first PBRN.
Actually, the first PBRN has already switched over to the intervention phase, which is the green line, and the red line is the rest of the PBRNs. So you see there's a different slope: there's greater improvement shown in the green line, which is the intervention patients, compared to the red line, which is the control patients. The same thing happens at the next time block. There's a purple line and another blue line; the blue line is all the intervention patients in that time block, and the purple line is the control patients. And, again, greater improvement in the intervention patients. And finally (I didn't take them all out to every time block), at that very last one, which is the orange line, we only have intervention patients, because all of the PBRNs have crossed over into the intervention phase. Now there's one more thing that's kind of interesting here. If you'll notice, the starting point for each one of those time blocks is a little bit higher every time. That's the temporal trend effect. Okay, next slide, please. Now we're going to look at design B, the cohort design. In the cohort design, in that same study, a cohort of clinicians and staff involved in care for patients with chronic illness was recruited at baseline and followed throughout the entire period of the study. The way we managed this was, during each time block, we asked each clinician and staff member to complete a survey and turn it in, and what we expected was that after the boot camp translation and the implementation of the intervention, attitudes toward patient chronic-care self-management would improve. We measured that with the CS-PAM. So the basic design is that clinicians and staff are in the control condition as long as the practice is in the control phase, and once the practice crosses over to the intervention condition, the clinicians and staff also cross over into the intervention condition.
And contamination is less of an issue here. So let's go to the next slide and see what that looks like. The outcome, as I mentioned, was the CS-PAM, the Clinician Support for Patient Activation Measure. Each individual has both control and intervention periods. We hoped to recruit at least 20 clinicians and staff from each PBRN and follow them throughout the study. The blue represents the control phase, and the red represents the intervention phase for clinicians and staff; okay? Now let's go to the next picture. For the statistical analysis, we can use generalized linear mixed models again, parameterized a little bit differently. One key difference from the design A model is that the intervention term is a within-individual effect; it's a time-varying covariate. I've shown one illustration of this model below. There are some other possibilities, but in this one (and we've seen this happen in some other studies) the idea is that as soon as the intervention is implemented, and in this case that's the boot camp translation, there is an immediate effect, and that's depicted by the vertical line you see at the crossover point for PBRN one. The trajectory stays the same, but there's this little bump up at the time the practice crosses over and the intervention is implemented. And then you see the same thing depicted for PBRN two. There are other possibilities: you could have a change in slope, or some combination of the two. And generally what we do is test a couple of those different possibilities and then use goodness-of-fit criteria to determine which is the best model; okay? Next slide, please. Now this next study is an interesting one. It's really the variation on design B, and it's a Canadian study.
It was called Improved Delivery of Cardiovascular Care Through Outreach Facilitation, and the idea was that practice outreach facilitators would work with primary care practices to optimize cardiovascular disease prevention and management in high-risk patients. The unique feature here is that they randomized geographic regions in Canada to one of three intervention initiation times; there were three regions. There were actually three phases to the study, which made it a little bit more complex: a baseline phase, an intensive intervention phase, and then a sustainability phase. If you'll go on to the next slide, here is a picture of the study design as depicted; you'll have to go to the article to get more detail on it. Okay, next slide. So what does this mean for enrollment and measurement? The practices in each region were actually recruited prior to implementation of the intervention, not at baseline. So, in fact, they didn't have to participate from the very beginning and go all the way through to the end. And the reason that was possible was that there was retrospective data collection, so they could get outcome variable measurement from health records, from EHR data, rather than actually having to involve the practices. Another feature, which is a little unusual, is that measurement didn't extend for the full five years. For data collection, their primary outcome was a quality-of-care composite score, and they accomplished this by doing repeated chart audits on an identified cohort, a randomly selected patient sample. And, again, there's additional detail in the article if you're interested in pursuing that. Now I'm going to turn this over to my colleague, Chris Meaney, who will talk about some other analytic issues and analytic approaches, and also about power and sample size. Chris. Thank you very much, Miriam.
Just before beginning, I would like to thank AHRQ for inviting me to speak to you all today, and thank all the participants for attending the webinar as well. So, as Miriam alluded to, the purpose of the next six slides is to introduce and discuss a little more concretely a popular statistical model for stepped wedge designs. In particular, I'm going to speak to the model outlined in Michael Hussey and James Hughes' 2007 Contemporary Clinical Trials article, entitled "Design and Analysis of Stepped Wedge Cluster Randomized Trials." That's one of the seminal articles on methods for stepped wedge designs, so that's the reason I'm going to speak to it today. And just to be a little more concrete, the model that I'll discuss today is one specific variant, and the power and sample size considerations are applicable to this specific variant as well. More specifically, it's the cross-sectional complete variant, where, essentially, patients provide a single measure at each time point j from each cluster i. So there are no repeated measures on patients in this section of the talk; it's not the cohort model in any sense. Just building upon what Miriam introduced earlier, in stepped wedge designs all of the clusters will eventually receive the intervention, but it's the timing of the receipt of the intervention that differs from cluster to cluster. The clusters are randomly allocated to sequences, or orders in which they receive the intervention, and if you have I clusters (to make it concrete, say, four clusters), then you'll have I plus one, or five, time points, where at the first time point everybody starts off under control, and then as time progresses, each cluster eventually switches to the intervention, until the last time point, when all clusters are in the intervention arm of the trial. So, what are some of the statistical issues in these kinds of designs?
Basically, the main statistical issue in this type of design is that you don't have independent data. Even in the cross-sectional variant, you still have patients nested within clusters, and the implication is that people in the same cluster will have responses that are more similar than would be expected under an independent sampling model. The cohort variant, which Miriam alluded to earlier, is even more complex: there would be a second level of hierarchy, with repeated measures nested within patients, who themselves are nested within clusters. I'm not going to discuss power in relation to the cohort variant; I'm going to stick to power and sample size for the cross-sectional variant. So now, discussing what we see here on the slide, we have a model for a response Y_ijk, which is essentially modeling some response for person k at time j from cluster i. We're basically going to say that this response is a function of a number of parameters, and those parameters are on the right-hand side of the equals sign. The parameters in the Hussey and Hughes model are: mu, which is a grand mean or intercept; alpha_i, which is a random cluster effect; and beta_j, which is a vector of fixed time effects, so if you have T time points, there will be T minus one elements in that vector. It's just simple dummy coding, setting a reference time point to zero and making your inferences about time in relation to that reference. Then there is the term X_ij theta, where X_ij is basically an indicator of zeros and ones: the element is equal to one if the person is in the intervention condition at time j in cluster i, and zero otherwise.
And theta basically denotes the estimated treatment effect from this model, and that tends to be the parameter most people are interested in making inferences about. Lastly, we include an epsilon_ijk term, which is just your basic residual noise, reflecting the fact that the data points aren't actually going to lie completely on the plane specified by the model. So that's the model we're talking about. It's simple, to a certain extent, and it can be expanded. Hussey and Hughes suggest a couple of ways to expand the model. One would be to include a cluster-by-time interaction. Another would be to include a cluster-by-treatment interaction; that would capture whether or not the treatment effect varies by cluster. Another possible extension might be a treatment-by-time interaction, which would specify whether the treatment effect varies by time. You could also include terms that might capture lags or delays in the treatment effect, if those could be expected, as well as terms to capture possible seasonal effects if your response has some kind of seasonal dependence in it. So, again, this model is generally for a complete cross-sectional design. The statistical and power considerations would be a bit different for a cohort design; there's been a recently published master's thesis from the University of Pittsburgh on how you would deal with power and sample size for the cohort design. As well, there are a number of extensions recently published in 2015 in Statistics in Medicine that deal with other subtle extensions of the design; for example, incomplete designs, where observations aren't measured on every cluster or at every time point. So you can extend this model, but this is probably the most popular model in the literature to date for modeling responses from a stepped wedge design, and that's the one that we'll discuss further.
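To make the basic model concrete, here is a small simulation from it: Y_ijk = mu + alpha_i + beta_j + X_ij * theta + epsilon_ijk. All the numbers (mu, theta, tau, sigma, the cluster and sample counts) are arbitrary values chosen for illustration, and the time effects beta_j are held flat for simplicity.

```python
import random

def simulate_sw(mu=50.0, theta=5.0, tau=2.0, sigma=8.0,
                n_clusters=4, n_per_cell=10, seed=0):
    """Simulate the cross-sectional Hussey & Hughes model:
    Y_ijk = mu + alpha_i + beta_j + X_ij * theta + eps_ijk,
    with one cluster crossing over to intervention at each step."""
    rng = random.Random(seed)
    n_steps = n_clusters + 1                 # I clusters -> I + 1 time points
    beta = [0.0] * n_steps                   # flat time effects, for simplicity
    alpha = [rng.gauss(0, tau) for _ in range(n_clusters)]  # cluster effects
    rows = []
    for i in range(n_clusters):              # cluster i treated from step i + 1
        for j in range(n_steps):
            x_ij = 1 if j > i else 0
            for k in range(n_per_cell):
                y = (mu + alpha[i] + beta[j]
                     + x_ij * theta + rng.gauss(0, sigma))
                rows.append((i, j, x_ij, y))
    return rows

rows = simulate_sw()
print(len(rows), "observations,",
      sum(1 for (_, _, x, _) in rows if x == 1), "under intervention")
```

A naive treated-minus-control mean difference on these data would be confounded with the cluster and time effects, which is exactly why the mixed model, rather than a simple two-group comparison, is used for estimation.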
So if we could just advance to the next slide. This slide has a little bit of notation, and basically what it does is specify more concretely the additional assumptions embedded in the model discussed by Hussey and Hughes. Essentially, the model on the previous slide is a linear mixed model if your response is continuous and roughly Gaussian; if it's some other response, like binary or count data, it would be a generalized linear mixed model. What that essentially means is that your model includes both fixed-effect terms and random-effect terms. Your fixed-effect terms would be things like your intercept, your fixed time effects, and your estimate of the treatment effect. And there are also random effects included in the model; for example, the random cluster effects and the residual variation. So what are some of the assumptions embedded in this model? One assumption we make is that the random cluster effects alpha_i are normally distributed with variance tau squared and mean zero. The residual noise terms, the epsilon_ijk terms, are assumed to be normally distributed as well, also with mean zero, and in this case with variance sigma squared. We further assume that the random effects are independent of the residual error. All of these assumptions are necessary, essentially, for estimation of the other model parameters, and if we make them, we can then derive things like variances that are useful for constructing test statistics. Under these assumptions, the variance of the response is equal to the sum of the cluster-effect variance plus the residual variance, so the variance of the response equals tau squared plus sigma squared. And we can also define an intra-cluster correlation coefficient, which I have denoted here rho, which is basically the proportion of the total variation that can be explained by the cluster-level effects.
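Written out in symbols, the assumptions just described are:

```latex
\alpha_i \sim N(0, \tau^2), \qquad
\varepsilon_{ijk} \sim N(0, \sigma^2), \qquad
\operatorname{Cov}(\alpha_i, \varepsilon_{ijk}) = 0,
\quad\text{hence}\quad
\operatorname{Var}(Y_{ijk}) = \tau^2 + \sigma^2, \qquad
\rho = \frac{\tau^2}{\tau^2 + \sigma^2}.
```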
So if we can just switch ahead to the next slide. So when we're doing a stepped wedge design, it's not all that different from any other kind of trial. Usually we have a treatment effect, and our chief target of interest is whether or not the treatment is effective. So under the parameterization of Hussey and Hughes' model, that basically amounts to testing whether theta is equal to zero, the null hypothesis, versus theta not equal to zero, the alternative hypothesis. So, one way to go about doing that when you're using a regression-based approach, which we're going to propose you do, is to use what's called a Wald statistic for inference, which is essentially just the ratio of the estimated treatment effect over its estimated standard error. That Wald statistic, if you have a large-ish trial, is asymptotically normally distributed, and it becomes a pretty nice and convenient way of making inferences about whether or not that treatment effect, in fact, is zero. So if we just go to the next slide, one thing that is a little subtle in what I just said is that one has to assure that the trial is large enough; the Wald test might not be entirely optimal if you don't have a large trial, and in stepped wedge designs that can mean a lot of things. In general, a large trial here could mean large in terms of the number of clusters recruited, large in terms of the number of time epochs at which you measure responses on people, or large in terms of the number of individuals sampled from a given cluster at a given time point. So, ideally, for this type of inferential machinery to work well, you want some kind of a large trial. And in Hussey and Hughes' article, they give some very nice graphs as to how power varies as a function of some of those input parameters.
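The Wald test just described fits in a few lines; the estimate and standard error below are hypothetical numbers, not results from any study.

```python
import math

# Wald inference: z = theta_hat / SE(theta_hat), compared to a standard
# normal distribution in large samples.

def wald_test(theta_hat, se):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = theta_hat / se
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

z, p = wald_test(theta_hat=0.50, se=0.20)   # hypothetical values; z = 2.5
print(round(z, 2), round(p, 4))
```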
So here we're basically just saying that our test statistic follows a normal limiting distribution, and one of the challenging aspects of inference for these types of models is estimation of the variance of the response, and that nasty equation at the bottom of the slide was derived in the Hussey and Hughes article, and it relates to the power of the study. But I won't say much more about the mathematical nature of it; we can switch to the next slide and I can talk a little bit more about the intuitive nature of the equation. So, power in a stepped wedge design: what influences it? There's a number of factors, really, that influence power. One is the strength of the treatment effect. Intuitively, if you envision your study having a large effect, it will have more power than a similar study with a smaller treatment effect, and that's pretty intuitive. As well, the next three bullets basically speak to the size of the study, and again, size can mean a number of things in a stepped wedge design. It can relate to the number of clusters. It can relate to the number of time steps. It can relate to the number of participants recruited per cluster per time step. So essentially, as you jack any of those parameters up, you're increasing the amount of information in your sample, which is going to increase the power. The last thing that one might consider, with respect to power calculation for a stepped wedge design, is the magnitude of the variance components. That relates to the magnitude of the tau squared term and the magnitude of the sigma squared term. And the way I think about those is essentially like this: sigma squared, the residual error, essentially relates to noise.
So as you increase the sigma-squared term, you're essentially increasing the variance of your response, and an increase in the variance of the response is going to essentially decrease the magnitude of your test statistic and decrease your power. And then the other term is the tau-squared term, which is the variance of your cluster-level effect term. The way I think about that is, again, related to information. The extent to which that rises relates to the extent to which people within the same cluster look more and more alike. And as people from the same cluster look more and more alike, you are actually decreasing the amount of effective individual information that you've been able to recruit in your sample, so as that term increases, power will decrease in your study. Another interesting feature that Hussey and Hughes investigate is that of lag time in treatment effects. In general, if you expect a long lag between the application of your intervention and when the treatment effect is realized, you're going to have less power compared to if that treatment effect is realized immediately after its introduction. So, really, power in a stepped wedge design is a complicated matter. As this slide points out, power is really a function of a number of parameters, some of which investigators should be able to get at, given their specific study. For example, resources, funding, timelines, size of the community, et cetera, may impose certain constraints on the number of clusters, the number of time points, and the number of individuals that one could recruit. So those aren't necessarily entirely free parameters in estimating sample size. If you're able to specify a small number of points on a discrete grid, you should be able to get at certain estimates of how power varies as a function of those inputs.
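Putting those pieces together, here is a sketch of a power calculation for a complete cross-sectional design using the closed-form variance of the treatment effect from Hussey and Hughes, as I understand it; the formula and all example inputs should be checked against the original article before use, and the numbers below are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def hh_power(theta, tau_sq, sigma_e_sq, n_per, clusters_per_seq, n_seq):
    """Approximate power (two-sided alpha = 0.05) for a complete stepped
    wedge design, via the Hussey-Hughes closed-form variance of theta-hat."""
    I = clusters_per_seq * n_seq        # total clusters
    T = n_seq + 1                       # time points, all-control baseline
    s2 = sigma_e_sq / n_per             # variance of a cluster-period mean
    # complete-design treatment indicators, one row per cluster
    X = [[1 if t > s else 0 for t in range(T)]
         for s in range(n_seq) for _ in range(clusters_per_seq)]
    U = sum(sum(row) for row in X)
    W = sum(sum(X[i][t] for i in range(I)) ** 2 for t in range(T))
    V = sum(sum(row) ** 2 for row in X)
    var_theta = (I * s2 * (s2 + T * tau_sq)) / (
        (I * U - W) * s2
        + (U ** 2 + I * T * U - T * W - I * V) * tau_sq)
    z_crit = 1.959963984540054          # z for two-sided alpha = 0.05
    return norm_cdf(abs(theta) / math.sqrt(var_theta) - z_crit)

# sweep power over a small grid of plausible cluster-level variances
for tau_sq in (0.01, 0.04, 0.09):
    p = hh_power(theta=0.3, tau_sq=tau_sq, sigma_e_sq=1.0,
                 n_per=20, clusters_per_seq=2, n_seq=4)
    print(f"tau^2={tau_sq:.2f} -> power={p:.3f}")
```

In practice one would sweep sigma squared, the number of clusters, and the number of steps in the same way, which is essentially the grid-based sensitivity approach the presenters recommend.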
As well, investigators should have some idea of the effect size that one would expect given the intervention they plan to introduce, so that should not be too hard to get at. What tend to be the hardest parameters to estimate are the variance components for the residual errors and the variance components for the cluster-level effects. When I'm doing sample size calculations for these sorts of things, I usually liaise with my clinical colleagues and try to get them to specify a range of plausible values for each of these input parameters, and then essentially just calculate power as a function of all possible combinations of these points, and then plot out how power varies as a function of these input parameters. And, in fact, that's what Hussey and Hughes did very well in their seminal article. So if we could just switch to the next slide, I'll discuss some of the analytic options for a stepped wedge design. So the model that we just discussed is really a linear mixed model, and if you have non-normal outcomes you extend that to a generalized linear mixed model. Another general approach is a generalized estimating equation, or GEE, model. In general, these can be described as regression-based approaches for driving inference about treatment effects. The models are slightly different; there are subtle distinctions between the GEE approach to inference and the GLMM approach to inference. Really, in GEE, the correlation between individuals in a cluster is viewed as a nuisance, and we essentially estimate marginal effects of treatment in the population after acknowledging the correlation structure as a nuisance and estimating it separately from the treatment effects.
Conversely, under the generalized linear mixed models approach that Miriam introduced earlier, the correlation may be viewed as more interesting, and you may want to estimate treatment effects conditional on the random effects included in the model. These are probably the two most popular, and the two most modern, approaches to analyzing stepped wedge designs. Neither is really right or wrong, and the choice between GEE and GLMM really depends on the type of inferences study investigators want to make about treatment effects from their research project. One thing to consider about the GEE approach, though, is that GEE is typically used when there is a single level of clustering. So it would apply well to the design that Hussey and Hughes specified, where you have individuals within a given cluster, and that's the dependency structure that's imposed. If you have a more complicated design, like the cohort design, where you have repeated measures nested within individuals and individuals nested within clusters, so that you have multiple layers of dependency, GEE can be used, but the availability of software solutions for GEE under these kinds of multilevel designs is a little limited. So, why do we need to use GEE and GLMM? Essentially because the data from a stepped wedge design are clustered. For more complex designs with higher levels of clustering, it's even more important that you use the proper methodologies, something like GEE or GLMM, to properly account for the non-independent response data. That said, when you're reading through the literature, you might encounter other options for analyzing this type of stepped wedge data. A simpler alternative might be to adjust estimates of treatment effects by some estimate of the design effect from your study.
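To unpack the design-effect idea just mentioned: for a simple parallel cluster design, the classic inflation factor is 1 + (m - 1) * rho. This sketch uses made-up numbers, and it is only the parallel-design version of the idea, not a stepped wedge formula.

```python
# Classic design effect for cluster sampling: clustering inflates the
# variance of an estimate (equivalently, deflates the effective sample
# size) relative to simple random sampling.

def design_effect(m, rho):
    """m = average cluster size, rho = intra-cluster correlation."""
    return 1.0 + (m - 1.0) * rho

def effective_n(total_n, m, rho):
    """Effective number of independent observations after clustering."""
    return total_n / design_effect(m, rho)

de = design_effect(m=20, rho=0.05)        # 1 + 19 * 0.05 = 1.95
print(de, effective_n(400, 20, 0.05))     # 400 subjects act like ~205
```

A stepped wedge analysis would not use this directly; it is shown only to illustrate the term.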
This was once very popular, but as methods like GEE and GLMM have become more broadly available to the research community, it is being used to a lesser extent. Another analytic alternative is to collapse the data from individuals to a single observation per cluster and time point. That is also a little less popular. It's less powerful, because you essentially lose information by collapsing individual data to cluster-level data. As well, it negates the possibility of adjusting for individual-level factors in models. Another approach would be to estimate treatment effects with robust standard error approaches, something like Huber-White adjusted standard errors. A challenge with that is that one needs large cluster sizes in order to achieve optimal performance. So, from my perspective, GEE and GLMM are becoming more popular and probably will continue to become more popular research methods in the years to come, and those would be what I would recommend for analyzing stepped wedge trial data. So, we basically introduced the most popular statistical model for inference in stepped wedge designs that you'll probably encounter in the literature; that's the model of Hussey and Hughes. The main statistical issue relates to correlated response data. The model is relatively simple, but as Hussey and Hughes demonstrated, it can be extended in multiple ways. And, in general, I would say the literature on the stepped wedge design is really in its infancy. Anecdotally, I see many methodological papers published each month or year, and clearly, stepped wedge designs have a [indiscernible] utility, and that's why methodological thought is increasing in that area. So thank you, and I'll pass the floor over to Bethany. I think before we move on to Bethany, we just want to launch two quick polls from Dr. Dickinson's presentation. So, Dr. Dickinson, would you like to introduce those polls? Yes. Can you put them up on the screen?
Okay, so this is really oriented towards the design. So the first one: if you have a practice-based intervention to enhance care coordination for patients with type 2 diabetes to improve hemoglobin A1C over time, which design would you choose? Would it be repeated cross-sectional or cohort? So please vote. Think about what we talked about in that section. Wow, the poll is up. Cohort won. Yeah, I mean, that's what I would tend to choose, and the reason is that it takes longer to observe the effect, generally. Now, you know, you could argue that, and look at the timeframe and everything, and perhaps make the other one work. But the time element is what's important here, and the amount of time required to observe an intervention effect. Let's go to the next one. Okay. Now this is a slightly different kind of problem. To study a pre-visit parent education intervention on initial HPV vaccination uptake in teens at their first eligible visit, which design would you choose? Okay, this one is very clear. There doesn't seem to be much doubt about that. Repeated cross-sectional is what's commonly used for these kinds of outcomes. Generally it's just a single time point at that visit: did they get vaccinated or not. But these help you think about the implications of study design and what your outcome is, what you're trying to achieve, and how long it will take you to achieve it. Okay? Shall we pass over to Bethany? Yes, hello. Okay, thank you all for coming today, and I hope you're still with us and paying attention. I'm going to talk to you about how you go about making the decision to use a stepped wedge design. How do you know that this is something that is a good idea for your research question? So typically people use the stepped wedge design to evaluate therapies, treatments, or interventions when withholding that intervention from some participants, you know, from a control group, would not be acceptable or really feasible.
So we often see this in working with practices and communities: they say, we really think this intervention is going to do some good, and we don't think that we can really hold it back; it's not acceptable to us. And so a stepped wedge design is one of several designs that you could use in that particular circumstance, because all of the clusters do eventually get the intervention. Now, we know that in design A there are still individual people who do not get the intervention, but their practices or communities do ultimately get that intervention. Also, typically, the stepped wedge design is used to examine effectiveness or impact in real-world settings at the population level. No, go back, we're not there yet. There you go. Thanks. It's not an efficacy trial, so this is not what you would use if you are still trying to establish whether the intervention is efficacious at the individual level. So specifically, if your intervention has been shown to be effective in a more controlled research setting and now you're ready for a larger-scale pragmatic, dissemination/implementation type of trial, a stepped wedge would be a good option. Also, relatedly, if there's a lack of definitive evidence of effectiveness but really a belief that the intervention will do more good than harm. A couple of examples of that might be the use of care management or care coordination. You wouldn't necessarily anticipate that providing a person to do care coordination would be harmful in any way, so that's just an example of that. Okay, now next slide. Okay. So what are some of the specific motivations that people have mentioned in the literature for selecting the stepped wedge design? So given that they're doing this effectiveness or impact trial, why stepped wedge among other sorts of pragmatic trial designs?
And the motivations really fall into two categories: practical considerations and ethical considerations. So, from a practical standpoint, let's say you really need to do your implementation and your randomization at the cluster level, so at the practice level, the hospital level, the community level. If your intervention is something that changes the way that care is delivered, it doesn't make sense to try to change care delivery for individual patients within a practice and require the practice to deliver care in two different ways. So, really, the whole practice needs to switch over to the intervention. Another practical consideration is if all clusters must or will receive the intervention. Sometimes this is really outside of your control. Sometimes a community or a state or a large organization has decided that they are going to implement an intervention one way or the other, and they will work with you to randomize the time at which that happens so that you can utilize this particular design, but it is not acceptable to them that some clusters would never get the intervention. This can really help increase the acceptability to the community of using a randomized design. So, as we noted, these clusters are randomized to when they will get the intervention, not if, and sometimes that's enough to get the community or your clusters to agree to even that much randomization. Another practical consideration is when you really need phased or sequential implementation. A lot of our research teams or implementation teams are just a few people, and unfortunately we cannot be in many places all at once, as much as we try, so you just can't really roll out the intervention simultaneously across large groups of practices.
So in a cluster RCT where you have 20 practices, and everybody is due to get either the intervention or the control right at the beginning, you really can't roll out an intervention, in some cases, in all ten intervention practices at once, especially when it's a much more complex intervention like what Miriam was describing for INSTTEPP. Another reason that a phased or sequential implementation may be desirable is that between steps you can do sort of quality improvement of the intervention or its delivery before the next implementation phase. So if you realize in an earlier wedge, an earlier step, that things are not working terribly well and you want to try to improve some of the delivery of the intervention, you have an opportunity to do that. So next page, please. Okay. So, as I mentioned, there are also some ethical considerations for selecting a stepped wedge. This is consistent with the reason why people tend to use this design. So number one, the intervention is really believed to do more good than harm; that is, the clinical equipoise is really minimal. And equipoise is this idea that as long as there is genuine uncertainty about the most beneficial treatment, there is really no ethical imperative to provide one particular intervention to everyone. But in this case, we've said we really think that the intervention will do more good than harm, and so it really isn't ethical to randomize people or clusters to not get the intervention. So, really, we assume that it would be unethical to withhold the intervention, because there is established effectiveness or we know it's the accepted standard of care; it's just not being implemented well. Also, another ethical consideration is that once the intervention is implemented, it's not removed. You don't take it away. You don't force people to go back to a pre-intervention time point. Next slide, please. So those were all the motivations and reasons to use a stepped wedge. What are some cautions?
I think you probably picked up on a lot of these as you were listening to my colleagues give their parts of the presentation. We just want to caution you, or make sure you're aware of, some of the considerations when selecting this design. I don't know that I would call them cons so much; they're just things to be aware of. So the stepped wedge can be more difficult, more complex certainly, than a traditional parallel group randomized clinical trial. One of the things that has given me pause in the past is the really heavy data collection burden. As you noted, outcomes need to be measured for every cluster at every time point. So that data collection burden can cause increased cost to you as the researchers. It certainly adds a lot of data that you need to protect, gather, and bring into your own environment to analyze, and it's certainly a large burden to the practices or the clusters that they need to work with you every time to gather that data. Informed consent can also be challenging in a stepped wedge design; given this data collection burden, you do need to get informed consent, and that can be especially difficult. But this burden can be minimized if the data that you're using come from existing sources of routinely collected data, such as electronic health records or public health surveillance systems. So maybe that need for a lot of data doesn't matter if you are able to get your outcomes data from these existing sources. Something else that's been mentioned previously, and I just want to bring it into this list of cautions, is the idea that your trial can be very long if there's a complex implementation. Any time you're trying to change the way practices deliver care, change workflows, that can take a very long time to actually put into place in the practice. It's a little bit different from changing, perhaps, the prescribing regimen, just changing the type of medication that people use.
Or the trial duration can also be long if it takes a very long time for the intervention to influence outcomes. So, for instance, weight loss interventions can certainly take a very long time to actually have an effect on outcomes. People sometimes need to participate in an intervention that takes more than just one visit. And even when you have been able to implement your intervention, you may not have time to observe effects on your clinical outcomes, especially in design A. For instance, it can take several months to observe a change in hemoglobin A1C. A couple of other cautions: regarding internal validity, there's a greater potential for contamination, especially in design A. Miriam mentioned this a little bit earlier. She also mentioned some challenges regarding sequence generation and allocation concealment. You can sometimes do blinding of outcome assessors, but that's not always possible. And then finally, it can be an impractical design if you're comparing multiple interventions. For instance, if you have multiple levels of an intervention, so you have a control group, a minimal intervention group, and a more intensive intervention group, that doesn't really work very well. It works better if you have just two different interventions that you're comparing, or an intervention versus usual care or control. All right, so next slide. Okay. So, as I mentioned earlier, there are other types of cluster designs that you can use. And how do you really differentiate the stepped wedge design from a more traditional parallel group cluster randomized trial? Really, the key difference is this idea of a crossover: every cluster gets the intervention in a stepped wedge design, whereas in a parallel group design, you're either intervention or control during the trial itself. As I mentioned before, the stepped wedge design has a longer trial duration, certainly compared to the parallel group cluster randomized trial.
And while you can do sequential implementation and rollout of the intervention in both designs, the stepped wedge design really allows for a step-by-step implementation in a more systematic way. And then finally, while both designs can be used to examine temporal effects, in a stepped wedge design you can control for temporal trends analytically using those wonderful analytic approaches that Chris just described for us. In a parallel group cluster RCT you can have a parallel control group to assess a temporal trend, but the stepped wedge design can really be better if you have concerns about history or seasonality effects that you want to examine longitudinally, because you can make sure that you have a time factor incorporated into your analytic approach. Okay? All right, next slide. Okay. So, in summary, the ideal circumstances for utilizing a stepped wedge are when you have questions about reach, effectiveness, or impact at the population level, and implementation; and when you have a focus on shorter-term outcomes, especially in design A. As I already mentioned, clinical outcomes can take a little bit longer to show an effect, but you might look at behavioral outcomes, or process or intermediate outcomes; so, for instance, how many people are receiving self-management support, how many people are receiving counseling for some type of behavior change, how many people are being offered a new toolkit for managing their asthma, something like that. Those are more process or immediate outcomes: are people receiving the intervention? In other designs, so design B and what used to be called design C, which we changed to be a variation on design B, you can accommodate longer times to observe an intervention effect.
But certainly in design A we recommend focusing on shorter-term outcomes. And then also an ideal circumstance for a stepped wedge, over and above a parallel group cluster randomized trial, is when you have fewer clusters available. As we mentioned, because of the repeated measures in a stepped wedge design, you can utilize fewer clusters and have more power. So if you only have eight to ten clusters available, for instance, a stepped wedge design can be a great option for you. Okay. And so that's it for me. I think I'm going to turn it back over to Gill at this point. Thank you, Bethany. Okay, so we're going to launch our final poll for this webinar, and I'm going to wrap up very, very quickly so that we can get into questions and answers. So I'm interested in knowing — or we're interested in knowing — what would be your main motivation for using a stepped wedge, if you're planning to use it. So you can select one of the answers there, and we'll look at the polling results. Okay, so it seems that it's coming through that it really is the best method for your study. It's interesting that more than half the people are picking that. There is that concern about the ethical issue of equipoise, and then there are those of you, like myself, that really like the method and want to explore using it. I imagine there's a few biostatisticians in that group. It's a very fascinating method. So we learned a lot about the stepped wedge design, and I'm going to go right into the concluding slide. So next slide, please. Okay, so at the end of this webinar, you should basically have come to the conclusion that although it is not always the best design — and I think Bethany gave a very nice summary of what situations it does and doesn't work in —
What is really appealing about this, especially for practice-based research, is that a potentially beneficial intervention can eventually be offered to all participating practices in our community, and this really helps you when you're trying to engage those practices in your intervention research. The other point is the ethics: when there isn't equipoise — in other words, when we're pretty sure that this intervention is going to be beneficial — it makes sense to try this design, keeping in mind the limitations of the timeline for your outcomes. So I wanted to thank all of my co-presenters on this; they also did a fantastic job as part of the workshop, and as co-chair of the methods working group, we're really proud of this product. I believe we can go to the question-and-answer session. Thank you all very much. Some of the questions have come in with responses, and we will be sharing them. Christina, if you could go to slide 25, and while you are going there, as it relates to a question that's been posed, someone inquired as to which software packages, in addition to SAS, and particularly SPSS, have this type of functionality. And Chris shared that it is available in SPSS Version 22, under the analyze tab, as mixed models. Is there anything else you care to share with the group, Chris, with respect to the availability of preprogrammed stepped wedge analyses? Yeah, sure. I don't think there's anything really out of the box that you could just throw data at from a stepped wedge design and get inferences directly out of. But as generalized estimating equations and generalized linear mixed models become more mainstream in biomedical research, software begins to adopt these methods, so I don't know how long ago it would have been that SPSS would have added them, but SPSS should have solutions for that, as does SAS, as do R and Stata, as do many of the large software packages for conducting statistical analysis of trial data.
Great, thank you. So a question came up at about the time that the slide before us was shown: are all patients enrolled just at the beginning, or at each time point where one group is crossing over? And, Chris, would you like to provide the group your answer, and then perhaps Dr. Dickinson, if you want to add any additional reflections. Yeah, sure. The answer I shared over the webinar was essentially one of: it depends. You could do it either way. One design variant that Miriam introduced is one where you recruit independent patients at each time step from each cluster, so that would be more of a cross-sectional variant; whereas a different design, which Miriam also introduced, is more of a cohort variant, where you recruit a fixed number of people at the beginning of the study and follow them longitudinally in time. And there are more variants on each of those, and I gave a reference to a useful paper published, I think, in January 2015 in Statistics in Medicine that should help elucidate some of the distinctions between the cohort variants, the cross-sectional variants, and some of the other variants. This is Miriam. Chris, you described that very nicely. It depends on the design. In the repeated cross-sectional design, generally, clusters are followed throughout the study, but patients are recruited during each time block. And so you might have a situation for design A where each practice has to recruit, say, 10 or 15 patients per time block, and these are new patients, different patients, not the same ones. In design B, in the traditional approach, you would recruit everyone at the beginning and follow them all the way through, and then when the cluster crosses over, the patients cross over. In situations where you can ascertain outcomes from existing data, you don't have to do it exactly that way. You can look back and get a retrospective assessment of their control period, and that's on this slide, design B.
In this case we recruited clinicians and staff at the beginning and surveyed them once during each time block. But you can imagine, if you have EHR data, it would just take looking at the clinical measures during the control and the intervention phases. So that's the big distinction between those two approaches, the cohort and the repeated cross-sectional. Thank you. And, Christina, if you could go to slide 30, a general statistical model for stepped wedge design. The question is with respect to the variance of responses, and it is posed as follows: "I understand that the independence assumption between the random cluster effects and the residual error makes the calculation of power and sample size easier. But how realistic is it? If one cannot assume this, could we use another method for estimating this variance — the delta method, bootstrap, or something else?" Dr. Dickinson, or does anyone care to respond? Generally — and I'm going to ask Chris to also jump in here — certainly you could use some bootstrapping approaches. That would be a possibility. In my experience, I have generally used the formula that Hussey recommends, on the other slide, for estimating variance, and that has worked pretty well for both dichotomous outcomes — yes/no, where you're looking at proportions — and continuous outcomes. It's a little complex because you have clusters, so you have individuals within clusters, and then you also have the time effect; the ICC generally refers to individuals within clusters, but you have this time-block effect as well. And I think the Hussey approach works pretty well. There are several of us who have tried it successfully. Chris, do you want to jump in here? Yeah, sure. I was going to take a crack at responding to it, but I don't think I have a perfect answer myself. I think the bootstrapping approach that you alluded to is a sensible option.
But essentially the independence assumption, I believe, in my opinion, is used basically as a mathematical nicety so that when you're deriving some of these power and sample size formulas, things work out a little nicer. In the example, I guess you suggested: what if we cannot assume that the cluster-level random effects are independent of the residual error? I suppose that's a reasonable possibility and might arise in data modeling. But my guess at why Hussey and Hughes didn't go that way in their paper would just be that it would make things a little more difficult. Specifically, I guess you would have to specify the magnitude and the direction, or just in general a functional form, for the relationship between the residual variances and the variances of the cluster-level effects. It's not that it couldn't be done; I think it would just add a layer of complexity, and it wouldn't generalize as nicely as what they do here. I agree, Chris. I think it would be more difficult. The bootstrapping might work the best, and when you have correlated residuals you could perhaps incorporate that, but that's difficult. It makes it far more complex than it already is, which is pretty complex. Agreed completely, Miriam. And that's kind of what I was thinking: the idea of heteroscedastic residual variances might come into play, whereas if you thought you knew the form of how the magnitude of the cluster-level variances, or the predicted cluster random effects, influences the magnitude of the residual, you could maybe go about modeling that. But I don't think that's a general solution; it would have to be specific to the data at hand, I believe. Yeah, I agree. I'm going to ask a couple more questions, but before I do, I just want to remind folks that at the conclusion of this presentation there will be instructions on how you can receive CE credit.
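The bootstrapping option the presenters favor is usually implemented by resampling whole clusters, which preserves whatever dependence exists between cluster effects and residuals without having to specify its functional form. The sketch below is one minimal way to do that; the crude within-period estimator is a stand-in for a full mixed-model fit, and the data format (cluster, period, treated, outcome) is my own assumption:

```python
import random
from statistics import mean

def crude_effect(data):
    """Crude treatment-effect estimate: within each time period, mean
    outcome in intervention cluster-periods minus mean in control
    cluster-periods, averaged over periods. Differencing within period
    removes the period effect. Rows are (cluster, period, treated, y)."""
    periods = {}
    for cluster, period, treated, y in data:
        periods.setdefault(period, {0: [], 1: []})[treated].append(y)
    diffs = [mean(arms[1]) - mean(arms[0])
             for arms in periods.values() if arms[0] and arms[1]]
    return mean(diffs) if diffs else None

def cluster_bootstrap_se(data, n_boot=500, seed=0):
    """Standard error of the crude effect by resampling whole clusters
    with replacement, preserving within-cluster correlation with no
    independence assumption between cluster effects and residuals."""
    rng = random.Random(seed)
    by_cluster = {}
    for row in data:
        by_cluster.setdefault(row[0], []).append(row)
    clusters = sorted(by_cluster)
    estimates = []
    for _ in range(n_boot):
        sample = [row for c in rng.choices(clusters, k=len(clusters))
                  for row in by_cluster[c]]
        est = crude_effect(sample)
        if est is not None:  # skip degenerate resamples with no mixed period
            estimates.append(est)
    m = mean(estimates)
    return (sum((e - m) ** 2 for e in estimates) / (len(estimates) - 1)) ** 0.5
```

As the presenters note, this sidesteps the closed-form derivation entirely at the cost of computation, and any serious use would replace `crude_effect` with the actual model fit.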
And I'll ask one question, and then I'm going to launch a poll for you to evaluate the webinar honestly; we need unbiased responses on the extent to which these types of activities are successful or not, because we have to show data in order to demonstrate support. So before I get there, another question was posed: do we need to balance the data, for example using propensity score matching? And Dr. Kwan, this question was posed at the midpoint of your presentation, so would you care to take a crack? If you're speaking, Dr. Kwan, we're not able to hear you. Sorry, I was on mute. It took me a while to get to that little point there. So, do we need to balance the data, like using propensity score matching? I actually think this is more of an analytic approach. I haven't heard of anyone doing this. Have you, Miriam or Chris? Well, actually, we have paid attention in the randomization process to stratification variables, so that you don't inadvertently get very different practices starting earlier in the process, as far as intervention implementation, versus later. So we have used that in a few situations to make sure that that doesn't happen, mainly using stratification approaches, where we require either similar practices or very different practices to be closer together in the top half or the bottom half of the order. So, yes, that can be an issue, and I think more work needs to be done in this area; there hasn't been a lot of work done. All right, thank you. So I'm going to launch one of our polls, and in the meantime, Christina, if you could go to slide 22, the one with the chart, and we'll speak about that for a bit. So the question was posed with respect to whether or not one should actively try to improve the intervention with each successive group. Right. So this probably goes back to the point I was making that you can use quality-improvement types of approaches as you roll out the intervention.
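The stratified randomization of crossover order that Dr. Dickinson describes can be sketched as follows. This is one plausible balancing scheme under my own assumptions (shuffle within stratum, then interleave strata), not the presenters' exact procedure:

```python
import random

def stratified_crossover_order(clusters, stratum, seed=None):
    """Randomize the order in which clusters cross over while stratifying
    on a practice characteristic: clusters are shuffled within each
    stratum, and the strata are then interleaved round-robin, so similar
    practices are spread across the rollout rather than clumped at the
    start or the end. `stratum` maps cluster -> stratum label."""
    rng = random.Random(seed)
    groups = {}
    for c in clusters:
        groups.setdefault(stratum[c], []).append(c)
    pools = list(groups.values())
    for g in pools:
        rng.shuffle(g)       # random order within each stratum
    rng.shuffle(pools)       # random order of strata within each round
    order, i = [], 0
    while any(i < len(g) for g in pools):
        for g in pools:      # take one cluster from each stratum per round
            if i < len(g):
                order.append(g[i])
        i += 1
    return order

# e.g., six practices, stratified by size:
stratum = {"A": "small", "B": "small", "C": "small",
           "D": "large", "E": "large", "F": "large"}
print(stratified_crossover_order(list(stratum), stratum, seed=7))
```

The effect is exactly what the presenter describes: neither the early nor the late half of the rollout ends up dominated by one kind of practice.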
And I would say that this is something you should do cautiously. It sort of depends on the purpose of your research. So I don't know that I would recommend changing your intervention, but you might improve the implementation of your intervention. So if you're trying to implement your intervention on paper, but it's really just not working well, you're not getting good uptake of the paper version, and you felt like you wanted to switch over to a web-based approach to delivering your intervention (there are a variety of types of interventions for which that might be relevant), that's something you may consider. Or maybe, instead of having the intervention implemented by a clinician, you have it implemented by a navigator or a staff person. Those kinds of changes depend on your research question: if your research question is, you know, does having a clinician implement this intervention lead to good implementation, well then, you know, you wouldn't want to change that. But you can do some tweaks regarding implementation as long as it isn't the basis of your research question. Thank you. So now I'm going to launch the poll. We'll launch our two polls on whether or not these types of webinars are a trusted source of information around practice improvement or practice transformation. And just as a side note, when you exit this webinar, you'll be asked to assess its value and the extent to which the content met your needs, so your completion of those assessments will be greatly appreciated. And while this is executing, I'll just read the presenters this other question, so they can be prepared to answer it.
If your main outcome is a six-month follow-up call, do you have to collect data at each crossover time point, or can you collect just baseline data at S1, then data at each time point only for the cluster that crosses over then, and finally a third time point, six months after the crossover, for each respective cohort? Yeah, so as we've mentioned, one of the considerations for stepped wedge design is the time it takes for your main outcome to be realized. So if it takes six months from the time that your intervention is delivered, at an individual patient level for instance, and you had four, five, or six steps at which the different clusters would cross over to the intervention, you would have a minimum interval for your step of six months. So you would have a very long trial. I would not recommend a stepped wedge for a study in which your main outcome takes six months to be realized and measured. This is Miriam; I'll jump in very briefly. I echo that, and you have to keep separate, in this repeated cross-sectional design, the idea of individual follow-up and cluster measurements, because they're really two different things. The issue here is that if individuals have to be followed up for six months and you want to do this repeated cross-sectional design, there is a risk of contamination when the cluster crosses over, and most investigators are going to have a hard time protecting patients from the intervention in a practice where the cluster has already crossed over. So that's the big issue there. But in this design, patient follow-up is one thing, and cluster crossover and cluster time blocks are another issue entirely. In the cohort design you've got the same individuals from beginning to end.
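The presenter's point about outcome lag driving trial length is simple arithmetic: if each step must be at least as long as the time the outcome takes to be realized, the step count sets a floor on the calendar time. A back-of-the-envelope sketch with hypothetical numbers, assuming equal block lengths and one baseline block:

```python
def min_trial_duration(n_steps, outcome_lag_months):
    """Floor on data-collection length for a stepped wedge trial: each of
    the n_steps crossover blocks, plus one baseline block, must be at
    least as long as the outcome lag (equal block lengths assumed)."""
    return (1 + n_steps) * outcome_lag_months

# e.g., five crossover steps with a six-month outcome lag:
print(min_trial_duration(5, 6))  # -> 36 months
```

Which is exactly why the presenters advise against a stepped wedge when the main outcome takes six months to measure: even a modest number of steps pushes the trial out to several years.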
But that's not the case in this repeated cross-sectional design, where a cluster crosses over but you're usually recruiting a new set of patients at each time block, and then following all the patients, regardless of the time block, for about the same period of time, you know, one month, two months, or six months, which is kind of long. So does that answer that, I hope? Yes, I thought that was very good. I want to thank you all. For those who want to obtain CME credit, the instructions are on the screen right now. And our wonderful presenters are going to have about a five-minute break and put together detailed examples that we will share with you via the Internet. But I thank you all very much, and I thank the audience members who posed such thoughtful questions and, on a Friday afternoon, spoke Greek with the best of them. So thank you very much. And I just want to let people know that the PBRN community is open and welcomes new members, so if you care to join our listserv, please do. Please e-mail us at [email protected], and, actually, we will have our next webinar on March the 4th, with Larry Green and James Werner and Rebecca Etz, to really talk about where PBRNs are today, how they're fostering partnerships for pragmatic and prompt resolutions and leveraging the development of research collaborations. It's been a pleasure to listen to Dr. Dickinson and Chris and Dr. Bartlett and Dr. Kwan. We thank you all very, very much. Is there anything else, Christina? I'll just launch the final two polls for those of you who are still participating. If you could just quickly vote, we really appreciate your feedback, so we'll just hold and keep this up for about ten more seconds. Thank you. And one more. So we really appreciate your feedback. It looks like we're getting a lot of responses. Thank you all so much for taking the additional ten minutes to stay with us. So I'll close this poll, and that's all we have. Thank you, Rebecca. All right, thank you all.