arxiv: 2604.12900 · v1 · submitted 2026-04-14 · 📊 stat.ME · econ.EM


Emulating Stepped-Wedge Cluster Randomized Trials to Evaluate Health Policies and Interventions

Fan Li, Gregg S. Gonsalves, Guanyu Tong, Haidong Lu, Lee Kennedy-Shaffer


Pith reviewed 2026-05-10 14:26 UTC · model grok-4.3


keywords: stepped-wedge trials · target trial emulation · staggered adoption · difference-in-differences · cluster randomized trials · causal inference · health policy evaluation · quasi-experimental designs


The pith

Observational studies with staggered policy adoption can emulate stepped-wedge cluster randomized trials to improve design, reporting, and causal inference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes that researchers evaluating health policies with observational data on staggered adoption can restructure their studies as emulations of stepped-wedge cluster randomized trials inside the target trial emulation framework. This framing pushes investigators to specify the hypothetical trial features they are mimicking, such as cluster randomization and timing of rollout, along with the core assumptions required for valid emulation. A sympathetic reader would care because the variety of current difference-in-differences methods makes it hard to compare studies or judge whether their evidence is reliable. By borrowing the conceptual and reporting standards from trial emulation, the approach encourages explicit statements of the estimand, consideration of heterogeneity and time-varying effects, and clearer communication across randomized and quasi-experimental traditions.

Core claim

The authors claim that framing observational staggered-adoption studies as emulations of stepped-wedge cluster randomized trials within the target trial emulation framework provides a unified structure for design, analysis, and reporting. This structure highlights policy heterogeneity, time-varying effects, spillovers, and anticipation effects; clarifies the estimand and assumptions; identifies settings unlikely to yield high-quality causal evidence; and guides the bias-variance-generalizability trade-offs that arise from specific design and analysis choices.

What carries the argument

Target trial emulation framework, which restructures observational data on staggered policy adoption to match the randomization, timing, and cluster features of a hypothetical stepped-wedge cluster randomized trial.
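As a rough illustration of what "restructuring" staggered-adoption data into a stepped-wedge layout could look like, the sketch below builds a cluster-by-period exposure matrix from adoption times. It is not taken from the paper: the function name `exposure_matrix`, the 0-based period index, and the example clusters are all hypothetical, and the sketch assumes a policy stays in place once adopted.

```python
from typing import Dict, List, Optional


def exposure_matrix(
    adoption_period: Dict[str, Optional[int]], n_periods: int
) -> Dict[str, List[int]]:
    """Return, per cluster, a 0/1 list over periods 0..n_periods-1.

    A 1 means the policy is in place (period >= adoption period).
    None marks a cluster that never adopts within the observation window.
    Assumption: once adopted, the policy is never withdrawn.
    """
    matrix = {}
    for cluster, g in adoption_period.items():
        if g is None:
            matrix[cluster] = [0] * n_periods
        else:
            matrix[cluster] = [1 if t >= g else 0 for t in range(n_periods)]
    return matrix


# Hypothetical adoption periods (illustrative only, not data from the paper).
adoption = {"A": 1, "B": 2, "C": 3, "D": None}
design = exposure_matrix(adoption, n_periods=4)
for cluster, row in design.items():
    print(cluster, row)
```

Each printed row has the familiar stepped-wedge shape: zeros before adoption, ones afterwards. A never-adopting cluster such as `D` has no counterpart in a standard stepped-wedge trial, so whether to include it is exactly the kind of design choice the emulation framing asks investigators to state.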

If this is right

Studies will report a single, clearly defined estimand and list the assumptions needed for the emulation to be valid.
Analyses will routinely examine treatment effect heterogeneity, time-varying effects, and potential spillovers or anticipation.
Investigators will more often recognize and avoid study designs that cannot support credible causal claims under either randomized or observational approaches.
Insights on power, cluster effects, and crossover designs will flow between trialists and quasi-experimental researchers.
Design choices will be evaluated explicitly for their impact on bias, variance, and the generalizability of results.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same emulation logic could be applied to other staggered rollout settings outside health policy, such as education or environmental interventions.
Software tools that automate the restructuring of longitudinal data into stepped-wedge trial formats would reduce implementation barriers.
Emulation failures could serve as diagnostic signals that prompt collection of additional covariates or different analytic strategies.
Over time, journals might adopt reporting checklists that require explicit mapping from observational data to a target stepped-wedge trial.

Load-bearing premise

Observational data on staggered policy adoption can be validly restructured to emulate the randomization, timing, and cluster features of a stepped-wedge cluster randomized trial while satisfying the core assumptions of target trial emulation.

What would settle it

An actual stepped-wedge cluster randomized trial conducted in the same population and policy context yields materially different effect estimates or different conclusions about effect heterogeneity than the emulated analysis of the corresponding observational data.

Figures

Figures reproduced from arXiv: 2604.12900 by Fan Li, Gregg S. Gonsalves, Guanyu Tong, Haidong Lu, Lee Kennedy-Shaffer.

Figure 1: Design schema for alternative unit inclusion criteria (Component 1). MMWR weeks are as defined by the U.S. Centers for Disease Control and Prevention. Shaded boxes indicate postadoption time periods; dates indicate lottery announcement dates for the relevant states.

Original abstract

Both cluster randomized trials and quasi-experimental designs are used to evaluate the impact of health and social policies and interventions. Stepped-wedge cluster randomized trials randomize a staggered adoption approach, while recent difference-in-differences methods allow analysis of non-randomized settings where similar policies are adopted at different time points. These approaches have become common, but the sheer variety of methods for analyzing observational studies with staggered adoption makes it challenging to clearly design and report such studies. We propose that observational and quasi-experimental study investigators can address these challenges by emulating stepped-wedge cluster randomized trials in the target trial emulation framework. The conceptual framework and reporting standards of trial emulation will encourage consideration of key features of these designs, such as policy heterogeneity and time-varying effects, and clear reporting of the estimand and assumptions. It also highlights areas where those interested in randomized trials and quasi-experimental designs can benefit from one another's experience by bringing insights across disciplines. Questions of treatment effect heterogeneity, power, spillovers, and anticipation effects, among others, are common to both fields and can benefit from cross-pollination. This article also demonstrates how trial emulation can identify settings that are not well-served by either approach, thereby avoiding studies unlikely to generate high-quality causal evidence. Finally, it informs the bias-variance-generalizability trade-off that arises with design and analysis choices made in these settings, supporting better evidence generation and interpretation in settings where important questions can be answered.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper frames observational staggered policy adoption studies as emulated stepped-wedge cluster trials inside the target trial emulation approach to sharpen estimands and assumptions, but offers only a high-level conceptual sketch with no examples or derivations.


The main takeaway is that the authors want observational researchers studying policies rolled out at different times to restructure their data and thinking to match a stepped-wedge cluster randomized trial design, then apply target trial emulation rules to make the estimand, assumptions, and potential problems explicit. This is presented as a way to borrow clarity from trial methods without actually running a trial. It is not a new statistical technique or theorem; it is a synthesis that points out shared questions like treatment effect heterogeneity, anticipation effects, and spillovers across the randomized and quasi-experimental literatures. The cross-field pollination angle is reasonable and could help some readers organize their reporting. The paper also notes that certain settings will not support good causal evidence under either approach, which is a useful caution. The argument stays internally consistent at the conceptual level and does not introduce circular claims or hidden fitting steps.

That said, the manuscript provides no worked example, no simulation study, no real-data application, and no explicit mapping of how the usual target trial emulation assumptions translate to staggered observational data. Without those, it is hard to judge how much practical guidance the framing actually supplies or whether it changes analysis choices in any concrete way.

The piece is aimed at health policy evaluators who already know both target trial emulation and difference-in-differences methods and are looking for a structured way to think through design trade-offs. It is not aimed at readers seeking new identification results or ready-to-use code. The paper deserves peer review because the underlying idea is coherent and the call for clearer reporting is worthwhile, even though the current version would need concrete illustrations and assumption checks before it could guide applied work.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes that observational and quasi-experimental studies with staggered policy adoption can be restructured to emulate the design features of stepped-wedge cluster randomized trials (SW-CRTs) inside the target trial emulation (TTE) framework. This is presented as a way to clarify estimands, make assumptions explicit, handle heterogeneity and time-varying effects, improve reporting, identify unsuitable study settings, and inform bias-variance-generalizability trade-offs by cross-pollinating insights from randomized trials and difference-in-differences methods.

Significance. If operationalized, the proposal could raise the quality of causal evidence for health policies by encouraging explicit mapping of observational data to SW-CRT features and by highlighting common challenges such as spillovers, anticipation, and effect heterogeneity. It offers a structured lens for design choices rather than promising automatic identification.

major comments (2)

[Abstract] The claim that the article 'demonstrates how trial emulation can identify settings that are not well-served by either approach' is unsupported; no concrete criterion, decision rule, or worked example is supplied for when emulation would be inappropriate, which is load-bearing for the practical utility asserted in the final paragraph.
[Conceptual framework] The central proposal (restructuring observational staggered-adoption data to emulate SW-CRT randomization, timing, and clustering while satisfying TTE assumptions) is stated at a conceptual level without a step-by-step protocol, variable-mapping table, or explicit checklist for verifying the no-anticipation, consistency, and positivity conditions in the emulated design.

minor comments (2)

The manuscript would benefit from a side-by-side table contrasting SW-CRT randomization, TTE emulation steps, and standard staggered DiD assumptions.
Notation for clusters, adoption periods, and the target estimand (e.g., average treatment effect on the treated under the emulated design) should be introduced formally even if the paper remains non-technical.
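One way the requested notation might look is sketched below. This is not the paper's notation: the symbols follow a convention common in the staggered difference-in-differences literature (a group-time average treatment effect on the treated), and the choice to index clusters by $i$, periods by $t$, and adoption periods by $G_i$ is an assumption made here for illustration.

```latex
% Assumed notation (not from the paper):
%   i = 1,...,I   clusters;  t = 1,...,T   periods
%   G_i           first period in which cluster i is under the policy
%                 (G_i = \infty if the cluster never adopts)
%   Y_{it}(g)     potential outcome of cluster i in period t
%                 had it adopted in period g
\[
D_{it} = \mathbf{1}\{t \ge G_i\}, \qquad Y_{it} = Y_{it}(G_i)
\]
% No anticipation: outcomes before adoption equal never-adopter outcomes.
\[
Y_{it}(g) = Y_{it}(\infty) \quad \text{for all } t < g
\]
% Group-time average treatment effect on the treated.
\[
\mathrm{ATT}(g,t) = \mathbb{E}\bigl[\, Y_{it}(g) - Y_{it}(\infty) \mid G_i = g \,\bigr],
\qquad t \ge g
\]
```

Under this sketch, a single summary estimand would be some stated weighting of the $\mathrm{ATT}(g,t)$ terms over adoption periods and exposure times; which weighting the emulated trial targets is left open here, since the paper's own choice is not given in this summary.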

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which identify opportunities to strengthen the practical utility of our conceptual proposal. We respond to each major comment below and will incorporate revisions to address the concerns raised.


Referee: [Abstract] The claim that the article 'demonstrates how trial emulation can identify settings that are not well-served by either approach' is unsupported; no concrete criterion, decision rule, or worked example is supplied for when emulation would be inappropriate, which is load-bearing for the practical utility asserted in the final paragraph.

Authors: We agree that the abstract claim would be more robust with concrete support. The manuscript discusses conceptually how emulation can flag limitations (e.g., simultaneous adoption violating staggered structure or unaddressable spillovers), but lacks an explicit example or decision rule. In revision, we will add a brief illustrative scenario in the discussion section showing how the framework identifies unsuitable settings, such as when positivity fails due to universal adoption. This will substantiate the claim while preserving the paper's conceptual emphasis.

revision: yes

Referee: [Conceptual framework] The central proposal (restructuring observational staggered-adoption data to emulate SW-CRT randomization, timing, and clustering while satisfying TTE assumptions) is stated at a conceptual level without a step-by-step protocol, variable-mapping table, or explicit checklist for verifying the no-anticipation, consistency, and positivity conditions in the emulated design.

Authors: The manuscript is framed as a high-level conceptual bridge between TTE and SW-CRT designs rather than an implementation protocol. We recognize that adding operational elements would improve usability. We will revise by including a variable-mapping table aligning observational elements (cluster IDs, adoption times, outcomes) with SW-CRT features and an expanded checklist for verifying TTE assumptions (no anticipation, consistency, positivity) in the emulated design. This keeps the focus on cross-pollination of ideas while making the proposal more actionable.

revision: yes
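The responses above name two settings in which emulation may be unsuitable: simultaneous adoption and universal adoption. A minimal sketch of how such settings might be screened from cluster IDs and adoption times follows. The function `design_flags`, its warning strings, and the example inputs are hypothetical and not drawn from the paper. These are structural screens only; they do not verify no anticipation, consistency, or positivity in any formal sense.

```python
from typing import Dict, List, Optional


def design_flags(
    adoption_period: Dict[str, Optional[int]], n_periods: int
) -> List[str]:
    """List structural warnings for a staggered-adoption data set.

    adoption_period maps a cluster ID to the first period (0-based) in which
    the policy is in place, or None if the cluster never adopts in the window.
    """
    if not adoption_period:
        raise ValueError("need at least one cluster")
    flags = []
    adopted = [g for g in adoption_period.values() if g is not None]

    # Staggering needs at least two distinct adoption periods.
    if len(set(adopted)) < 2:
        flags.append("fewer than two distinct adoption periods (no staggering)")

    # A cluster treated from the first period has no pre-adoption data.
    if any(g <= 0 for g in adopted):
        flags.append("some clusters have no pre-adoption period")

    # Periods in which every cluster is treated offer no concurrent comparison.
    all_treated = [
        t
        for t in range(n_periods)
        if all(g is not None and g <= t for g in adoption_period.values())
    ]
    if all_treated:
        flags.append(f"no untreated clusters in periods {all_treated}")
    return flags


# Hypothetical inputs for illustration only.
print(design_flags({"A": 1, "B": 2, "C": None}, n_periods=4))
print(design_flags({"A": 2, "B": 2, "C": 2}, n_periods=4))
```

The first call returns an empty list; the second returns two warnings, one for the lack of staggering and one for periods 2 and 3 having no untreated clusters. The last warning is informational rather than disqualifying, since a standard stepped-wedge trial also ends with every cluster treated.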

Circularity Check

0 steps flagged

No significant circularity detected in conceptual proposal


The paper advances a methodological proposal to restructure observational data on staggered policy adoption as an emulation of stepped-wedge cluster randomized trials inside the target trial emulation framework. This is presented as a conceptual aid for clarifying estimands, assumptions, heterogeneity, and reporting standards rather than as a mathematical derivation or statistical model with fitted parameters. No equations, self-definitional constructs, or predictions that reduce to inputs by construction appear in the abstract or described structure. The argument draws on established prior frameworks (target trial emulation and stepped-wedge designs) without load-bearing self-citations that would render the central claim tautological. The proposal remains self-contained as a suggestion for improved practice and cross-disciplinary insight, with no reduction of its content to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

This is a conceptual proposal relying on standard causal inference assumptions from target trial emulation and stepped-wedge trial literature. No new free parameters, invented entities, or ad hoc axioms are introduced beyond those already established in the field.

axioms (1)

Domain assumption: Observational data on staggered policy adoption can be structured to emulate randomized stepped-wedge trial features under target trial emulation assumptions.
This is the core premise invoked throughout the abstract to justify the emulation approach.

pith-pipeline@v0.9.0 · 5575 in / 1308 out tokens · 43014 ms · 2026-05-10T14:26:16.594143+00:00 · methodology


Reference graph

Works this paper leans on

78 extracted references · 71 canonical work pages · 1 internal anchor

[1]

Alternative causal inference methods in population health research: Evaluating tradeoffs and triangulating evidence

Matthay EC, Hagan E, Gottlieb LM, Tan ML, Vlahov D, Adler NE, et al. Alternative causal inference methods in population health research: Evaluating tradeoffs and triangulating evidence. SSM - Popul Health. 2020;10:100526. doi:10.1016/j.ssmph.2019.100526

work page doi:10.1016/j.ssmph.2019.100526 2020
[2]

Natural experiments: An overview of methods, approaches, and contributions to public health intervention research

Craig P, Katikireddi SV , Leyland A, Popham F. Natural experiments: An overview of methods, approaches, and contributions to public health intervention research. Annu Rev Public Health. 2017;38(1):39–56. doi:10.1146/annurev-publhealth-031816-044327

work page doi:10.1146/annurev-publhealth-031816-044327 2017
[3]

Estimating the effects of health policy initiatives: Where we are and where we need to go

Localio AR, Guallar E. Estimating the effects of health policy initiatives: Where we are and where we need to go. Ann Intern Med. 2024;177(11):1586–7. doi:10.7326/M24-0896

work page doi:10.7326/m24-0896 2024
[4]

Designing difference-in- difference studies with staggered treatment adoption: Key concepts and practical guidelines

Wing C, Yozwiak M, Hollingsworth A, Freedman S, Simon K. Designing difference-in- difference studies with staggered treatment adoption: Key concepts and practical guidelines. Annu Rev Public Health. 2024;45(1):485–505. doi:10.1146/annurev-publhealth-061022- 050825

work page doi:10.1146/annurev-publhealth-061022- 2024
[5]

Difference-in-differences with variation in treatment timing

Goodman-Bacon A. Difference-in-differences with variation in treatment timing. J Econom. 2021;225(2):254–77. doi:10.1016/j.jeconom.2021.03.014

work page doi:10.1016/j.jeconom.2021.03.014 2021
[6]

Difference-in-differences with multiple time periods

Callaway B, Sant’Anna PHC. Difference-in-differences with multiple time periods. J Econom. 2021;225(2):200–30. doi:10.1016/j.jeconom.2020.12.001

work page doi:10.1016/j.jeconom.2020.12.001 2021
[7]

What’s trending in difference-in-differences? A synthesis of the recent econometrics literature

Roth J, Sant’Anna PHC, Bilinski A, Poe J. What’s trending in difference-in-differences? A synthesis of the recent econometrics literature. J Econom. 2023;235(2):2218–44. doi:10.1016/j.jeconom.2023.03.008

work page doi:10.1016/j.jeconom.2023.03.008 2023
[8]

Advances in difference-in-differences methods for policy evaluation research

Wang G, Hamad R, White JS. Advances in difference-in-differences methods for policy evaluation research. Epidemiology. 2024;35(5):628–37. doi:10.1097/EDE.0000000000001755

work page doi:10.1097/ede.0000000000001755 2024
[9]

Difference‐in‐differences for health policy and practice: A review of modern methods

Feng S, Ganguli I, Lee Y , Poe J, Ryan A, Bilinski A. Difference‐in‐differences for health policy and practice: A review of modern methods. Stat Med. 2025;44(23–24):e70247. doi:10.1002/sim.70247

work page doi:10.1002/sim.70247 2025
[10]

Cluster Randomised Trials

Hayes RJ, Moulton LH. Cluster Randomised Trials. Second Edition. New York: Chapman and Hall/CRC; 2017

2017
[11]

Estimands in cluster-randomized trials: Choosing analyses that answer the right question

Kahan BC, Li F, Copas AJ, Harhay MO. Estimands in cluster-randomized trials: Choosing analyses that answer the right question. Int J Epidemiol. 2023;52(1):107–18. doi:10.1093/ije/dyac131

work page doi:10.1093/ije/dyac131 2023
[12]

Selecting the optimal longitudinal cluster randomized design with a continuous outcome: Parallel-arm, crossover, or stepped-wedge

Liu J, Li F, Sutcliffe S, Colditz GA. Selecting the optimal longitudinal cluster randomized design with a continuous outcome: Parallel-arm, crossover, or stepped-wedge. Stat Methods Med Res. 2025;34(10):2069–90. doi:10.1177/09622802251360409 20

work page doi:10.1177/09622802251360409 2025
[13]

The stepped wedge cluster randomised trial: Rationale, design, analysis, and reporting

Hemming K, Haines TP, Chilton PJ, Girling AJ, Lilford RJ. The stepped wedge cluster randomised trial: Rationale, design, analysis, and reporting. BMJ. 2015;350:h391. doi:10.1136/bmj.h391

work page doi:10.1136/bmj.h391 2015
[14]

Statistical efficiency and optimal design for stepped cluster studies under linear mixed effects models

Girling AJ, Hemming K. Statistical efficiency and optimal design for stepped cluster studies under linear mixed effects models. Stat Med. 2016;35(13):2149–66. doi:10.1002/sim.6850

work page doi:10.1002/sim.6850 2016
[15]

Reflection on modern methods: When is a stepped-wedge cluster randomized trial a good study design choice? Int J Epidemiol

Hemming K, Taljaard M. Reflection on modern methods: When is a stepped-wedge cluster randomized trial a good study design choice? Int J Epidemiol. 2020;49(3):1043–52. doi:10.1093/ije/dyaa077

work page doi:10.1093/ije/dyaa077 2020
[16]

American Journal of Epidemiology , volume=

Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758–64. doi:10.1093/aje/kwv254

work page doi:10.1093/aje/kwv254 2016
[17]

Target trial emulation: A framework for causal inference from observational data

Hernán MA, Wang W, Leaf DE. Target trial emulation: A framework for causal inference from observational data. JAMA. 2022;328(24):2446. doi:10.1001/jama.2022.21383

work page doi:10.1001/jama.2022.21383 2022
[18]

Target trial emulation

Hubbard RA, Gatsonis CA, Hogan JW, Hunter DJ, Normand SLT, Troxel AB. “Target trial emulation” for observational studies — Potential and pitfalls. N Engl J Med. 2024;391(21):1975–7. doi:10.1056/NEJMp2407586

work page doi:10.1056/nejmp2407586 2024
[19]

Transparent Reporting of Observational Studies Emulating a Target Trial—The TARGET Statement.JAMA2025;334(12):1084–1093

Cashin AG, Hansford HJ, Hernán MA, Swanson SA, Lee H, Jones MD, et al. Transparent reporting of observational studies emulating a target trial—The TARGET Statement. JAMA. 2025;334(12):1084. doi:10.1001/jama.2025.13350

work page doi:10.1001/jama.2025.13350 2025
[20]

Four targets: An enhanced framework for guiding causal inference from observational data

Lu H, Li F, Lesko CR, Fink DS, Rudolph KE, Harhay MO, et al. Four targets: An enhanced framework for guiding causal inference from observational data. Int J Epidemiol. 2025;54(1):dyaf003. doi:10.1093/ije/dyaf003

work page doi:10.1093/ije/dyaf003 2025
[21]

A trial emulation approach for policy evaluations with group-level longitudinal data

Ben-Michael E, Feller A, Stuart EA. A trial emulation approach for policy evaluations with group-level longitudinal data. Epidemiology. 2021;32(4):533–40. doi:10.1097/EDE.0000000000001369

work page doi:10.1097/ede.0000000000001369 2021
[22]

Target trial emulation for evaluating health policy

Seewald NJ, McGinty EE, Stuart EA. Target trial emulation for evaluating health policy. Ann Intern Med. 2024;177(11):1530–8. doi:10.7326/M23-2440

work page doi:10.7326/m23-2440 2024
[23]

Emulating target trials of postexposure vaccines using observational data

Boyer C, Lipsitch M. Emulating target trials of postexposure vaccines using observational data. Am J Epidemiol. 2025;194(7):2037–46. doi:10.1093/aje/kwae350

work page doi:10.1093/aje/kwae350 2025
[24]

Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses

Hernán MA, Sauer BC, Hernández-Díaz S, Platt R, Shrier I. Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses. J Clin Epidemiol. 2016;79:70–5. doi:10.1016/j.jclinepi.2016.04.014

work page doi:10.1016/j.jclinepi.2016.04.014 2016
[25]

Transparency and rigor: Target trial emulation aims to achieve both

De Stavola BL, Gomes M, Katsoulis M. Transparency and rigor: Target trial emulation aims to achieve both. Epidemiology. 2023;34(5):624–6. doi:10.1097/EDE.0000000000001638 21

work page doi:10.1097/ede.0000000000001638 2023
[26]

The target trial framework for causal inference from observational data: Why and when is it helpful? Ann Intern Med

Hernán MA, Dahabreh IJ, Dickerman BA, Swanson SA. The target trial framework for causal inference from observational data: Why and when is it helpful? Ann Intern Med. 2025;178(3):402–7. doi:10.7326/ANNALS-24-01871

work page doi:10.7326/annals-24-01871 2025
[27]

Journal of the Royal Statistical Society: Series A (Statistics in Society) , author =

Imai K, King G, Stuart EA. Misunderstandings between experimentalists and observationalists about causal inference. J R Stat Soc Ser A Stat Soc. 2008;171(2):481–502. doi:10.1111/j.1467-985X.2007.00527.x

work page doi:10.1111/j.1467-985x.2007.00527.x 2008
[28]

Observational data for comparative effectiveness research: An emulation of randomised trials of statins and primary prevention of coronary heart disease

Danaei G, Rodríguez LAG, Cantero OF, Logan R, Hernán MA. Observational data for comparative effectiveness research: An emulation of randomised trials of statins and primary prevention of coronary heart disease. Stat Methods Med Res. 2013;22(1):70–96. doi:10.1177/0962280211403603

work page doi:10.1177/0962280211403603 2013
[29]

Language Model Cascades: Token-Level Uncertainty and Beyond

Kennedy-Shaffer L. A generalized difference-in-differences estimator for randomized stepped-wedge and observational staggered adoption settings [Preprint]. arXiv; 2024. Available from: https://arxiv.org/abs/2405.08730 doi:10.48550/ARXIV .2405.08730

work page internal anchor Pith review doi:10.48550/arxiv 2024
[30]

Consort 2010 statement: Extension to cluster randomised trials

Campbell MK, Piaggio G, Elbourne DR, Altman DG, for the CONSORT Group. Consort 2010 statement: Extension to cluster randomised trials. BMJ. 2012;345:e5661. doi:10.1136/bmj.e5661

work page doi:10.1136/bmj.e5661 2010
[31]

Reporting of stepped wedge cluster randomised trials: Extension of the CONSORT 2010 statement with explanation and elaboration

Hemming K, Taljaard M, McKenzie JE, Hooper R, Copas A, Thompson JA, et al. Reporting of stepped wedge cluster randomised trials: Extension of the CONSORT 2010 statement with explanation and elaboration. BMJ. 2018;363:k1614. doi:10.1136/bmj.k1614

work page doi:10.1136/bmj.k1614 2010
[32]

A scoping review identified additional considerations for defining estimands in cluster randomized trials

Bi D, Copas A, Li F, Kahan BC. A scoping review identified additional considerations for defining estimands in cluster randomized trials. J Clin Epidemiol. 2026;189:112015. doi:10.1016/j.jclinepi.2025.112015

work page doi:10.1016/j.jclinepi.2025.112015 2026
[33]

Causal inference under multiple versions of treatment

VanderWeele TJ, Hernan MA. Causal inference under multiple versions of treatment. J Causal Inference. 2013;1(1):1–20. doi:10.1515/jci-2012-0002

work page doi:10.1515/jci-2012-0002 2013
[34]

Designing a stepped wedge trial: Three main designs, carry-over effects and randomisation approaches

Copas AJ, Lewis JJ, Thompson JA, Davey C, Baio G, Hargreaves JR. Designing a stepped wedge trial: Three main designs, carry-over effects and randomisation approaches. Trials. 2015;16(1):352. doi:10.1186/s13063-015-0842-7

work page doi:10.1186/s13063-015-0842-7 2015
[35]

Information content of stepped‐wedge designs when treatment effect heterogeneity and/or implementation periods are present

Kasza J, Taljaard M, Forbes AB. Information content of stepped‐wedge designs when treatment effect heterogeneity and/or implementation periods are present. Stat Med. 2019;38(23):4686–701. doi:10.1002/sim.8327

work page doi:10.1002/sim.8327 2019
[36]

Informative cluster size in cluster- randomised trials: A case study from the TRIGGER trial

Kahan BC, Li F, Blette B, Jairath V , Copas A, Harhay M. Informative cluster size in cluster- randomised trials: A case study from the TRIGGER trial. Clin Trials. 2023;20(6):661–9. doi:10.1177/17407745231186094

work page doi:10.1177/17407745231186094 2023
[37]

Model-robust standardization in cluster- randomized trials

Li F, Tong J, Fang X, Cheng C, Kahan BC, Wang B. Model-robust standardization in cluster- randomized trials. Stat Med. 2025;44(20–22):e70270. doi:10.1002/sim.70270 22

work page doi:10.1002/sim.70270 2025
[38]

Analysis of stepped wedge cluster randomized trials in the presence of a time‐varying treatment effect

Kenny A, V oldal EC, Xia F, Heagerty PJ, Hughes JP. Analysis of stepped wedge cluster randomized trials in the presence of a time‐varying treatment effect. Stat Med. 2022;41(22):4311–39. doi:10.1002/sim.9511

work page doi:10.1002/sim.9511 2022
[39]

Mixed effects approach to the analysis of the stepped wedge cluster randomised trial—Investigating the confounding effect of time through simulation

Nickless A, V oysey M, Geddes J, Yu LM, Fanshawe TR. Mixed effects approach to the analysis of the stepped wedge cluster randomised trial—Investigating the confounding effect of time through simulation. PLOS ONE. 2018;13(12):e0208876. doi:10.1371/journal.pone.0208876

work page doi:10.1371/journal.pone.0208876 2018
[40]

How to achieve model-robust inference in stepped wedge trials with model-based methods? Biometrics

Wang B, Wang X, Li F. How to achieve model-robust inference in stepped wedge trials with model-based methods? Biometrics. 2024;80(4):ujae123. doi:10.1093/biomtc/ujae123

work page doi:10.1093/biomtc/ujae123 2024
[41]

Sample size calculation for stepped wedge and other longitudinal cluster randomised trials

Hooper R, Teerenstra S, De Hoop E, Eldridge S. Sample size calculation for stepped wedge and other longitudinal cluster randomised trials. Stat Med. 2016;35(26):4718–28. doi:10.1002/sim.7028

work page doi:10.1002/sim.7028 2016
[42]

Guidelines for the content of statistical analysis plans in clinical trials: Protocol for an extension to cluster randomized trials

Hemming K, Thompson JY , Hooper RL, Ukoumunne OC, Li F, Caille A, et al. Guidelines for the content of statistical analysis plans in clinical trials: Protocol for an extension to cluster randomized trials. Trials. 2025;26(1):72. doi:10.1186/s13063-025-08756-3

work page doi:10.1186/s13063-025-08756-3 2025
[43]

Assessing the effectiveness of COVID-19 vaccine lotteries: A cross-state synthetic control methods approach

Fuller S, Kazemian S, Algara C, Simmons DJ. Assessing the effectiveness of COVID-19 vaccine lotteries: A cross-state synthetic control methods approach. Pereira T, editor. PLOS ONE. 2022;17(9):e0274374. doi:10.1371/journal.pone.0274374

work page doi:10.1371/journal.pone.0274374 2022
[44]

The Ohio vaccine lottery and starting vaccination rates

Brehm ME, Brehm PA, Saavedra M. The Ohio vaccine lottery and starting vaccination rates. Am J Health Econ. 2022 Jun 1;8(3):387–411. doi:10.1086/718512

work page doi:10.1086/718512 2022
[45]

Did Ohio’s vaccine lottery increase vaccination rates? A pre-registered, synthetic control study

Lang D, Esbenshade L, Willer R. Did Ohio’s vaccine lottery increase vaccination rates? A pre-registered, synthetic control study. J Exp Polit Sci. 2023;10(2):242–60. doi:10.1017/XPS.2021.32

work page doi:10.1017/xps.2021.32 2023
[46]

Weeks ending log 2020–2021 [MMWR weeks] [Internet]

Morbidity and Mortality Weekly Report. Weeks ending log 2020–2021 [MMWR weeks] [Internet]. U.S. Centers for Disease Control and Prevention; 2019 Sep [cited 2026 Apr 2]. Available from: https://ndc.services.cdc.gov/wp-content/uploads/W2021-22.pdf

2020
[47]

Quasi-experimental methods for pharmacoepidemiology: difference-in- differences and synthetic control methods with case studies for vaccine evaluation

Kennedy-Shaffer L. Quasi-experimental methods for pharmacoepidemiology: difference-in- differences and synthetic control methods with case studies for vaccine evaluation. Am J Epidemiol. 2024;193(7):1050–8. doi:10.1093/aje/kwae019

work page doi:10.1093/aje/kwae019 2024
[48]

Design and analysis of stepped wedge cluster randomized trials

Hussey MA, Hughes JP. Design and analysis of stepped wedge cluster randomized trials. Contemp Clin Trials. 2007;28(2):182–91. doi:10.1016/j.cct.2006.05.007

work page doi:10.1016/j.cct.2006.05.007 2007
[49]

Monetary incentives increase COVID-19 vaccinations

Campos-Mercade P, Meier AN, Schneider FH, Meier S, Pope D, Wengström E. Monetary incentives increase COVID-19 vaccinations. Science. 2021;374(6569):879–82. doi:10.1126/science.abm0475

work page doi:10.1126/science.abm0475 2021
[50]

swdpwr: A SAS macro and an R package for power calculations in stepped wedge cluster randomized trials

Chen J, Zhou X, Li F, Spiegelman D. swdpwr: A SAS macro and an R package for power calculations in stepped wedge cluster randomized trials. Comput Methods Programs Biomed. 2022;213:106522. doi:10.1016/j.cmpb.2021.106522

work page doi:10.1016/j.cmpb.2021.106522 2022
[51]

Using synthetic controls: Feasibility, data requirements, and methodological aspects

Abadie A. Using synthetic controls: Feasibility, data requirements, and methodological aspects. J Econ Lit. 2021;59(2):391–425. doi:10.1257/jel.20191450

work page doi:10.1257/jel.20191450 2021
[52]

NASHP State Tracker [Internet]

National Academy for State Health Policy. NASHP State Tracker [Internet]. 2025 [cited 2025 Oct 28]. State efforts to limit or enforce COVID-19 vaccine mandates. Available from: https://nashp.org/state-tracker/state-efforts-to-ban-or-enforce-covid-19-vaccine-mandates-and-passports/

2025
[53]

US state vaccine mandates did not influence COVID-19 vaccination rates but reduced uptake of COVID-19 boosters and flu vaccines compared to bans on vaccine restrictions

Rains SA, Richards AS. US state vaccine mandates did not influence COVID-19 vaccination rates but reduced uptake of COVID-19 boosters and flu vaccines compared to bans on vaccine restrictions. Proc Natl Acad Sci. 2024;121(8):e2313610121. doi:10.1073/pnas.2313610121

work page doi:10.1073/pnas.2313610121 2024
[54]

Vaccination mandates and their alternatives and complements

Schmid P, Böhm R, Das E, Holford D, Korn L, Leask J, et al. Vaccination mandates and their alternatives and complements. Nat Rev Psychol. 2024;3(12):789–803. doi:10.1038/s44159-024-00381-2

work page doi:10.1038/s44159-024-00381-2 2024
[55]

US states that mandated COVID-19 vaccination see higher, not lower, take-up of COVID-19 boosters and flu vaccines

Fitzgerald J. US states that mandated COVID-19 vaccination see higher, not lower, take-up of COVID-19 boosters and flu vaccines. Proc Natl Acad Sci. 2024;121(41):e2403758121. doi:10.1073/pnas.2403758121

work page doi:10.1073/pnas.2403758121 2024
[56]

Information content of cluster–period cells in stepped wedge trials

Kasza J, Forbes AB. Information content of cluster–period cells in stepped wedge trials. Biometrics. 2019;75(1):144–52. doi:10.1111/biom.12959

work page doi:10.1111/biom.12959 2019
[57]

Heterogeneous treatment effects and bias in the analysis of the stepped wedge design

Lindner S, McConnell KJ. Heterogeneous treatment effects and bias in the analysis of the stepped wedge design. Health Serv Outcomes Res Methodol. 2021;21(4):419–38. doi:10.1007/s10742-021-00244-w

work page doi:10.1007/s10742-021-00244-w 2021
[58]

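Analysis of stepped‐wedge cluster randomized trials when treatment effects vary by exposure time or calendar time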

Lee KM, Turner EL, Kenny A. Analysis of stepped‐wedge cluster randomized trials when treatment effects vary by exposure time or calendar time. Stat Med. 2025;44(20–22):e70256. doi:10.1002/sim.70256

work page doi:10.1002/sim.70256 2025
[59]

Key considerations for designing, conducting and analysing a cluster randomized trial

Hemming K, Taljaard M. Key considerations for designing, conducting and analysing a cluster randomized trial. Int J Epidemiol. 2023;52(5):1648–58. doi:10.1093/ije/dyad064

work page doi:10.1093/ije/dyad064 2023
[60]

A review of current practice in the design and analysis of extremely small stepped-wedge cluster randomized trials

Tong G, Nevins P, Ryan M, Davis-Plourde K, Ouyang Y, Pereira Macedo JA, et al. A review of current practice in the design and analysis of extremely small stepped-wedge cluster randomized trials. Clin Trials. 2025 Feb;22(1):45–56. doi:10.1177/17407745241276137

work page doi:10.1177/17407745241276137 2025
[61]

Sample size calculators for planning stepped-wedge cluster randomized trials: A review and comparison

Ouyang Y, Li F, Preisser JS, Taljaard M. Sample size calculators for planning stepped-wedge cluster randomized trials: A review and comparison. Int J Epidemiol. 2022;51(6):2000–13. doi:10.1093/ije/dyac123

work page doi:10.1093/ije/dyac123 2022
[62]

Practical considerations for sample size calculation for cluster randomized trials

Leyrat C, Eldridge S, Taljaard M, Hemming K. Practical considerations for sample size calculation for cluster randomized trials. J Epidemiol Popul Health. 2024;72(1):202198. doi:10.1016/j.jeph.2024.202198

work page doi:10.1016/j.jeph.2024.202198 2024
[63]

Novel methods for the analysis of stepped wedge cluster randomized trials

Kennedy-Shaffer L, De Gruttola V, Lipsitch M. Novel methods for the analysis of stepped wedge cluster randomized trials. Stat Med. 2020;39(7):815–44. doi:10.1002/sim.8451

work page doi:10.1002/sim.8451 2020
[64]

Robust analysis of stepped wedge trials using composite likelihood models

Voldal EC, Kenny A, Xia F, Heagerty P, Hughes JP. Robust analysis of stepped wedge trials using composite likelihood models. Stat Med. 2024;43(17):3326–52. doi:10.1002/sim.10120

work page doi:10.1002/sim.10120 2024
[65]

Cluster randomized trial designs for modeling time‐varying intervention effects

Lee KM, Cheung YB. Cluster randomized trial designs for modeling time‐varying intervention effects. Stat Med. 2024;43(1):49–60. doi:10.1002/sim.9941

work page doi:10.1002/sim.9941 2024
[66]

Assessing exposure-time treatment effect heterogeneity in stepped-wedge cluster randomized trials

Maleyeff L, Li F, Haneuse S, Wang R. Assessing exposure-time treatment effect heterogeneity in stepped-wedge cluster randomized trials. Biometrics. 2023;79(3):2551–64. doi:10.1111/biom.13803

work page doi:10.1111/biom.13803 2023
[67]

Planning stepped wedge cluster randomized trials to detect treatment effect heterogeneity

Li F, Chen X, Tian Z, Wang R, Heagerty PJ. Planning stepped wedge cluster randomized trials to detect treatment effect heterogeneity. Stat Med. 2024;43(5):890–911. doi:10.1002/sim.9990

work page doi:10.1002/sim.9990 2024
[68]

Use of the stepped wedge design cannot be recommended: A critical appraisal and comparison with the classic cluster randomized controlled trial design

Kotz D, Spigt M, Arts ICW, Crutzen R, Viechtbauer W. Use of the stepped wedge design cannot be recommended: A critical appraisal and comparison with the classic cluster randomized controlled trial design. J Clin Epidemiol. 2012;65(12):1249–52. doi:10.1016/j.jclinepi.2012.06.004

work page doi:10.1016/j.jclinepi.2012.06.004 2012
[69]

Policy effect evaluation under counterfactual neighbourhood intervention in the presence of spillover

Lee Y, Hettinger G, Mitra N. Policy effect evaluation under counterfactual neighbourhood intervention in the presence of spillover. J R Stat Soc Ser A Stat Soc. 2026;189(1):392–411. doi:10.1093/jrsssa/qnae153

work page doi:10.1093/jrsssa/qnae153 2026
[70]

Two-way fixed effects estimators with heterogeneous treatment effects

de Chaisemartin C, D’Haultfœuille X. Two-way fixed effects estimators with heterogeneous treatment effects. Am Econ Rev. 2020;110(9):2964–96. doi:10.1257/aer.20181169

work page doi:10.1257/aer.20181169 2020
[71]

Are target trial emulations the gold standard for observational studies?

Pearce N, Vandenbroucke JP. Are target trial emulations the gold standard for observational studies? Epidemiology. 2023;34(5):614–8. doi:10.1097/EDE.0000000000001636

work page doi:10.1097/ede.0000000000001636 2023
[72]

Emulating randomized trials: Treading carefully and pushing the limits

Renoux C, Suissa S. Emulating randomized trials: Treading carefully and pushing the limits. Am J Epidemiol. 2025;194(5):1460–1. doi:10.1093/aje/kwaf023

work page doi:10.1093/aje/kwaf023 2025
[73]

Invited Commentary: Conducting and emulating trials to study effects of social interventions

Rojas-Saunero LP, Labrecque JA, Swanson SA. Invited Commentary: Conducting and emulating trials to study effects of social interventions. Am J Epidemiol. 2022;191(8):1453– doi:10.1093/aje/kwac066

work page doi:10.1093/aje/kwac066 2022
[75]

The staircase cluster randomised trial design: A pragmatic alternative to the stepped wedge

Grantham KL, Forbes AB, Hooper R, Kasza J. The staircase cluster randomised trial design: A pragmatic alternative to the stepped wedge. Stat Methods Med Res. 2024;33(1):24–41. doi:10.1177/09622802231202364

work page doi:10.1177/09622802231202364 2024

Appendix 1

We generate example power calculations for three possible study designs based on the target trial emulation of the vaccine lottery polici...

1. Use all states in the CDC-defined Midwest region (12 total states), with all observations from CDC MMWR Weeks 15–30 of 2021.

2. Use the four intervention states in the CDC-defined Midwest region (Ohio, Illinois, Michigan, Missouri) and a matched-comparison state for each, with all observations from CDC MMWR Weeks 15–30 of 2021.

3. Use only the four intervention states in the CDC-defined Midwest region, with all observations from CDC MMWR Weeks 15–30 of 2021.

The schematics for these designs are shown in Figure 1 of the main text. Note that these are only examples of designs that could be considered to illustrate how stepped-wedge trial power calculations can be used in the target t...
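To make the idea of a stepped-wedge power calculation concrete, the following is a minimal sketch based on the closed-form variance of Hussey and Hughes (2007) for a cross-sectional design with a cluster random intercept and categorical period effects. It is not necessarily the calculation used in Appendix 1. The adoption schedules, effect size, and variance components below are hypothetical placeholders, not values from the vaccine lottery example, and the function names `hh_variance` and `hh_power` are illustrative.

```python
from statistics import NormalDist


def hh_variance(design, sigma2, tau2):
    """Variance of the treatment-effect estimator (Hussey-Hughes closed form).

    design: one row per cluster, one 0/1 treatment indicator per period.
    sigma2: variance of a cluster-period mean (individual variance / cell size).
    tau2:   between-cluster (random intercept) variance.
    """
    n_clusters = len(design)
    n_periods = len(design[0])
    u = sum(sum(row) for row in design)                    # total treated cells
    w = sum(sum(row[j] for row in design) ** 2             # squared column sums
            for j in range(n_periods))
    v = sum(sum(row) ** 2 for row in design)               # squared row sums
    numerator = n_clusters * sigma2 * (sigma2 + n_periods * tau2)
    denominator = ((n_clusters * u - w) * sigma2
                   + (u ** 2 + n_clusters * n_periods * u
                      - n_periods * w - n_clusters * v) * tau2)
    return numerator / denominator


def hh_power(design, effect, sigma2, tau2, alpha=0.05):
    """Approximate power of a two-sided Wald test for the treatment effect."""
    se = hh_variance(design, sigma2, tau2) ** 0.5
    z = NormalDist()
    return z.cdf(abs(effect) / se - z.inv_cdf(1 - alpha / 2))


# Hypothetical schedule: four clusters adopting one period apart.
staggered_only = [
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
]
# Same schedule plus two never-treated comparison clusters (all-zero rows).
with_controls = staggered_only + [[0] * 5, [0] * 5]

for name, design in [("staggered only", staggered_only),
                     ("with controls", with_controls)]:
    # effect, sigma2 and tau2 are assumed values chosen for illustration.
    print(name, round(hh_power(design, effect=1.5, sigma2=1.0, tau2=1.0), 3))
```

Because the formula depends on the design only through the treatment indicator matrix, never-treated comparison clusters enter as all-zero rows, which allows designs with and without comparison states to be contrasted on the same footing.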