pith. machine review for the scientific record.

arxiv: 2605.03282 · v1 · submitted 2026-05-05 · 📊 stat.ME

Recognition: unknown

Externally Controlled Trials: A Review of Design and Borrowing Through a Causal Lens

Pith reviewed 2026-05-07 14:41 UTC · model grok-4.3

classification 📊 stat.ME
keywords externally controlled trials · causal inference · Bayesian borrowing · hybrid trial designs · real-world data · sensitivity analysis · single-arm trials · covariate shift

The pith

A six-step causal roadmap organizes methods for externally controlled trials to clarify estimands, assumptions, and borrowing strategies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reviews externally controlled trials where randomized controls are not feasible, such as in rare diseases or oncology. It organizes the scattered literature on causal inference, Bayesian borrowing, and hybrid designs into a single six-step scientific roadmap. This roadmap distinguishes single-arm trials that compare to external controls from hybrid trials that augment internal controls with external data. The approach shows how statistical methods arise from causal identification and how they balance efficiency against robustness when covariate shift or outcome drift occurs. A reader would care because the framework aims to support more reliable integration of real-world or historical data into clinical and regulatory decisions.

Core claim

By framing ECT methodology through a causal lens, the review establishes that a six-step scientific roadmap can organize the field in two settings: single-arm trials that evaluate efficacy against external controls, and hybrid controlled trials that augment internal controls with external data from real-world sources or historical studies. Within that roadmap it clarifies causal estimands, identifiability assumptions, how statistical parameters are derived from identification, and the efficiency-robustness trade-offs of modeling and borrowing approaches under covariate shift and outcome drift.
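
As a hedged illustration of what deriving a statistical parameter from identification can look like in the single-arm setting, the sketch below uses textbook potential-outcome notation that is assumed here rather than taken from the paper: S = 1 marks single-arm trial participants (all treated), S = 0 marks external controls, X is a vector of baseline covariates, and Y(0) is the potential outcome under control. Consistency (the observed Y equals Y(1) in the trial and Y(0) among external controls) is assumed throughout.

```latex
% Notation assumed for illustration only; the paper's own symbols may differ.
\begin{align*}
\tau &= E[\,Y(1) - Y(0) \mid S = 1\,]
  && \text{(causal estimand: effect in the trial population)} \\
E[\,Y(0) \mid X, S = 1\,] &= E[\,Y(0) \mid X, S = 0\,]
  && \text{(mean exchangeability; violated by outcome drift)} \\
P(S = 0 \mid X = x) &> 0 \ \text{ for all } x \text{ in the trial's support}
  && \text{(positivity)} \\
\tau &= E[\,Y \mid S = 1\,] - E[\,\mu_0(X) \mid S = 1\,],
  \quad \mu_0(x) = E[\,Y \mid X = x, S = 0\,]
  && \text{(statistical parameter)}
\end{align*}
```

In this sketch, covariate shift enters through the outer expectation over the trial's covariate distribution, whereas outcome drift is a violation of the exchangeability line, which is why the two call for different remedies.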

What carries the argument

The six-step scientific roadmap that structures ECT methods by first defining causal estimands and identifiability assumptions before deriving statistical parameters and evaluating borrowing strategies.
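
To make the step from identification to a statistical parameter concrete, here is a minimal Python sketch, not the paper's method. It assumes a single binary baseline covariate and made-up stratum proportions and outcome means; every number and function name below is hypothetical. It standardizes external-control outcomes to the trial's covariate mix, which removes bias from covariate shift but leaves bias from outcome drift in place.

```python
# Toy sketch with hypothetical numbers; not taken from the paper.

def weighted_mean(x_dist, mean_by_x):
    """Average stratum-specific outcome means over a covariate distribution."""
    return sum(prob * mean_by_x[x] for x, prob in x_dist.items())

# Covariate shift: the trial enrolls more high-risk patients than the external source.
trial_x_dist = {"low_risk": 0.3, "high_risk": 0.7}
external_x_dist = {"low_risk": 0.6, "high_risk": 0.4}

# Control-outcome means by stratum in the trial population
# (unobserved in a real single-arm trial; fixed here so bias can be read off).
trial_control_mean_by_x = {"low_risk": 10.0, "high_risk": 4.0}

def control_mean_estimates(drift):
    """Return (target, naive, standardized) control means for a given outcome drift.

    `drift` is added to every external stratum mean, so drift=0 means that
    mean exchangeability holds given the covariate.
    """
    external_mean_by_x = {x: m + drift for x, m in trial_control_mean_by_x.items()}
    target = weighted_mean(trial_x_dist, trial_control_mean_by_x)
    naive = weighted_mean(external_x_dist, external_mean_by_x)
    standardized = weighted_mean(trial_x_dist, external_mean_by_x)
    return target, naive, standardized

for drift in (0.0, 1.0):
    target, naive, standardized = control_mean_estimates(drift)
    print(f"drift={drift:.1f} target={target:.2f} "
          f"naive={naive:.2f} standardized={standardized:.2f}")
```

With these inputs the naive external mean is off by 1.8 under covariate shift alone, standardization recovers the target when drift is zero, and a drift of 1.0 carries through to the standardized estimate unchanged.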

If this is right

  • Regulatory and clinical decisions can integrate external data more systematically by following explicit causal estimands and sensitivity analyses.
  • Borrowing strategies in hybrid trials can be chosen based on explicit trade-offs between efficiency and robustness under covariate shift.
  • Both Bayesian dynamic borrowing and frequentist methods can be compared within the same roadmap for operating characteristics and software availability.
  • Sensitivity analysis becomes a required step to assess the impact of outcome drift in single-arm and hybrid settings.
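
The efficiency-robustness trade-off in the bullets above can be sketched analytically for one simple borrowing rule. The example assumes a power prior with a fixed discount `a0`, normal outcomes with known standard deviation, and a flat initial prior, so that the posterior mean of the control mean is a weighted average of the internal and external sample means. This is a static illustration chosen for transparency, not one of the dynamic borrowing methods the paper reviews, and the function name and sample sizes are hypothetical.

```python
def power_prior_summary(n, n0, a0, sigma, drift):
    """Frequentist bias, variance, and MSE of the power-prior posterior mean.

    Assumptions (illustrative): n internal controls and n0 external controls,
    normal outcomes with known SD `sigma`, flat initial prior, fixed discount
    a0 in [0, 1], and an external mean that differs from the internal control
    mean by `drift`. The posterior mean is then
        (n * ybar + a0 * n0 * ybar0) / (n + a0 * n0).
    """
    effective_n = n + a0 * n0
    bias = a0 * n0 * drift / effective_n
    variance = sigma ** 2 * (n + a0 ** 2 * n0) / effective_n ** 2
    return {"bias": bias, "variance": variance, "mse": bias ** 2 + variance}

# Hypothetical design: 50 internal controls, 200 external controls, unit SD.
for drift in (0.0, 0.5):
    for a0 in (0.0, 0.5, 1.0):
        s = power_prior_summary(n=50, n0=200, a0=a0, sigma=1.0, drift=drift)
        print(f"drift={drift:.1f} a0={a0:.1f} "
              f"bias={s['bias']:.3f} var={s['variance']:.4f} mse={s['mse']:.4f}")
```

Under these assumptions, full borrowing (`a0 = 1`) cuts the variance from 0.02 to 0.004 when there is no drift, but a drift of 0.5 pushes its mean squared error to 0.164, well above the no-borrowing value of 0.02. Dynamic methods try to move `a0` toward zero when the data signal such a conflict.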

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The roadmap could be extended to post-approval effectiveness studies by adding steps that handle time-varying external data sources.
  • Future work might test the roadmap against real trial applications to quantify how often identifiability assumptions hold in oncology or pediatric settings.
  • Connecting the causal steps to specific software tools could reduce the gap between methodological guidance and practical implementation.

Load-bearing premise

The rapidly expanding but fragmented literature on ECTs can be accurately synthesized into one coherent six-step causal roadmap without material omissions or selection bias in the methods reviewed.

What would settle it

A documented ECT method or recent development that cannot be placed within the six-step roadmap while preserving the stated causal estimands and identifiability assumptions would falsify the claimed coherence of the synthesis.

Figures

Figures reproduced from arXiv: 2605.03282 by Birol Emir, Hana Lee, Herbert Pang, Ke Zhu, Lei Nie, Mark van der Laan, Pallavi Mishra-Kalyani, Peng Yang, Rima Izem, Shu Yang, Ying Yuan.

Figure 1: Two ECT designs and scientific roadmap. Gray assignment ratios are illustrative.
Figure 2: Illustration of covariate shift and outcome drift.
Original abstract

Externally controlled trials (ECTs) are increasingly used when randomized controls are infeasible, unethical, or insufficient, including applications in rare diseases, oncology, pediatrics, and post-approval effectiveness research. Although methodological work has expanded rapidly across causal inference, Bayesian dynamic borrowing, and hybrid trial designs, the literature remains fragmented. We adopt a six-step scientific roadmap to organize modern ECT methodology in two primary settings: (i) single-arm trials that evaluate efficacy through comparison with external controls, and (ii) hybrid controlled trials that augment the internal control arm with external controls drawn from real-world data or historical studies. The roadmap clarifies causal estimands, identifiability assumptions, and how statistical parameters arise from identification, and shows how modeling and borrowing strategies trade off efficiency and robustness, especially under covariate shift and outcome drift. Within this framework, we synthesize and evaluate recent Bayesian and frequentist developments, compare their strengths, limitations, operating characteristics, and available software, and emphasize the role of sensitivity analysis. By re-framing ECT methodology through a causal lens, this work establishes a coherent foundation for integrating external data into regulatory and clinical decision-making and highlights core challenges and opportunities for future research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper reviews externally controlled trials (ECTs) by adopting a causal inference lens to organize the fragmented literature. It proposes a six-step scientific roadmap for two settings: single-arm trials evaluating efficacy via external controls and hybrid trials augmenting internal controls with external real-world or historical data. The roadmap addresses causal estimands, identifiability assumptions, derivation of statistical parameters from identification, modeling and borrowing strategies (Bayesian dynamic borrowing and frequentist), efficiency-robustness trade-offs under covariate shift and outcome drift, comparisons of strengths/limitations/operating characteristics/software, and the role of sensitivity analysis. The central aim is to establish a coherent foundation for integrating external data into regulatory and clinical decision-making while highlighting future challenges.

Significance. If the synthesis holds, this work could meaningfully unify developments across causal inference, Bayesian borrowing, and hybrid designs, providing a structured framework that promotes transparency and rigor in ECT applications, especially in rare diseases, oncology, and pediatrics. The causal reframing, emphasis on identifiability, sensitivity analyses, and explicit comparison of Bayesian versus frequentist operating characteristics represent strengths that could guide practitioners and regulators. The roadmap format offers a practical tool for future research on robustness under realistic violations like outcome drift.

major comments (2)
  1. [Six-step causal roadmap] Six-step causal roadmap (as described in the abstract and detailed in the main sections): the central claim of a comprehensive, coherent synthesis requires explicit documentation of literature search strategy, inclusion/exclusion criteria, and coverage of identifiability conditions under outcome drift and covariate shift. Without these, the roadmap risks material omissions that undermine the foundation for regulatory integration.
  2. [Borrowing strategies and operating characteristics] Section on borrowing strategies and operating characteristics: the evaluation of Bayesian versus frequentist approaches should include specific quantitative comparisons (e.g., bias, coverage, or power under drift scenarios) tied to the roadmap steps, as these are load-bearing for the claimed trade-off insights and software recommendations.
minor comments (2)
  1. [Abstract] Abstract: while the roadmap is outlined, briefly enumerating the six steps would enhance immediate clarity for readers scanning the paper.
  2. [Software and implementation] References and software: ensure all cited software tools and packages include direct links or DOIs to support reproducibility claims.
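
As a hedged illustration of the kind of quantitative comparison requested in major comment 2, the Monte Carlo sketch below estimates the coverage of nominal 95% intervals for the internal control mean under two extreme strategies: no borrowing and full pooling of external controls. The setup is hypothetical and deliberately simple (normal outcomes, known standard deviation, sample means drawn from their exact sampling distributions). It is not drawn from the paper or from any simulation study it cites.

```python
import math
import random

def simulate_coverage(n, n0, drift, sigma=1.0, reps=20000, seed=2026):
    """Monte Carlo coverage of nominal 95% intervals for the internal control mean.

    Hypothetical setup: the internal control mean is 0, the external control
    mean is `drift`, and outcomes are normal with known SD `sigma`.
    """
    rng = random.Random(seed)
    z = 1.96
    se_internal = sigma / math.sqrt(n)
    se_external = sigma / math.sqrt(n0)
    se_pooled = sigma / math.sqrt(n + n0)
    hits_no_borrowing = 0
    hits_full_pooling = 0
    for _ in range(reps):
        ybar = rng.gauss(0.0, se_internal)       # internal control sample mean
        ybar0 = rng.gauss(drift, se_external)    # external control sample mean
        pooled = (n * ybar + n0 * ybar0) / (n + n0)
        # An interval covers when the true internal mean (0) lies within z * SE.
        hits_no_borrowing += abs(ybar) <= z * se_internal
        hits_full_pooling += abs(pooled) <= z * se_pooled
    return {
        "no_borrowing": hits_no_borrowing / reps,
        "full_pooling": hits_full_pooling / reps,
    }

for drift in (0.0, 0.3):
    result = simulate_coverage(n=50, n0=200, drift=drift)
    print(f"drift={drift:.1f} no_borrowing={result['no_borrowing']:.3f} "
          f"full_pooling={result['full_pooling']:.3f}")
```

In this toy setting both strategies hold close to nominal coverage without drift, while full pooling loses most of its coverage once the external mean drifts, because its narrower interval is centered on a biased estimate. A comparison tied to the roadmap would report the same metrics for the dynamic borrowing methods under review.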

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive review, which highlights opportunities to strengthen the transparency and practical utility of our synthesis. We address each major comment below and outline the revisions we will make to the manuscript.

Point-by-point responses
  1. Referee: [Six-step causal roadmap] Six-step causal roadmap (as described in the abstract and detailed in the main sections): the central claim of a comprehensive, coherent synthesis requires explicit documentation of literature search strategy, inclusion/exclusion criteria, and coverage of identifiability conditions under outcome drift and covariate shift. Without these, the roadmap risks material omissions that undermine the foundation for regulatory integration.

    Authors: We agree that greater transparency regarding literature selection will improve the manuscript. Although the paper is a targeted synthesis organized by the causal roadmap rather than a formal systematic review, we will add a new subsection (likely in the Introduction) that explicitly describes the scope of the literature considered, the primary search terms and databases used, and the inclusion criteria centered on methodological contributions to causal estimands, identifiability, borrowing, and sensitivity analysis in ECTs. On identifiability under outcome drift and covariate shift, Sections 3–4 and Step 6 already derive the relevant assumptions and discuss sensitivity to these violations; we will expand these passages with additional explicit statements and citations to recent work on drift-robust identification, ensuring the roadmap steps directly reference how such violations are diagnosed and mitigated.

    Revision: yes

  2. Referee: [Borrowing strategies and operating characteristics] Section on borrowing strategies and operating characteristics: the evaluation of Bayesian versus frequentist approaches should include specific quantitative comparisons (e.g., bias, coverage, or power under drift scenarios) tied to the roadmap steps, as these are load-bearing for the claimed trade-off insights and software recommendations.

    Authors: We acknowledge that while the current text reviews the conceptual trade-offs and cites studies reporting operating characteristics, it does not extract or tabulate specific numerical results (bias, coverage, power) under drift. We will revise the borrowing-strategies section to include concise summaries of quantitative findings from key referenced simulation studies, explicitly mapping each result to the relevant roadmap step (e.g., Step 4 identifiability and Step 5 modeling). This will be presented in a new table or expanded prose that contrasts Bayesian dynamic borrowing and frequentist methods under covariate shift and outcome drift, thereby grounding the efficiency-robustness discussion and software recommendations in concrete evidence from the literature without requiring new simulations.

    Revision: partial

Circularity Check

0 steps flagged

Review paper with no internal derivations or predictions

Full rationale

This is a review paper that synthesizes and organizes existing literature on externally controlled trials via a causal lens and a six-step roadmap. It references prior work across causal inference, Bayesian borrowing, and hybrid designs without performing any new derivations, predictions, or parameter fittings that could reduce to the paper's own inputs by construction. The central claim of establishing a coherent foundation rests on accurate literature synthesis rather than self-referential steps, and no equations or claims exhibit the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a review paper; no new free parameters, axioms, or invented entities are introduced by the authors. The work relies on standard causal inference concepts and assumptions drawn from the cited literature.

pith-pipeline@v0.9.0 · 5542 in / 1069 out tokens · 65547 ms · 2026-05-07T14:41:22.077881+00:00 · methodology
