Optimal sequential two-stage Bayes Factor Design for two-arm clinical Phase II Trials with binary Endpoints
Pith reviewed 2026-06-28 13:08 UTC · model grok-4.3
The pith
Exact correction formulas allow simulation-free calibration of optimal two-stage Bayes factor designs for two-arm phase II trials.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that, for a two-stage two-arm design with a single futility interim, Bayesian power and type-I error equal the corresponding fixed-sample quantities minus the contributions from trajectories removed at the interim. These exact corrections enable a fully numerical search over admissible interim and final sample sizes that minimizes expected sample size under the null, subject to constraints on power, type-I error, and the probability of compelling evidence for the null.
What carries the argument
The correction formulas that adjust fixed-sample Bayes factor operating characteristics for trajectories removed by early stopping at the futility interim.
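As a rough illustration of what such a correction looks like, the identity can be written as below. The notation is assumed here and is not taken from the paper: Y_{n_1} and Y_n are the pairs of response counts at the interim and final analyses, k is the threshold below which BF01 counts as evidence against the null, and k_f is the futility threshold at the interim.

```latex
% Notation assumed for illustration only (not the paper's own symbols).
\begin{align*}
\Pr(\text{reject } H_0)_{\text{two-stage}}
  &= \underbrace{\Pr\bigl(\mathrm{BF}_{01}(Y_n) < k\bigr)}_{\text{fixed-sample quantity}}
   - \underbrace{\Pr\bigl(\mathrm{BF}_{01}(Y_{n_1}) > k_f,\ \mathrm{BF}_{01}(Y_n) < k\bigr)}_{\text{trajectories removed at the interim}}
\end{align*}
```

Evaluating both probabilities under the alternative would give Bayesian power, and evaluating them under the null would give the type-I error.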
If this is right
- The procedure yields designs whose operating characteristics are known exactly rather than estimated with simulation error.
- The search identifies the admissible design that minimizes expected sample size under the null while satisfying the target constraints.
- The same numerical calibration applies directly to re-analysis of completed two-arm phase II trials such as the riociguat trial.
- Sensitivity checks on prior choices or target thresholds become straightforward because each candidate design is evaluated without Monte Carlo variability.
Where Pith is reading between the lines
- The correction approach could be tested on designs that allow early stopping for superiority as well as futility.
- Extension to three or more stages would require deriving analogous but more involved trajectory corrections.
- The method might be applied to non-binary endpoints once fixed-sample Bayes factor formulas for those endpoints are available.
- Trial planners in other disease areas could adopt the same search to reduce average patient exposure under the null hypothesis.
Load-bearing premise
The correction formulas that work for one-arm two-stage designs remain valid when extended to the two-arm binary-endpoint case with one futility interim.
What would settle it
Compute the Bayesian power and type-I error of the derived optimal design both with the exact correction formulas and with an independent Monte Carlo simulation of the same design; mismatch between the two values would falsify the claim.
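A minimal sketch of such a check follows. It makes several assumptions that are not in the source: a non-directional Bayes factor for H0: p1 = p2 against independent Beta(1, 1) priors, point values for the true response probabilities, and illustrative thresholds `k` and `k_f`. The function names are hypothetical. It compares exact enumeration of a two-stage design with a plain Monte Carlo simulation of the same design.

```python
import random
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bf01(y1, y2, n, a=1.0, b=1.0):
    """BF01 for H0: p1 == p2 vs H1: p1, p2 independent, Beta(a, b) priors.

    Assumed test for illustration; binomial coefficients cancel in the ratio.
    """
    log_m0 = log_beta(a + y1 + y2, b + 2 * n - y1 - y2) - log_beta(a, b)
    log_m1 = (log_beta(a + y1, b + n - y1) + log_beta(a + y2, b + n - y2)
              - 2 * log_beta(a, b))
    return exp(log_m0 - log_m1)


def pmf(y, n, p):
    """Binomial probability mass function."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def exact_reject_prob(n1, n, p1, p2, k=1 / 3, k_f=1.5):
    """Exact P(no futility stop at n1 and BF01 < k at n), by enumeration."""
    n2 = n - n1
    total = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n1 + 1):
            if bf01(x1, x2, n1) > k_f:
                continue  # trajectory removed by the futility stop
            w = pmf(x1, n1, p1) * pmf(x2, n1, p2)
            for z1 in range(n2 + 1):
                for z2 in range(n2 + 1):
                    if bf01(x1 + z1, x2 + z2, n) < k:
                        total += w * pmf(z1, n2, p1) * pmf(z2, n2, p2)
    return total


def mc_reject_prob(n1, n, p1, p2, k=1 / 3, k_f=1.5, reps=20000, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    n2 = n - n1
    hits = 0
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n1))
        if bf01(x1, x2, n1) > k_f:
            continue  # simulated trial stops for futility
        y1 = x1 + sum(rng.random() < p1 for _ in range(n2))
        y2 = x2 + sum(rng.random() < p2 for _ in range(n2))
        hits += bf01(y1, y2, n) < k
    return hits / reps


exact = exact_reject_prob(5, 15, 0.2, 0.6)
approx = mc_reject_prob(5, 15, 0.2, 0.6)
print(f"exact={exact:.4f}  monte_carlo={approx:.4f}")
```

Agreement within a few Monte Carlo standard errors is what one would expect if the enumeration is right; a gap well beyond that would point to an enumeration or re-weighting error.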
Original abstract
Two-arm phase II clinical trials often benefit from an interim analysis that allows early stopping for futility, but Bayesian calibration of such designs is usually based on computationally intensive Monte Carlo simulation. In this work, a simulation-free methodology is developed to obtain Bayesian optimal two-stage designs in two-arm phase II trials with binary endpoints using Bayes factors as the primary measure of evidence. Building on recent matrix-search methods for fixed-sample two-arm Bayes factor designs and earlier correction formulas for one-arm two-stage designs, the proposed approach derives exact expressions for the operating characteristics of a two-stage two-arm design with a single futility interim. Bayesian power and type-I error are obtained by correcting the corresponding fixed-sample quantities for trajectories that would have been removed by early stopping, yielding a fully numerical calibration procedure that avoids Monte Carlo error entirely. The resulting method searches over admissible interim and final sample sizes to identify the optimal design that satisfies target constraints on Bayesian power, type-I error, and the probability of compelling evidence in favour of the null hypothesis, while minimizing the expected sample size under the null hypothesis. The methodology is illustrated in realistic phase II settings, including a detailed re-analysis of the riociguat trial in systemic sclerosis. Overall, the approach extends simulation-free Bayes factor design methodology to the practically important setting of two-arm two-stage phase II trials and provides a transparent basis for Bayesian design calibration and sensitivity analysis.
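The search described in the abstract can be sketched roughly as follows. This is not the paper's algorithm or its matrix-search implementation. It is a brute-force enumeration that assumes a non-directional Bayes factor with Beta(1, 1) priors, point null and alternative response probabilities, one shared threshold `k_f` for the futility stop and for evidence in favour of the null, and a futility stop counted as evidence for the null. All function names, grids, and thresholds are hypothetical.

```python
from functools import lru_cache
from math import comb, exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


@lru_cache(maxsize=None)
def bf01(y1, y2, n, a=1.0, b=1.0):
    """BF01 for H0: p1 == p2 vs H1: p1, p2 independent (assumed test)."""
    log_m0 = log_beta(a + y1 + y2, b + 2 * n - y1 - y2) - log_beta(a, b)
    log_m1 = (log_beta(a + y1, b + n - y1) + log_beta(a + y2, b + n - y2)
              - 2 * log_beta(a, b))
    return exp(log_m0 - log_m1)


def pmf(y, n, p):
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def operating_chars(n1, n, p1, p2, k=1 / 3, k_f=1.5):
    """Return (P(reject H0), P(evidence for H0), P(continue past interim))."""
    n2 = n - n1
    reject = null_ev = cont = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n1 + 1):
            w = pmf(x1, n1, p1) * pmf(x2, n1, p2)
            if bf01(x1, x2, n1) > k_f:
                null_ev += w  # assumption: futility stop counts for H0
                continue
            cont += w
            for z1 in range(n2 + 1):
                for z2 in range(n2 + 1):
                    v = w * pmf(z1, n2, p1) * pmf(z2, n2, p2)
                    bf = bf01(x1 + z1, x2 + z2, n)
                    if bf < k:
                        reject += v
                    elif bf > k_f:
                        null_ev += v
    return reject, null_ev, cont


def search(p0, p1_alt, p2_alt, n_grid, n1_grid,
           max_alpha, min_power, min_null_ev):
    """Design minimizing expected per-arm sample size under H0, or None."""
    best = None
    for n in n_grid:
        for n1 in n1_grid:
            if n1 >= n:
                continue
            alpha, null_ev, cont0 = operating_chars(n1, n, p0, p0)
            power, _, _ = operating_chars(n1, n, p1_alt, p2_alt)
            if alpha > max_alpha or power < min_power or null_ev < min_null_ev:
                continue  # design is not admissible
            en0 = n1 + (n - n1) * cont0  # expected per-arm size under H0
            if best is None or en0 < best[0]:
                best = (en0, n1, n)
    return best


# Hypothetical targets; the search returns None if no design meets them.
print(search(0.3, 0.2, 0.6, range(10, 21, 5), (5, 10),
             max_alpha=0.10, min_power=0.50, min_null_ev=0.20))
```

Because every candidate is evaluated by exact enumeration, rerunning the search with different priors or targets gives results that differ only because the inputs changed, not because of simulation noise.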
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a simulation-free methodology for Bayesian optimal two-stage two-arm phase II designs with binary endpoints and a single futility interim, using Bayes factors. It builds on matrix-search methods for fixed-sample two-arm designs and prior one-arm correction formulas to derive exact expressions for operating characteristics by subtracting probabilities of early-stopping trajectories from fixed-sample quantities, yielding a fully numerical calibration procedure that avoids Monte Carlo error. The approach searches admissible interim and final sample sizes to minimize expected sample size under the null subject to constraints on Bayesian power, type-I error, and probability of compelling evidence for the null, and illustrates the method on realistic settings including a re-analysis of the riociguat trial.
Significance. If the central extension holds, the work provides a clear advance by delivering exact, reproducible Bayesian design calibration for two-arm two-stage trials without simulation error. The numerical optimization under explicit constraints on power, type-I error, and null evidence probability, together with the focus on expected sample size under the null, supplies a transparent basis for sensitivity analysis and practical phase II planning that extends prior one-arm and fixed-sample results.
major comments (1)
- [Methodology (abstract description and §3)] The load-bearing step is the claim that the one-arm correction formulas extend exactly to the two-arm bivariate binomial setting (abstract, methodology paragraph). In the two-arm case the interim data form a pair of counts whose joint distribution is bivariate, the Bayes factor depends on both arms, and removable trajectories are defined on the product space; any mismatch in enumeration or re-weighting would make the reported Bayesian power and type-I error inexact. The manuscript must supply an explicit derivation or worked numerical example (e.g., in the methods section) showing how the joint-path corrections are constructed from the fixed-sample matrix-search quantities, rather than asserting direct inheritance from the one-arm case.
Simulated Author's Rebuttal
We thank the referee for their thorough review and for highlighting the need for greater transparency in the methodological extension. We address the single major comment below and will incorporate the requested material in the revision.
Referee: [Methodology (abstract description and §3)] The load-bearing step is the claim that the one-arm correction formulas extend exactly to the two-arm bivariate binomial setting (abstract, methodology paragraph). In the two-arm case the interim data form a pair of counts whose joint distribution is bivariate, the Bayes factor depends on both arms, and removable trajectories are defined on the product space; any mismatch in enumeration or re-weighting would make the reported Bayesian power and type-I error inexact. The manuscript must supply an explicit derivation or worked numerical example (e.g., in the methods section) showing how the joint-path corrections are constructed from the fixed-sample matrix-search quantities, rather than asserting direct inheritance from the one-arm case.
Authors: We agree that an explicit derivation and a small-scale numerical illustration are necessary to make the bivariate extension fully transparent. The current manuscript states that exact expressions are obtained by subtracting the probabilities of early-stopping trajectories from the fixed-sample matrix-search quantities, but does not walk through the joint enumeration. In the revised version we will insert a new subsection (likely §3.2) that (i) defines the product-space trajectories for the pair of binomial counts, (ii) shows how the removable set is identified by evaluating the Bayes factor at the interim look, and (iii) provides a fully worked numerical example with n1=5 per arm, n=15 per arm, and explicit probability calculations over the possible interim outcomes. This addition will confirm that the correction remains exact and that no Monte Carlo error is introduced. Revision: yes
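A small example of the kind requested, with n1=5 and n=15 per arm, might look like the sketch below. It is not the authors' derivation. It assumes a non-directional Bayes factor with Beta(1, 1) priors, hypothetical true response probabilities, and illustrative thresholds. With n1=5 per arm, the interim outcome is one of 6 × 6 = 36 pairs of response counts. The sketch identifies the removable set among them and checks that the fixed-sample probability minus the removed contribution matches a direct sum over continuing trajectories.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bf01(y1, y2, n, a=1.0, b=1.0):
    """BF01 for H0: p1 == p2 vs H1: p1, p2 independent (assumed test)."""
    log_m0 = log_beta(a + y1 + y2, b + 2 * n - y1 - y2) - log_beta(a, b)
    log_m1 = (log_beta(a + y1, b + n - y1) + log_beta(a + y2, b + n - y2)
              - 2 * log_beta(a, b))
    return exp(log_m0 - log_m1)


def pmf(y, n, p):
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def worked_example(n1=5, n=15, p1=0.2, p2=0.5, k=1 / 3, k_f=1.5):
    """Compare the correction formula with direct two-stage enumeration."""
    n2 = n - n1

    def joint(c1, c2, m):
        # Joint probability of the two arms' counts out of m patients each.
        return pmf(c1, m, p1) * pmf(c2, m, p2)

    def reject_after(x1, x2):
        # P(final BF01 < k | interim counts), summing over stage-2 counts.
        return sum(joint(z1, z2, n2)
                   for z1 in range(n2 + 1) for z2 in range(n2 + 1)
                   if bf01(x1 + z1, x2 + z2, n) < k)

    interim = [(x1, x2) for x1 in range(n1 + 1) for x2 in range(n1 + 1)]
    removable = {pair for pair in interim if bf01(*pair, n1) > k_f}

    # Fixed-sample probability of rejecting H0 at n per arm.
    fixed = sum(joint(y1, y2, n)
                for y1 in range(n + 1) for y2 in range(n + 1)
                if bf01(y1, y2, n) < k)
    # Contribution of trajectories that the futility stop removes.
    removed = sum(joint(*pair, n1) * reject_after(*pair) for pair in removable)
    # Direct sum over trajectories that continue past the interim.
    direct = sum(joint(*pair, n1) * reject_after(*pair)
                 for pair in interim if pair not in removable)
    return {"n_interim_pairs": len(interim), "removable": sorted(removable),
            "fixed": fixed, "removed": removed,
            "corrected": fixed - removed, "direct": direct}


result = worked_example()
print(result["n_interim_pairs"], len(result["removable"]))
print(result["corrected"], result["direct"])
```

The two printed probabilities should agree up to floating-point rounding, since the stage-wise binomial probabilities convolve to the fixed-sample ones.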
Circularity Check
No significant circularity; derivation extends prior independent methods without self-referential reduction
The paper explicitly builds its two-stage correction formulas on cited prior matrix-search methods for fixed-sample designs and one-arm correction formulas. The central step—subtracting probabilities of early-stopping trajectories from fixed-sample Bayes factor operating characteristics to obtain exact two-arm quantities—is presented as a direct mathematical extension rather than a redefinition or fit of the target quantities themselves. No equation reduces the claimed exact Bayesian power or type-I error to a fitted parameter or self-citation chain by construction; the optimization over admissible sample sizes under explicit constraints is a standard numerical search independent of the paper's own outputs. The extension to bivariate binomial trajectories is the novel content and does not collapse to the inputs.