Anytime-valid testing with e-values and confirmatory adaptive designs

Lasse Fischer; Werner Brannath

arxiv: 2606.00878 · v1 · pith:4YEE36ZQnew · submitted 2026-05-30 · 📊 stat.ME

Pith reviewed 2026-06-28 18:01 UTC · model grok-4.3

keywords confirmatory adaptive designs · e-values · anytime-valid inference · combination tests · conditional error functions · sequential testing · clinical trials

The pith

Confirmatory adaptive designs are formally equivalent to e-value based anytime-valid sequential tests.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that tools from confirmatory adaptive designs, such as conditional error functions and combination tests, are mathematically equivalent to e-value based tests that support anytime-valid inference. The equivalence shows that two independently developed approaches to flexible statistical testing reach the same underlying mechanism. A sympathetic reader would care because the result unifies methods that permit mid-study adaptations like sample size re-assessments or endpoint selection while preserving validity. The work further contrasts their emphases, noting that adaptive designs typically aim to exhaust type I error under the allowed flexibility whereas e-value methods stress optional continuation, level choice, and loss-function control. It indicates routes for each approach to inform the other.

Core claim

Adaptive design tools like conditional error functions and combination tests are formally equivalent to e-value based, anytime-valid sequential tests. The two frameworks share the goal of introducing flexibility into statistical inference yet differ in focus: combination tests and conditional error functions generally seek to exhaust type I error rates, while e-value testing additionally emphasizes optional stopping, chosen significance levels, and extensions to loss functions. The equivalence is shown under the standard constructions given in the respective literatures.

What carries the argument

The formal mapping between conditional error functions, combination tests, and e-values that establishes their equivalence for sequential testing.
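
For orientation, the simplest two-stage case can be sketched from the standard definitions alone. This is an illustration, not necessarily the paper's own notation or construction. The symbols A, E_1 and E_2 are introduced here for the sketch, and it assumes that the stage-one p-value P_1 is uniform under the null hypothesis and that the stage-two p-value P_2 remains valid conditionally on stage one.

```latex
% Conditional error function A : [0,1] -> [0,1] at level \alpha, with decision rule
\begin{align*}
  \int_0^1 A(p_1)\,dp_1 &\le \alpha,
  & \text{reject } H_0 &\iff P_2 \le A(P_1). \\
\intertext{Rescaling by $\alpha$ gives a stage-one e-value bounded by $1/\alpha$:}
  E_1 &= \frac{A(P_1)}{\alpha},
  & \mathbb{E}_{H_0}[E_1] &\le 1. \\
\intertext{The second stage contributes a conditional e-value (with the convention $0/0 = 0$):}
  E_2 &= \frac{\mathbf{1}\{P_2 \le A(P_1)\}}{A(P_1)},
  & \mathbb{E}_{H_0}[E_2 \mid P_1] &\le 1. \\
\intertext{The product crosses $1/\alpha$ exactly when the adaptive test rejects:}
  E_1 E_2 &= \frac{\mathbf{1}\{P_2 \le A(P_1)\}}{\alpha},
  & E_1 E_2 \ge \frac{1}{\alpha} &\iff P_2 \le A(P_1). \\
\intertext{Conversely, any stage-one e-value $E_1$ induces a conditional error function:}
  A &= \min(1, \alpha E_1),
  & \mathbb{E}_{H_0}[A] &\le \alpha.
\end{align*}
```

Read this way, the conditional error is the part of the level α that stage one leaves available for stage two, which is the role the accumulated e-value plays in a sequential e-value test.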

If this is right

E-value methods can supply optional-continuation properties to confirmatory adaptive designs.
Adaptive design techniques can tighten error-rate control within e-value frameworks.
The equivalence allows direct transfer of level choice and loss-function extensions between the two areas.
Clinical trial protocols can adopt elements from both literatures without violating validity.
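
The optional-continuation property in the first item can be illustrated with a toy simulation. The sketch below is not taken from the paper: it assumes a Gaussian likelihood-ratio e-process for testing N(0,1) against N(delta,1), and the function names and parameter values are hypothetical. Under the null hypothesis, Ville's inequality bounds the probability that the process ever reaches 1/alpha by alpha, so monitoring after every observation should not inflate the type I error.

```python
import math
import random


def max_e_process(n_steps, delta, rng):
    """Running maximum of a likelihood-ratio e-process, data drawn under the null.

    The e-process is prod_i exp(delta * x_i - delta**2 / 2) for testing
    N(0, 1) against N(delta, 1); it is tracked on the log scale.
    """
    log_e = 0.0
    max_log_e = 0.0  # the process starts at 1, i.e. log value 0
    for _ in range(n_steps):
        x = rng.gauss(0.0, 1.0)  # observation generated under H0
        log_e += delta * x - 0.5 * delta ** 2
        max_log_e = max(max_log_e, log_e)
    return math.exp(max_log_e)


def crossing_rate(alpha=0.05, n_paths=2000, n_steps=100, delta=0.5, seed=1):
    """Fraction of null paths whose e-process ever reaches 1/alpha."""
    rng = random.Random(seed)
    threshold = 1.0 / alpha
    hits = sum(
        max_e_process(n_steps, delta, rng) >= threshold for _ in range(n_paths)
    )
    return hits / n_paths


if __name__ == "__main__":
    # Rejecting at the first crossing is "peeking" after every observation.
    print("empirical type I error with continuous monitoring:", crossing_rate())
```

The empirical rate should stay at or below alpha up to simulation noise, whereas a fixed-sample z-test re-evaluated after every observation would typically not.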

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The mapping may let anytime-validity guarantees move directly into existing adaptive trial software.
Hybrid procedures could be built that use e-value optional stopping inside a conditional error function skeleton.
Similar equivalences might be checked for other sequential methods such as group-sequential boundaries.

Load-bearing premise

The claimed equivalence holds under the specific constructions of conditional error functions, combination tests, and e-values as defined in their respective literatures.

What would settle it

A concrete counterexample of a conditional error function or combination test that cannot be rewritten as an e-value (or vice versa) under the paper's definitions would falsify the equivalence.
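
A numerical check of this kind can be sketched for one specific design. The example below is an illustration under stated assumptions, not the paper's procedure: it takes Fisher's product combination test without early stopping (reject when p1 * p2 <= c, with c solving c - c*ln(c) = alpha), whose conditional error function is A(p1) = min(1, c / p1), and verifies that A(P1) / alpha integrates to 1 under a uniform P1, as an e-value must. All function names are hypothetical.

```python
import math


def fisher_critical_value(alpha, tol=1e-12):
    """Solve c - c*ln(c) = alpha for c in (0, alpha) by bisection.

    For independent uniform p-values, P(P1 * P2 <= c) = c - c*ln(c).
    """
    lo, hi = 1e-15, alpha
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - mid * math.log(mid) > alpha:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def conditional_error(p1, c):
    """Conditional error function of the product test: P(P2 <= c / p1)."""
    return min(1.0, c / p1)


def e_value_from_cef(p1, c, alpha):
    """Candidate stage-one e-value obtained by rescaling the conditional error."""
    return conditional_error(p1, c) / alpha


def mean_under_null(g, n=200_000):
    """Midpoint-rule approximation of E[g(P1)] for P1 ~ Uniform(0, 1)."""
    return sum(g((i + 0.5) / n) for i in range(n)) / n


if __name__ == "__main__":
    alpha = 0.05
    c = fisher_critical_value(alpha)
    mean_e = mean_under_null(lambda p: e_value_from_cef(p, c, alpha))
    print(f"critical value c = {c:.6f}")
    print(f"E[A(P1) / alpha] under H0 = {mean_e:.6f}")
```

A design for which this expectation exceeded 1, or for which the rescaled quantity could not be defined, would be the kind of counterexample described above; for the product test the expectation equals 1 because the test exhausts the level.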

Original abstract

Confirmatory adaptive designs were introduced more than 30 years ago and enable, for example, sample size re-assessments and the selection of treatments, endpoints, and subpopulations during the course of a clinical trial. Recently, sequential tests based on e-values for anytime-valid inference have been developed, promising seemingly similar or even greater flexibility and utility. In this note, we compare these two independently developed concepts, shedding light on their formal and methodological connections and differences. Specifically, we show that adaptive design tools like conditional error functions and combination tests are formally equivalent to e-value based, anytime-valid sequential tests. However, in spite of their common fundamental intention to bring flexibility into statistical inference, they have quite different emphases: while hypothesis testing with combination tests and conditional error functions usually intends to exhaust type I error rates under the offered flexibility, e-value based testing aims at additional flexibility with regard to optional continuation, the chosen level and, in recent extensions, the loss functions to be controlled. We also indicate how recent e-value achievements could enrich clinical trial methodology and how adaptive design methodology could inspire and improve e-value based testing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper's core contribution is showing a formal equivalence between e-value anytime-valid tests and adaptive design tools like conditional error functions, presented as a new link between the two literatures.

The main thing to know is that this note claims adaptive design tools (conditional error functions and combination tests) are formally equivalent to e-value based anytime-valid sequential tests. The authors treat the explicit mapping as a fresh observation not already in the cited literature.

The paper does a solid job laying out the shared goal of flexible inference in clinical trials while spelling out the different emphases. Adaptive designs usually aim to exhaust type I error under the allowed adaptations, whereas e-value methods prioritize optional continuation, level choice, and loss-function control. The note also flags concrete ways each side could borrow from the other, which is a useful pointer.

The soft spots are limited. The equivalence is asserted for the standard constructions in each field, and the stress-test finds no obvious internal break or hidden assumption that would invalidate the claim. Still, the actual derivation steps are not visible in the abstract, so the strength of the result depends on whether the mapping is exact or requires particular choices. The scope stays inside biostatistics, so the work does not reach beyond that subfield.

This is for readers already working on sequential methods or confirmatory adaptive designs. Someone in those areas would get value from the direct comparison. It deserves serious referee time because a clean equivalence can help the two communities exchange tools, even if the result is modest in scale.

Referee Report

0 major / 2 minor

Summary. The manuscript claims that confirmatory adaptive design tools, specifically conditional error functions and combination tests, are formally equivalent to e-value based anytime-valid sequential tests. It contrasts their emphases—exhausting type I error under flexibility versus additional options for continuation, level choice, and loss-function control—and indicates potential cross-enrichment between the literatures.

Significance. If the claimed formal equivalence is established under standard definitions, the note provides a bridge between two bodies of work on flexible inference, which could allow transfer of techniques such as loss-function extensions from e-values into clinical trial designs or adaptive-design ideas into sequential e-value procedures.

minor comments (2)

The abstract asserts the equivalence but the manuscript would benefit from an explicit statement (e.g., in the introduction or a dedicated section) of the precise constructions under which the mapping holds and any restrictions that would break it.
Notation for the mapping between conditional error functions/combination tests and e-values should be introduced once and used consistently to aid readability of the equivalence argument.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our manuscript and the recommendation for minor revision. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity; formal equivalence between independent frameworks

The paper's central claim is a formal equivalence mapping between two pre-existing families of procedures (conditional error functions/combination tests from adaptive design literature, and e-value based anytime-valid tests) under their standard definitions. No derivation chain reduces a result to its own inputs by construction, no fitted parameters are relabeled as predictions, and no load-bearing premise rests on a self-citation chain. The abstract and described contribution treat the two bodies of work as independently developed, with the paper only exhibiting the mapping and noting differing emphases. This is a self-contained observation of equivalence rather than a constructed result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper rests on standard definitions from probability theory and existing statistical methodology for e-values and adaptive designs; no new free parameters, ad-hoc axioms, or invented entities are introduced in the abstract.

axioms (1)

standard math: Standard axioms of probability and the mathematical definitions of e-values, conditional error functions, and combination tests as established in prior literature.
The equivalence mapping depends on these background definitions.

Reference graph

Works this paper leans on

48 extracted references · 6 canonical work pages · 1 internal anchor

[1]

Multistage testing with adaptive designs. Biometrie und Informatik in Medizin und Biologie, 20(4):130–148, 1989

Peter Bauer. Multistage testing with adaptive designs. Biometrie und Informatik in Medizin und Biologie, 20(4):130–148, 1989

1989
[2]

Combining different phases in the development of medical treatments within a single trial. Statistics in Medicine, pages 1833–1848, 1999

Peter Bauer and Meinhard Kieser. Combining different phases in the development of medical treatments within a single trial. Statistics in Medicine, pages 1833–1848, 1999

1999
[3]

Evaluation of experiments with adaptive interim analyses. Biometrics, 50:1029–1041, 1994

Peter Bauer and Karl Köhne. Evaluation of experiments with adaptive interim analyses. Biometrics, 50:1029–1041, 1994. (Correction in 1996 Biometrics, 52, 380)

1994
[4]

Yoav Benjamini and Yosef Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B: Statistical Methodology, 57(1):289–300, 1995

1995
[5]

Multiple hypotheses testing with weights. Scandinavian Journal of Statistics, 24(3):407–418, 1997

Yoav Benjamini and Yosef Hochberg. Multiple hypotheses testing with weights. Scandinavian Journal of Statistics, 24(3):407–418, 1997

1997
[6]

Recursive combination tests

Werner Brannath, Martin Posch, and Peter Bauer. Recursive combination tests. pages 236–244, 2002

2002
[7]

Probabilistic foundation of confirmatory adaptive designs. Journal of the American Statistical Association, 107(498):824–832, 2012

Werner Brannath, Georg Gutjahr, and Peter Bauer. Probabilistic foundation of confirmatory adaptive designs. Journal of the American Statistical Association, 107(498):824–832, 2012

2012
[8]

The population-wise error rate for clinical trials with overlapping populations. Statistical Methods in Medical Research, 32(2):334–352, 2023

Werner Brannath, Charlie Hillner, and Kornelius Rohmeyer. The population-wise error rate for clinical trials with overlapping populations. Statistical Methods in Medical Research, 32(2):334–352, 2023

2023
[9]

Optimal gambling systems for favourable games

Leo Breiman. Optimal gambling systems for favourable games. In Fourth Berkeley Symposium on Mathematical Statistics and Probability, pages 65–78, 1961

1961
[10]

A graphical approach to sequentially rejective multiple test procedures. Statistics in Medicine, 28(4):586–604, 2009

Frank Bretz, Willi Maurer, Werner Brannath, and Martin Posch. A graphical approach to sequentially rejective multiple test procedures. Statistics in Medicine, 28(4):586–604, 2009

2009
[11]

Improving Wald’s (approximate) sequential probability ratio test by avoiding overshoot. IEEE Transactions on Information Theory, (4):2457–2471, 2026

Lasse Fischer and Aaditya Ramdas. Improving Wald’s (approximate) sequential probability ratio test by avoiding overshoot. IEEE Transactions on Information Theory, (4):2457–2471, 2026

2026
[12]

Safe testing. Journal of the Royal Statistical Society Series B: Statistical Methodology (with discussion), 2024

Peter Grünwald, Rianne de Heide, and Wouter M Koolen. Safe testing. Journal of the Royal Statistical Society Series B: Statistical Methodology (with discussion), 2024

2024
[13]

Beyond Neyman–Pearson: E-values enable hypothesis testing with a data-driven alpha. Proceedings of the National Academy of Sciences, 121(39):e2302098121, 2024

Peter D Grünwald. Beyond Neyman–Pearson: E-values enable hypothesis testing with a data-driven alpha. Proceedings of the National Academy of Sciences, 121(39):e2302098121, 2024

2024
[14]

Family-wise Error Rate Control with E-values

Will Hartog and Lihua Lei. Family-wise error rate control with e-values. arXiv preprint arXiv:2501.09015, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[15]

Adaptive modifications of hypotheses after an interim analysis. Biometrical Journal, pages 581–589, 2001

Gerhard Hommel. Adaptive modifications of hypotheses after an interim analysis. Biometrical Journal, pages 581–589, 2001

2001
[16]

Powerful short-cuts for multiple testing procedures with special reference to gatekeeping strategies. Statistics in Medicine, pages 4063–73,

Gerhard Hommel, Frank Bretz, and Willi Maurer. Powerful short-cuts for multiple testing procedures with special reference to gatekeeping strategies. Statistics in Medicine, pages 4063–73,
[17]

doi: 10.1002/sim.2873

work page doi:10.1002/sim.2873
[18]

A new interpretation of information rate. The Bell System Technical Journal, 35(4):917–926, 1956

John L Kelly. A new interpretation of information rate. The Bell System Technical Journal, 35(4):917–926, 1956

1956
[19]

Adaptive graph-based multiple testing procedures. Pharmaceutical Statistics, 13(6):345–356, 2014

Florian Klinglmüller, Martin Posch, and Franz Koenig. Adaptive graph-based multiple testing procedures. Pharmaceutical Statistics, 13(6):345–356, 2014

2014
[20]

Post-hoc alpha hypothesis testing and the post-hoc p-value. arXiv preprint arXiv:2312.08040, 2023

Nick W Koning. Post-hoc alpha hypothesis testing and the post-hoc p-value. arXiv preprint arXiv:2312.08040, 2023

work page arXiv 2023
[21]

Continuous testing: Unifying tests and e-values. arXiv preprint arXiv:2409.05654, 2024

Nick W Koning. Continuous testing: Unifying tests and e-values. arXiv preprint arXiv:2409.05654, 2024

work page arXiv 2024
[22]

Anytime validity is free: inducing sequential tests. Journal of the Royal Statistical Society Series B: Statistical Methodology, page qkag050, 2026

Nick W Koning and Sam Van Meer. Anytime validity is free: inducing sequential tests. Journal of the Royal Statistical Society Series B: Statistical Methodology, page qkag050, 2026

2026
[23]

The numeraire e-variable and reverse information projection. The Annals of Statistics, 53(3):1015–1043, 2025

Martin Larsson, Aaditya Ramdas, and Johannes Ruf. The numeraire e-variable and reverse information projection. The Annals of Statistics, 53(3):1015–1043, 2025

2025
[24]

A tutorial on safe anytime-valid inference: Practical maximally flexible sampling designs for experiments based on e-values. PsyArXiv preprint h5vae_v3, 2024

Alexander Ly, Udo Boehm, Peter Grünwald, Aaditya Ramdas, and Don van Ravenzwaaij. A tutorial on safe anytime-valid inference: Practical maximally flexible sampling designs for experiments based on e-values. PsyArXiv preprint h5vae_v3, 2024

2024
[25]

Optimal test procedures for multiple hypotheses controlling the familywise expected loss. Biometrics, 79(4):2781–2793, 2023

Willi Maurer, Frank Bretz, and Xiaolei Xun. Optimal test procedures for multiple hypotheses controlling the familywise expected loss. Biometrics, 79(4):2781–2793, 2023

2023
[26]

Adaptive group sequential designs for clinical trials: Combining the advantages of adaptive and of classical group sequential approaches. Biometrics, pages 886–891, 2001

Hans-Helge Müller and Helmut Schäfer. Adaptive group sequential designs for clinical trials: Combining the advantages of adaptive and of classical group sequential approaches. Biometrics, pages 886–891, 2001

2001
[27]

A general statistical principle for changing a design any time during the course of a trial. Statistics in Medicine, 23(16):2497–2508, 2004

Hans-Helge Müller and Helmut Schäfer. A general statistical principle for changing a design any time during the course of a trial. Statistics in Medicine, 23(16):2497–2508, 2004

2004
[28]

Adaptive two stage designs and the conditional error function

Martin Posch and Peter Bauer. Adaptive two stage designs and the conditional error function. Biometrical Journal, pages 689–696, 1999

1999
[29]

A uniform improvement of Bonferroni-type tests by sequential tests. Journal of the American Statistical Association, (481):299–308, 2008

Martin Posch and Andreas Futschik. A uniform improvement of Bonferroni-type tests by sequential tests. Journal of the American Statistical Association, (481):299–308, 2008

2008
[30]

Type I error rate control in adaptive designs for confirmatory clinical trials with treatment selection at interim. Pharmaceutical Statistics, 10(2):96–104, 2011

Martin Posch, Willi Maurer, and Frank Bretz. Type I error rate control in adaptive designs for confirmatory clinical trials with treatment selection at interim. Pharmaceutical Statistics, 10(2):96–104, 2011

2011
[31]

Designed extension of studies based on conditional power. Biometrics, 51(4):1315–1324, 1995

Michael A Proschan and Sally A Hunsberger. Designed extension of studies based on conditional power. Biometrics, 51(4):1315–1324, 1995

1995
[32]

Hypothesis testing with e-values. Foundations and Trends® in Statistics, 1(1-2):1–390, 2025

Aaditya Ramdas and Ruodu Wang. Hypothesis testing with e-values. Foundations and Trends® in Statistics, 1(1-2):1–390, 2025

2025
[33]

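Admissible anytime-valid sequential inference must rely on nonnegative martingales. arXiv preprint arXiv:2009.03167, 2020

Aaditya Ramdas, Johannes Ruf, Martin Larsson, and Wouter Koolen. Admissible anytime-valid sequential inference must rely on nonnegative martingales. arXiv preprint arXiv:2009.03167, 2020

work page arXiv 2020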
[34]

Testing exchangeability: Fork-convexity, supermartingales and e-processes. International Journal of Approximate Reasoning, 141:83–109, 2022

Aaditya Ramdas, Johannes Ruf, Martin Larsson, and Wouter M Koolen. Testing exchangeability: Fork-convexity, supermartingales and e-processes. International Journal of Approximate Reasoning, 141:83–109, 2022

2022
[35]

Game-theoretic statistics and safe anytime-valid inference. Statistical Science, 38(4):576–601, 2023

Aaditya Ramdas, Peter Grünwald, Vladimir Vovk, and Glenn Shafer. Game-theoretic statistics and safe anytime-valid inference. Statistical Science, 38(4):576–601, 2023

2023
[36]

Modification of the sample size and the schedule of interim analyses in survival trials based on data inspections. Statistics in Medicine, 20:3741–3751, 2001

Helmut Schäfer and Hans-Helge Müller. Modification of the sample size and the schedule of interim analyses in survival trials based on data inspections. Statistics in Medicine, 20:3741–3751, 2001

2001
[37]

Glenn Shafer. Testing by betting: A strategy for statistical and scientific communication. Journal of the Royal Statistical Society Series A: Statistics in Society (with discussion), 184(2):407–431, 2021

2021
[38]

Test martingales, Bayes factors and p-values. Statistical Science, 2011

Glenn Shafer, Alexander Shen, Nikolai Vereshchagin, and Vladimir Vovk. Test martingales, Bayes factors and p-values. Statistical Science, 2011

2011
[39]

Etude critique de la notion de collectif. Gauthier-Villars Paris, 1939

Jean Ville. Etude critique de la notion de collectif. Gauthier-Villars Paris, 1939

1939
[40]

Testing randomness online. Statistical Science, 36(4):595–611, 2021

Vladimir Vovk. Testing randomness online. Statistical Science, 36(4):595–611, 2021

2021
[41]

E-values: Calibration, combination and applications. The Annals of Statistics, 49(3):1736–1754, 2021

Vladimir Vovk and Ruodu Wang. E-values: Calibration, combination and applications. The Annals of Statistics, 49(3):1736–1754, 2021

2021
[42]

Algorithmic learning in a random world, volume 29. Springer, 2005

Vladimir Vovk, Alexander Gammerman, and Glenn Shafer. Algorithmic learning in a random world, volume 29. Springer, 2005

2005
[43]

Sequential tests of statistical hypotheses. The Annals of Mathematical Statistics, 16(2):117–186, 1945

Abraham Wald. Sequential tests of statistical hypotheses. The Annals of Mathematical Statistics, 16(2):117–186, 1945

1945
[44]

The only admissible way of merging arbitrary e-values. Biometrika, 112:asaf020, 2025

Ruodu Wang. The only admissible way of merging arbitrary e-values. Biometrika, 112:asaf020, 2025

2025
[45]

Universal inference. Proceedings of the National Academy of Sciences, 117(29):16880–16890, 2020

Larry Wasserman, Aaditya Ramdas, and Sivaraman Balakrishnan. Universal inference. Proceedings of the National Academy of Sciences, 117(29):16880–16890, 2020

2020
[46]

Theoretische Konzepte und deren praktische Umsetzung mit SAS

Gernot Wassmer. Statistische Testverfahren für gruppensequentielle und adaptive Pläne in klinischen Studien. Theoretische Konzepte und deren praktische Umsetzung mit SAS. Verlag Alexander Mönch, 1999

1999
[47]

Group sequential and confirmatory adaptive designs in clinical trials. Springer, 2nd edition, 2025

Gernot Wassmer and Werner Brannath. Group sequential and confirmatory adaptive designs in clinical trials. Springer, 2nd edition, 2025

2025
[48]

Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517, 2025

Ziyu Xu, Aldo Solari, Lasse Fischer, Rianne de Heide, Aaditya Ramdas, and Jelle Goeman. Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517, 2025

work page arXiv 2025
