pith. machine review for the scientific record.

arxiv: 2604.03218 · v1 · submitted 2026-04-03 · 🧮 math.ST · math.PR · stat.ML · stat.TH

Power one sequential tests exist for weakly compact $\mathscr P$ against $\mathscr P^c$

Pith reviewed 2026-05-13 18:30 UTC · model grok-4.3

classification 🧮 math.ST · math.PR · stat.ML · stat.TH
keywords sequential testing · power-one tests · weak compactness · e-processes · composite hypotheses · Polish spaces · i.i.d. sampling

The pith

A level-α sequential test with power one against the complement exists for any weakly compact null set of distributions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that when the null hypothesis is a weakly compact collection of probability measures on a Polish space, one can construct a sequential test that controls the type I error at level α while achieving power one against every distribution outside the null: if the data come from any alternative distribution, the test eventually rejects with probability one. This matters because, as the abstract notes, no fixed-sample (batch) test can combine a level α ∈ (0,1) with power equal to one, whereas a sequential test can monitor the data continuously without inflating the type I error. The result holds under i.i.d. sampling with no restriction on the distributions beyond weak compactness of the null set. The paper also shows how to aggregate these tests into an e-process that grows to infinity under alternatives.

Core claim

For i.i.d. observations in a Polish space, if the set of null distributions P is compact in the weak topology, then there exists a level-α sequential test with power one against P^c (or any subset of it). Such tests can be aggregated into an e-process for P that tends to infinity under every alternative in P^c, and an e-process that is asymptotically relatively growth-rate optimal against P^c can be constructed.
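
For reference, the two properties in the claim can be written out as follows. This uses the standard textbook formulation; the stopping time τ ("reject at time τ", with τ = ∞ meaning "never reject") is notation assumed here and may differ from the paper's.

```latex
% X_1, X_2, \dots are i.i.d.; P(\cdot) and Q(\cdot) denote the law of the whole
% sequence when each observation has distribution P or Q, respectively.
\[
  \text{level } \alpha:\quad
  \sup_{P \in \mathscr P} P(\tau < \infty) \le \alpha,
  \qquad
  \text{power one:}\quad
  Q(\tau < \infty) = 1 \ \text{ for every } Q \in \mathscr P^c .
\]
```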

What carries the argument

Weak compactness of the null set P in the weak topology on probability measures, which enables a covering or aggregation argument to build a sequential test guaranteeing the power-one property under i.i.d. sampling.

If this is right

  • Level-α sequential tests with power one exist for any weakly compact P.
  • These tests aggregate into an e-process for P that increases to infinity under P^c.
  • An asymptotically relatively growth rate optimal e-process against P^c can be constructed.
  • The power-one property holds for i.i.d. data in Polish spaces with no further restrictions beyond weak compactness.
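
To make the bullets above concrete, here is a minimal sketch for the simplest possible setting: binary data, with null set {Bernoulli(p) : p ∈ [lo, hi]}, which is weakly compact. It is not the paper's construction. It uses a standard device (a uniform-mixture likelihood divided by the maximised null likelihood, thresholded at 1/α), and the function names, the interval [0.4, 0.6], and the simulation settings are all illustrative assumptions.

```python
import math
import random


def log_mixture_likelihood(k: int, n: int) -> float:
    """Log of the uniform-prior mixture likelihood, i.e. the integral of
    p^k (1-p)^(n-k) over p in [0, 1], which equals Beta(k+1, n-k+1)."""
    return math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)


def log_max_null_likelihood(k: int, n: int, lo: float, hi: float) -> float:
    """Log of the largest Bernoulli(p) likelihood over the null p in [lo, hi].
    The log-likelihood is concave in p, so the maximiser is the sample
    frequency k/n clipped to [lo, hi]. Assumes 0 < lo <= hi < 1."""
    if n == 0:
        return 0.0
    p = min(max(k / n, lo), hi)
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_e_value(k: int, n: int, lo: float, hi: float) -> float:
    """Log of E_n after k ones in n observations. For each null p, E_n is
    dominated by a likelihood-ratio martingale under Bernoulli(p)."""
    return log_mixture_likelihood(k, n) - log_max_null_likelihood(k, n, lo, hi)


def rejection_time(xs, lo: float, hi: float, alpha: float):
    """First n with E_n >= 1/alpha, or None if that never happens within xs."""
    log_threshold = math.log(1.0 / alpha)
    k = 0
    for n, x in enumerate(xs, start=1):
        k += x
        if log_e_value(k, n, lo, hi) >= log_threshold:
            return n
    return None


def simulate(q: float, lo: float, hi: float, alpha: float, horizon: int, seed: int):
    """Draw `horizon` i.i.d. Bernoulli(q) observations and run the test."""
    rng = random.Random(seed)
    xs = [1 if rng.random() < q else 0 for _ in range(horizon)]
    return rejection_time(xs, lo, hi, alpha)


if __name__ == "__main__":
    # Deterministic check: an all-ones stream against the null [0.4, 0.6].
    print(rejection_time([1] * 50, 0.4, 0.6, 0.05))  # 11
    # Illustrative simulation under an alternative (q = 0.75 is outside the null).
    print(simulate(q=0.75, lo=0.4, hi=0.6, alpha=0.05, horizon=5000, seed=0))
```

In this toy case the type I error bound follows from Ville's inequality applied to the dominating martingales, and the log of E_n grows linearly under any q outside [lo, hi], so rejection eventually happens with probability one. The paper's result covers far more general null sets than this one-parameter family.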

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Weak compactness is sufficient but, as the abstract states, not necessary: power-one tests can still exist for some null sets that are not weakly compact, although the paper's condition no longer guarantees them.
  • The result implies that parametric families whose closure is weakly compact admit anytime-valid sequential tests for continuous data monitoring.
  • e-process constructions derived from these tests extend naturally to anytime-valid inference for composite null hypotheses.
  • Similar compactness arguments might yield power-one tests in certain dependent or non-i.i.d. settings where an appropriate topology preserves compactness.

Load-bearing premise

The null set of distributions P is compact in the weak topology on probability measures.

What would settle it

A weakly compact set P in a Polish space for which every level-α sequential test has power strictly less than one against some distribution in P^c.

Original abstract

Suppose we observe data from a distribution $P$ and we wish to test the composite null hypothesis that $P\in\mathscr P$ against a composite alternative $P\in \mathscr Q\subseteq \mathscr P^c$. Herbert Robbins and coauthors pointed out around 1970 that, while no batch test can have a level $\alpha\in(0,1)$ and power equal to one, sequential tests can be constructed with this fantastic property. Since then, and especially in the last decade, a plethora of sequential tests have been developed for a wide variety of settings. However, the literature has not yet provided a clean and general answer as to when such power-one sequential tests exist. This paper provides a remarkably general sufficient condition (that we also prove is not necessary). Focusing on i.i.d. laws in Polish spaces without any further restriction, we show that there exists a level-$\alpha$ sequential test for any weakly compact $\mathscr P$, that is power-one against $\mathscr P^c$ (or any subset thereof). We show how to aggregate such tests into an $e$-process for $\mathscr P$ that increases to infinity under $\mathscr P^c$. We conclude by building an $e$-process that is asymptotically relatively growth rate optimal against $\mathscr P^c$, an extremely powerful result.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript proves that for i.i.d. observations in Polish spaces, any weakly compact collection P of probability measures admits a level-α sequential test with power one against P^c (or any subset). The argument constructs an e-process that is a supermartingale under every law in P and diverges almost surely under any Q ∉ P, relying on weak compactness and Prohorov tightness for uniform type-I control and separation. It further shows how to aggregate such tests into an e-process for P and constructs one that is asymptotically relatively growth-rate optimal against P^c.
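
One standard way to aggregate tests into an e-process is sketched below. Whether the paper uses exactly this construction is not confirmed by the text here; the stopping times τ_k, levels α_k, and weights w_k are assumed notation.

```latex
% Assumption: for each k \ge 1, \tau_k is the rejection time of a level-\alpha_k
% sequential test for \mathscr P with power one against \mathscr P^c.
\[
  E_n := \sum_{k \ge 1} \frac{w_k}{\alpha_k}\, \mathbf 1\{\tau_k \le n\},
  \qquad w_k \ge 0, \quad \sum_{k \ge 1} w_k = 1 .
\]
% Validity under the null: for every P \in \mathscr P and every stopping time \sigma,
\[
  \mathbb E_P[E_\sigma]
  \le \mathbb E_P\Bigl[\sup_n E_n\Bigr]
  = \sum_{k \ge 1} \frac{w_k}{\alpha_k}\, P(\tau_k < \infty)
  \le \sum_{k \ge 1} w_k = 1 .
\]
% Growth under the alternative: for Q \in \mathscr P^c every \tau_k is Q-a.s. finite, so
\[
  E_n \uparrow \sum_{k \ge 1} \frac{w_k}{\alpha_k} = \infty
  \quad Q\text{-a.s.},
  \qquad \text{e.g. with } w_k = 2^{-k},\ \alpha_k = 4^{-k} .
\]
```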

Significance. If the central existence claim holds, the result supplies a remarkably general topological sufficient condition (weak compactness in the weak topology) for power-one sequential tests without further restrictions on P. This generalizes classical work of Robbins et al. and recent e-process literature by providing both existence and explicit constructions, including an optimality result on relative growth rates. The paper ships a clean, non-circular argument resting on standard tightness properties rather than data-dependent or self-referential constructions.

minor comments (2)
  1. [§2] The definition of the e-process supermartingale property could state the filtration explicitly, to avoid any ambiguity about the optional stopping used later.
  2. [Remark after Theorem 3.2] The statement that weak compactness is 'not necessary' is mentioned in the abstract but the counter-example or necessity argument appears only in a remark; moving a brief sketch to the main text would improve readability.
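
For context on the first comment, the usual definition of an e-process with the filtration made explicit reads as follows. This is the standard formulation from the e-process literature and is assumed here; the paper's own §2 may phrase it differently.

```latex
% Natural filtration of the data: \mathcal F_n = \sigma(X_1, \dots, X_n), n \ge 0.
\[
  (E_n)_{n \ge 0} \ \text{is nonnegative and } (\mathcal F_n)\text{-adapted},
  \qquad
  \sup_{P \in \mathscr P} \mathbb E_P[E_\tau] \le 1
  \ \text{ for every } (\mathcal F_n)\text{-stopping time } \tau .
\]
% A sufficient condition: for each P \in \mathscr P there is a nonnegative
% P-supermartingale (M^P_n) with M^P_0 \le 1 and E_n \le M^P_n for all n.
```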

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript, accurate summary of the results, and recommendation to accept. We have no revisions to propose in response to this report.

Circularity Check

0 steps flagged

Existence proof is topologically grounded without circularity

full rationale

The paper establishes an existence result: for i.i.d. observations in Polish spaces, any weakly compact null set P admits a level-α sequential test (equivalently, an e-process) that is a supermartingale under every law in P and diverges almost surely under any Q outside P. The derivation invokes weak compactness to obtain Prohorov tightness, which supplies the uniform integrability and separation needed for both type-I control and power one; these are standard facts from measure-theoretic probability and do not reduce to any fitted parameter, self-definition, or load-bearing self-citation chain. No equation or construction is shown to be equivalent to its own inputs by construction, and the argument remains externally verifiable against classical tightness and martingale theorems.
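
The step from an e-process to a level-α test with power one is standard and can be written out as below. The threshold rule and the notation τ_α are assumptions for illustration, not taken from the paper.

```latex
% Reject the first time the e-process reaches 1/\alpha.
\[
  \tau_\alpha := \inf\{\, n \ge 1 : E_n \ge 1/\alpha \,\},
  \qquad
  \sup_{P \in \mathscr P} P(\tau_\alpha < \infty)
  = \sup_{P \in \mathscr P} P\bigl(\exists\, n \ge 1 : E_n \ge 1/\alpha\bigr)
  \le \alpha
  \quad \text{(Ville's inequality)} .
\]
% Power one follows from divergence under the alternative:
\[
  E_n \to \infty \ \ Q\text{-a.s.}
  \ \Longrightarrow\
  Q(\tau_\alpha < \infty) = 1 .
\]
```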

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The result rests on standard measure-theoretic assumptions for Polish spaces and i.i.d. sampling together with the paper-specific topological assumption of weak compactness; no free parameters or invented entities are introduced.

axioms (2)
  • domain assumption: Observations are i.i.d. from laws on a Polish space.
    Explicitly stated as the setting for the existence result.
  • ad hoc to paper: Weak compactness of P is sufficient for existence of power-one tests.
    This is the load-bearing condition proved in the paper.

pith-pipeline@v0.9.0 · 5541 in / 1089 out tokens · 42208 ms · 2026-05-13T18:30:06.628879+00:00
