pith. sign in

arxiv: 2605.24741 · v1 · pith:2O4YIRZ4new · submitted 2026-05-23 · 🧮 math.ST · cs.IT· cs.LG· math.IT· stat.ML· stat.TH

On the Sample Complexity of Robust Binary Hypothesis Testing

Pith reviewed 2026-06-30 11:54 UTC · model grok-4.3

classification 🧮 math.ST cs.ITcs.LGmath.ITstat.MLstat.TH
keywords robust hypothesis testingsample complexitycontamination modelsHuber contaminationtotal variation distancesubtractive contaminationleast favorable distributions
0
0 comments X

The pith

The sample complexities for robust binary hypothesis testing are comparable across Huber, subtractive, and total variation contamination models up to constant rescalings of ε.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares the sample complexity of robust binary hypothesis testing under three contamination models: additive Huber, subtractive, and total variation. It proves that these complexities are related by inequalities that rescale the contamination parameter ε by small constant factors, with the constants shown to be tight. It also establishes the existence of least-favourable distributions for the subtractive model and demonstrates that the sample complexity can be unstable to small changes in ε. This matters because it indicates that the specific choice of contamination model has limited impact on the required number of samples, provided ε can be adjusted accordingly.

Core claim

For any fixed δ0 > 0 and all distributions p and q, the sample complexities satisfy n*_Hub(ε) ≲ n*_TV(ε) ≲ n*_Hub(2ε), n*_Sub(ε) ≲ n*_TV(ε) ≲ n*_Sub((2+δ0)ε), and n*_Sub(ε) ≲ n*_Hub(ε) ≲ n*_Sub((1+δ0)ε), with the scaling constants tight. Least favourable distributions exist and are given explicitly for the subtractive model. The sample complexity is unstable under o(ε) perturbations of ε in all models.

What carries the argument

The minimax sample complexity n*(ε) under each of the three contamination models, defined as the smallest n such that there exists a test with error probability at most some fixed level for any contamination within ε.

If this is right

  • Least favourable distributions are explicit for subtractive contamination.
  • Sample complexity increases by polynomial factors for small o(ε) changes in the contamination level.
  • Polynomial factor gaps exist between knowing ε exactly and knowing it up to o(ε) error.
  • The comparability results extend to adaptive versions of the contamination models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This comparability suggests that results from one model can be transferred to others by rescaling ε.
  • The instability result implies that accurate estimation of the contamination level is important for determining sample size.
  • Future work could explore whether similar comparability holds for other robust estimation tasks beyond hypothesis testing.

Load-bearing premise

The three contamination models are defined in their standard ways and the sample complexity is the worst-case over all possible contaminations within ε.

What would settle it

A counterexample pair of distributions p and q for which the ratio of sample complexities between any two models cannot be bounded by a constant independent of p and q, or where the factor 2 or (1+δ0) cannot be achieved.

read the original abstract

We study the sample complexity of robust binary hypothesis testing under three standard contamination models: $\varepsilon$-additive (Huber), $\varepsilon$-subtractive, and $\varepsilon$-total variation (TV), denoted by $n^*_{\mathrm{Hub}}(\varepsilon)$, $n^*_{\mathrm{Sub}}(\varepsilon)$, and $n^*_{\mathrm{TV}}(\varepsilon)$, respectively. For subtractive contamination, we show that least favourable distributions exist and provide explicit formulas for the same, bringing this model in line with the classical Huber and TV models. Next we show that in all three models, sample complexity may be highly unstable in the contamination parameter $\varepsilon$, increasing by polynomial factors even for $o(\varepsilon)$ perturbations. Similarly, there may be polynomial factor gaps between the sample complexities when $\varepsilon$ is known exactly versus when it is known up to $o(\varepsilon)$ error. Despite the instability of the sample complexity in all models, we show that the sample complexities across models are comparable up to constant-factor rescaling of $\varepsilon$. Specifically, for any fixed $\delta_0>0$, the following hold for all distributions $p$ and $q$: (i) $n^*_{\mathrm{Hub}}(\varepsilon) \lesssim n^*_{\mathrm{TV}}(\varepsilon) \lesssim n^*_{\mathrm{Hub}}(2\varepsilon)$, (ii) $n^*_{\mathrm{Sub}}(\varepsilon) \lesssim n^*_{\mathrm{TV}}(\varepsilon) \lesssim n^*_{\mathrm{Sub}}((2+\delta_0)\varepsilon)$, and (iii) $n^*_{\mathrm{Sub}}(\varepsilon) \lesssim n^*_{\mathrm{Hub}}(\varepsilon) \lesssim n^*_{\mathrm{Sub}}((1+\delta_0)\varepsilon)$, and the scaling constants are tight. Finally, we extend our results to adaptive versions of the contamination models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper analyzes the sample complexity n* of robust binary hypothesis testing under three contamination models: ε-Huber (additive), ε-subtractive, and ε-total variation. It proves existence and gives explicit formulas for least-favorable distributions under subtractive contamination. It establishes that n* can be unstable in ε, with polynomial-factor increases under o(ε) perturbations of ε or when ε is known only up to o(ε) error. Despite this, it shows cross-model comparability: for any fixed δ0>0 and all p,q, n*_Hub(ε) ≲ n*_TV(ε) ≲ n*_Hub(2ε), n*_Sub(ε) ≲ n*_TV(ε) ≲ n*_Sub((2+δ0)ε), and n*_Sub(ε) ≲ n*_Hub(ε) ≲ n*_Sub((1+δ0)ε), with the rescaling constants tight; the results extend to adaptive contamination models.

Significance. If the derivations hold, the work unifies the three standard contamination models by aligning the subtractive model with Huber and TV via explicit least-favorable distributions, while quantifying both the sensitivity of n* to ε and the robustness of inter-model comparisons under constant rescalings. The instability results are practically relevant, and the tight constants provide sharp guidance. Credit is due for the explicit constructions and the extension to adaptive settings.

minor comments (3)
  1. [§2] §2 (model definitions): the precise worst-case definition of n*(ε) as a supremum over contaminations is used throughout but is only referenced rather than restated in the main theorems; a one-line reminder would improve readability.
  2. [Theorem 5.2] Theorem 5.2 (comparability): while the upper and lower bounds are stated with explicit constants, the matching lower-bound constructions for tightness are given only for the subtractive-to-Huber direction; a brief note on whether the same technique applies symmetrically would clarify the claim that all listed constants are tight.
  3. [Figure 1] Figure 1 (instability example): the plotted curves use a specific pair (p,q); adding a caption sentence stating that the polynomial degree is distribution-dependent would prevent misinterpretation that the degree is universal.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive evaluation of the manuscript and for recommending acceptance. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper presents a theoretical analysis deriving sample complexity bounds n*_Hub(ε), n*_Sub(ε), and n*_TV(ε) from the standard definitions of the three contamination models and the worst-case notion of n*. It establishes existence and explicit forms for least-favourable distributions in the subtractive model, proves instability under ε-perturbations, and obtains cross-model comparability via explicit rescalings of ε (e.g., n*_Hub(ε) ≲ n*_TV(ε) ≲ n*_Hub(2ε)). All steps rest on direct mathematical arguments relating the radii of the contamination sets; no fitted parameters are renamed as predictions, no load-bearing self-citations reduce the central claims to unverified inputs, and no ansatz or uniqueness result is smuggled in. The derivation is therefore self-contained against the given model definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper works entirely within the classical framework of contamination models and information-theoretic sample complexity; it introduces no new free parameters, ad-hoc axioms, or invented entities.

axioms (1)
  • standard math Standard axioms of probability measures and the classical definitions of Huber, subtractive, and TV contamination models.
    The entire analysis is built on these pre-existing model definitions.

pith-pipeline@v0.9.1-grok · 5892 in / 1324 out tokens · 36072 ms · 2026-06-30T11:54:33.865335+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages · 1 internal anchor

  1. [1]

    Universally instance-optimal mechanisms for private statistical estimation

    [ADHLR24] H. Asi, J. Duchi, S. Haque, Z. Li, and F. Ruan. “Universally instance-optimal mechanisms for private statistical estimation”. In:The Thirty Seventh Annual Conference on Learning Theory. PMLR. 2024, pp. 221–259 (cit. on pp. 6, 44, 45). [AUZ23] H. Asi, J. R. Ullman, and L. Zakynthinou. “From Robustness to Privacy and Back”. In:Proc. 40th Internati...

  2. [2]

    Estimation beyond Missing (Completely) at Random

    Springer, 1986 (cit. on p. 6). [MVBWS24] T. Ma, K. Verchand, T. Berrett, T. Wang, and R. Samworth. “Estimation beyond missing (completely) at random”. In:arXiv preprint arXiv:2410.10704(2024) (cit. on p. 47). [NP33] J. Neyman and E. S. Pearson. “On the Problem of the Most Efficient Tests of Statistical Hypotheses”. In:Philosophical Transactions of the Roy...