On the Sample Complexity of Robust Binary Hypothesis Testing

Ankit Pensia; Shankar Vallinayagam; Varun Jog

arxiv: 2605.24741 · v1 · pith:2O4YIRZ4new · submitted 2026-05-23 · 🧮 math.ST · cs.IT· cs.LG· math.IT· stat.ML· stat.TH

On the Sample Complexity of Robust Binary Hypothesis Testing

Shankar Vallinayagam , Ankit Pensia , Varun Jog This is my paper

Pith reviewed 2026-06-30 11:54 UTC · model grok-4.3

classification 🧮 math.ST cs.ITcs.LGmath.ITstat.MLstat.TH

keywords robust hypothesis testingsample complexitycontamination modelsHuber contaminationtotal variation distancesubtractive contaminationleast favorable distributions

0 comments

The pith

The sample complexities for robust binary hypothesis testing are comparable across Huber, subtractive, and total variation contamination models up to constant rescalings of ε.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares the sample complexity of robust binary hypothesis testing under three contamination models: additive Huber, subtractive, and total variation. It proves that these complexities are related by inequalities that rescale the contamination parameter ε by small constant factors, with the constants shown to be tight. It also establishes the existence of least-favourable distributions for the subtractive model and demonstrates that the sample complexity can be unstable to small changes in ε. This matters because it indicates that the specific choice of contamination model has limited impact on the required number of samples, provided ε can be adjusted accordingly.

Core claim

For any fixed δ0 > 0 and all distributions p and q, the sample complexities satisfy n*_Hub(ε) ≲ n*_TV(ε) ≲ n*_Hub(2ε), n*_Sub(ε) ≲ n*_TV(ε) ≲ n*_Sub((2+δ0)ε), and n*_Sub(ε) ≲ n*_Hub(ε) ≲ n*_Sub((1+δ0)ε), with the scaling constants tight. Least favourable distributions exist and are given explicitly for the subtractive model. The sample complexity is unstable under o(ε) perturbations of ε in all models.

What carries the argument

The minimax sample complexity n*(ε) under each of the three contamination models, defined as the smallest n such that there exists a test with error probability at most some fixed level for any contamination within ε.

If this is right

Least favourable distributions are explicit for subtractive contamination.
Sample complexity increases by polynomial factors for small o(ε) changes in the contamination level.
Polynomial factor gaps exist between knowing ε exactly and knowing it up to o(ε) error.
The comparability results extend to adaptive versions of the contamination models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This comparability suggests that results from one model can be transferred to others by rescaling ε.
The instability result implies that accurate estimation of the contamination level is important for determining sample size.
Future work could explore whether similar comparability holds for other robust estimation tasks beyond hypothesis testing.

Load-bearing premise

The three contamination models are defined in their standard ways and the sample complexity is the worst-case over all possible contaminations within ε.

What would settle it

A counterexample pair of distributions p and q for which the ratio of sample complexities between any two models cannot be bounded by a constant independent of p and q, or where the factor 2 or (1+δ0) cannot be achieved.

read the original abstract

We study the sample complexity of robust binary hypothesis testing under three standard contamination models: $\varepsilon$-additive (Huber), $\varepsilon$-subtractive, and $\varepsilon$-total variation (TV), denoted by $n^*_{\mathrm{Hub}}(\varepsilon)$, $n^*_{\mathrm{Sub}}(\varepsilon)$, and $n^*_{\mathrm{TV}}(\varepsilon)$, respectively. For subtractive contamination, we show that least favourable distributions exist and provide explicit formulas for the same, bringing this model in line with the classical Huber and TV models. Next we show that in all three models, sample complexity may be highly unstable in the contamination parameter $\varepsilon$, increasing by polynomial factors even for $o(\varepsilon)$ perturbations. Similarly, there may be polynomial factor gaps between the sample complexities when $\varepsilon$ is known exactly versus when it is known up to $o(\varepsilon)$ error. Despite the instability of the sample complexity in all models, we show that the sample complexities across models are comparable up to constant-factor rescaling of $\varepsilon$. Specifically, for any fixed $\delta_0>0$, the following hold for all distributions $p$ and $q$: (i) $n^*_{\mathrm{Hub}}(\varepsilon) \lesssim n^*_{\mathrm{TV}}(\varepsilon) \lesssim n^*_{\mathrm{Hub}}(2\varepsilon)$, (ii) $n^*_{\mathrm{Sub}}(\varepsilon) \lesssim n^*_{\mathrm{TV}}(\varepsilon) \lesssim n^*_{\mathrm{Sub}}((2+\delta_0)\varepsilon)$, and (iii) $n^*_{\mathrm{Sub}}(\varepsilon) \lesssim n^*_{\mathrm{Hub}}(\varepsilon) \lesssim n^*_{\mathrm{Sub}}((1+\delta_0)\varepsilon)$, and the scaling constants are tight. Finally, we extend our results to adaptive versions of the contamination models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper supplies explicit least-favorable distributions for subtractive contamination and proves tight comparability of sample complexities across the three models after small rescalings of epsilon, plus instability to o(epsilon) changes.

read the letter

The main things to know are that this work gives closed-form least-favorable distributions for the subtractive model and shows that the worst-case sample complexities satisfy n*_Hub(ε) ≲ n*_TV(ε) ≲ n*_Hub(2ε) and the two other pairwise relations with (1+δ0) or (2+δ0) factors, all with tight constants. It also documents polynomial instability when epsilon is perturbed by o(epsilon) or known only approximately.

What stands out is the explicit construction for subtractive contamination; that brings the model onto the same footing as Huber and TV, where such distributions were already known. The comparability statements follow from relating the radii of the contamination sets, and the paper checks that the scaling factors cannot be improved in general. The instability examples are concrete and show the phenomenon is not an artifact of one model.

The soft spots are modest. The proofs are not reproduced in the abstract, so one still needs to check that the least-favorable distributions are indeed optimal and that the adaptive-model extension does not introduce hidden looseness. The instability results depend on the precise worst-case definition over contaminations; if that definition is relaxed the gaps may shrink, but the paper is clear about the setting it uses. No circularity or post-hoc fitting appears.

This is a paper for people already working on robust hypothesis testing or contamination models. A reader who cares about sample-complexity bounds under different neighborhoods will get usable statements and explicit constructions. It is worth sending to a serious referee: the claims are specific, the constructions are new relative to the cited literature, and the comparability results are the sort of clean relation that can be checked or used downstream.

Referee Report

0 major / 3 minor

Summary. The paper analyzes the sample complexity n* of robust binary hypothesis testing under three contamination models: ε-Huber (additive), ε-subtractive, and ε-total variation. It proves existence and gives explicit formulas for least-favorable distributions under subtractive contamination. It establishes that n* can be unstable in ε, with polynomial-factor increases under o(ε) perturbations of ε or when ε is known only up to o(ε) error. Despite this, it shows cross-model comparability: for any fixed δ0>0 and all p,q, n*_Hub(ε) ≲ n*_TV(ε) ≲ n*_Hub(2ε), n*_Sub(ε) ≲ n*_TV(ε) ≲ n*_Sub((2+δ0)ε), and n*_Sub(ε) ≲ n*_Hub(ε) ≲ n*_Sub((1+δ0)ε), with the rescaling constants tight; the results extend to adaptive contamination models.

Significance. If the derivations hold, the work unifies the three standard contamination models by aligning the subtractive model with Huber and TV via explicit least-favorable distributions, while quantifying both the sensitivity of n* to ε and the robustness of inter-model comparisons under constant rescalings. The instability results are practically relevant, and the tight constants provide sharp guidance. Credit is due for the explicit constructions and the extension to adaptive settings.

minor comments (3)

[§2] §2 (model definitions): the precise worst-case definition of n*(ε) as a supremum over contaminations is used throughout but is only referenced rather than restated in the main theorems; a one-line reminder would improve readability.
[Theorem 5.2] Theorem 5.2 (comparability): while the upper and lower bounds are stated with explicit constants, the matching lower-bound constructions for tightness are given only for the subtractive-to-Huber direction; a brief note on whether the same technique applies symmetrically would clarify the claim that all listed constants are tight.
[Figure 1] Figure 1 (instability example): the plotted curves use a specific pair (p,q); adding a caption sentence stating that the polynomial degree is distribution-dependent would prevent misinterpretation that the degree is universal.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive evaluation of the manuscript and for recommending acceptance. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper presents a theoretical analysis deriving sample complexity bounds n*_Hub(ε), n*_Sub(ε), and n*_TV(ε) from the standard definitions of the three contamination models and the worst-case notion of n*. It establishes existence and explicit forms for least-favourable distributions in the subtractive model, proves instability under ε-perturbations, and obtains cross-model comparability via explicit rescalings of ε (e.g., n*_Hub(ε) ≲ n*_TV(ε) ≲ n*_Hub(2ε)). All steps rest on direct mathematical arguments relating the radii of the contamination sets; no fitted parameters are renamed as predictions, no load-bearing self-citations reduce the central claims to unverified inputs, and no ansatz or uniqueness result is smuggled in. The derivation is therefore self-contained against the given model definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper works entirely within the classical framework of contamination models and information-theoretic sample complexity; it introduces no new free parameters, ad-hoc axioms, or invented entities.

axioms (1)

standard math Standard axioms of probability measures and the classical definitions of Huber, subtractive, and TV contamination models.
The entire analysis is built on these pre-existing model definitions.

pith-pipeline@v0.9.1-grok · 5892 in / 1324 out tokens · 36072 ms · 2026-06-30T11:54:33.865335+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages · 1 internal anchor

[1]

Universally instance-optimal mechanisms for private statistical estimation

[ADHLR24] H. Asi, J. Duchi, S. Haque, Z. Li, and F. Ruan. “Universally instance-optimal mechanisms for private statistical estimation”. In:The Thirty Seventh Annual Conference on Learning Theory. PMLR. 2024, pp. 221–259 (cit. on pp. 6, 44, 45). [AUZ23] H. Asi, J. R. Ullman, and L. Zakynthinou. “From Robustness to Privacy and Back”. In:Proc. 40th Internati...

work page arXiv 2024
[2]

Estimation beyond Missing (Completely) at Random

Springer, 1986 (cit. on p. 6). [MVBWS24] T. Ma, K. Verchand, T. Berrett, T. Wang, and R. Samworth. “Estimation beyond missing (completely) at random”. In:arXiv preprint arXiv:2410.10704(2024) (cit. on p. 47). [NP33] J. Neyman and E. S. Pearson. “On the Problem of the Most Efficient Tests of Statistical Hypotheses”. In:Philosophical Transactions of the Roy...

work page internal anchor Pith review Pith/arXiv arXiv 1986

[1] [1]

Universally instance-optimal mechanisms for private statistical estimation

[ADHLR24] H. Asi, J. Duchi, S. Haque, Z. Li, and F. Ruan. “Universally instance-optimal mechanisms for private statistical estimation”. In:The Thirty Seventh Annual Conference on Learning Theory. PMLR. 2024, pp. 221–259 (cit. on pp. 6, 44, 45). [AUZ23] H. Asi, J. R. Ullman, and L. Zakynthinou. “From Robustness to Privacy and Back”. In:Proc. 40th Internati...

work page arXiv 2024

[2] [2]

Estimation beyond Missing (Completely) at Random

Springer, 1986 (cit. on p. 6). [MVBWS24] T. Ma, K. Verchand, T. Berrett, T. Wang, and R. Samworth. “Estimation beyond missing (completely) at random”. In:arXiv preprint arXiv:2410.10704(2024) (cit. on p. 47). [NP33] J. Neyman and E. S. Pearson. “On the Problem of the Most Efficient Tests of Statistical Hypotheses”. In:Philosophical Transactions of the Roy...

work page internal anchor Pith review Pith/arXiv arXiv 1986