Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference
Pith reviewed 2026-06-30 17:35 UTC · model grok-4.3
The pith
Finite-sample bounds on false discovery proportions hold simultaneously for all rejection thresholds in conformal inference
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By sampling from the joint distribution of null conformal p-values, the authors construct a high-probability envelope for their empirical distribution function. This envelope yields finite-sample, distribution-free upper bounds on the false discovery proportion that are valid simultaneously over all possible rejection thresholds, thereby permitting arbitrary post hoc selection of the threshold while preserving statistical guarantees.
What carries the argument
High-probability envelope for the empirical distribution function of null conformal p-values, constructed by sampling from their joint distribution
Load-bearing premise
The joint distribution of the null conformal p-values can be sampled while maintaining the distribution-free guarantee.
What would settle it
Repeated simulations in which the constructed envelope fails to cover the empirical distribution function of the null p-values more often than the stated error level allows would disprove the simultaneous validity.
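A check of this kind can be sketched in Python. This is an illustration under stated assumptions, not the paper's procedure: the envelope is a plain constant-width band on the null count V(t) = #{j : p_j <= t}, the p-value convention (larger score means more atypical) is assumed, and the helper names `conformal_pvalues`, `sup_stat`, and `simulate` are hypothetical. The band is calibrated on uniform scores and then tested on exponential scores, so a violation rate well above delta would count as evidence against distribution-freeness.

```python
import numpy as np

def conformal_pvalues(cal_scores, test_scores):
    """p_j = (1 + #{i : cal_i >= test_j}) / (n + 1); assumes no ties."""
    cal = np.sort(cal_scores)
    n = len(cal)
    return (1 + n - np.searchsorted(cal, test_scores, side="left")) / (n + 1)

def sup_stat(pvals, n_cal):
    """max over t_k = k / (n_cal + 1) of V(t_k) - m * t_k."""
    m = len(pvals)
    grid = np.arange(1, n_cal + 2) / (n_cal + 1)
    k = np.rint(pvals * (n_cal + 1)).astype(int)          # p_j = k_j / (n_cal + 1)
    counts = np.cumsum(np.bincount(k, minlength=n_cal + 2)[1:])
    return np.max(counts - m * grid)

def simulate(n_cal, m, draw_scores, n_rep, rng):
    """Replicates of sup_stat when calibration and test scores are all null."""
    return np.array([
        sup_stat(conformal_pvalues(draw_scores(rng, n_cal), draw_scores(rng, m)), n_cal)
        for _ in range(n_rep)
    ])

rng = np.random.default_rng(1)
n_cal, m, delta, n_draws = 100, 50, 0.1, 999

# Calibrate the band width from Monte Carlo replicates (uniform scores).
ref = np.sort(simulate(n_cal, m, lambda r, k: r.random(k), n_draws, rng))
lam = ref[int(np.ceil((1 - delta) * (n_draws + 1))) - 1]

# Check coverage on a different score distribution (exponential scores).
fresh = simulate(n_cal, m, lambda r, k: r.standard_exponential(k), 2000, rng)
violation_rate = np.mean(fresh > lam)
print(f"band width {lam:.2f}, violation rate {violation_rate:.3f} (target <= {delta})")
```

The violation rate is itself a Monte Carlo estimate, so a small excess over delta in a single run is not conclusive; repeating the check with more replicates narrows that uncertainty.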
Original abstract
Modern applications of conformal inference to multiple testing problems, such as outlier detection and candidate selection, often involve selecting test samples whose conformal p-values fall below a threshold. The quality of such methods is often measured by the false discovery proportion (FDP), defined as the fraction of incorrect selections. Existing approaches typically control the expected value of the FDP, using methods such as the Benjamini-Hochberg procedure. This approach fails to provide high-probability bounds on the realized false discovery proportion and invalidates statistical guarantees if the rejection threshold is selected after inspecting the data. This paper establishes finite-sample, distribution-free upper bounds on the FDP that hold simultaneously over all possible rejection thresholds, enabling arbitrary post hoc selection of the threshold. Simultaneous validity is achieved by constructing a high-probability envelope for the empirical distribution function of null conformal p-values by sampling from their joint distribution. Furthermore, our framework allows practitioners to modulate the envelope's shape, thereby producing tight bounds in rejection regions of primary interest. We use this flexible approach to derive simultaneous FDP upper bounds for both outlier detection and conformal selection. We demonstrate through synthetic and real-data experiments that the resulting bounds are both valid and substantially less conservative than those derived from existing approaches.
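The step from an envelope to simultaneous FDP bounds can be illustrated with a short Python sketch. Everything here is an assumption made for illustration rather than the authors' construction: the envelope has the form m*t + lambda*w(t) with a weight w fixed in advance (here sqrt(t), which tightens the band at small thresholds), all m test points are conservatively treated as null when calibrating, scores are assumed tie-free, and the function names are hypothetical. If the null count V(t) stays below the envelope c(t) for every t, then FDP(t) <= min(1, c(t) / max(R(t), 1)) for every t, where R(t) is the number of rejections.

```python
import numpy as np

def null_pvalues(n_cal, n_test, rng):
    """One draw of conformal p-values when all test points are null (no ties)."""
    ranks = rng.permutation(n_cal + n_test)        # ranks stand in for scores
    cal = np.sort(ranks[:n_cal])
    return (1 + n_cal - np.searchsorted(cal, ranks[n_cal:])) / (n_cal + 1)

def counts_on_grid(pvals, n_cal):
    """#{j : p_j <= t_k} for t_k = k / (n_cal + 1), k = 1, ..., n_cal + 1."""
    k = np.rint(np.asarray(pvals) * (n_cal + 1)).astype(int)
    return np.cumsum(np.bincount(k, minlength=n_cal + 2)[1:])

def calibrate_envelope(n_cal, n_test, delta, weight, n_draws, rng):
    """Return (grid, c) so that V(t_k) <= c[k] for all k with prob. >= 1 - delta."""
    grid = np.arange(1, n_cal + 2) / (n_cal + 1)
    w = weight(grid)                               # strictly positive, chosen in advance
    stats = np.sort([
        np.max((counts_on_grid(null_pvalues(n_cal, n_test, rng), n_cal)
                - n_test * grid) / w)
        for _ in range(n_draws)
    ])
    rank = int(np.ceil((1 - delta) * (n_draws + 1)))   # finite-draw correction
    if rank > n_draws:
        raise ValueError("n_draws is too small for this delta")
    return grid, n_test * grid + stats[rank - 1] * w

def fdp_bounds(pvals, n_cal, envelope):
    """Upper bound on FDP(t_k) at every grid threshold simultaneously."""
    rejections = counts_on_grid(pvals, n_cal)
    return np.minimum(1.0, envelope / np.maximum(rejections, 1))

rng = np.random.default_rng(0)
n_cal, n_test = 200, 100
grid, env = calibrate_envelope(n_cal, n_test, 0.1, np.sqrt, 500, rng)

# Hypothetical test batch: 80 nulls plus 20 outliers with the smallest possible p-value.
pvals = np.concatenate([null_pvalues(n_cal, 80, rng), np.full(20, 1 / (n_cal + 1))])
bounds = fdp_bounds(pvals, n_cal, env)
print(f"bound at t = {grid[0]:.4f}: {bounds[0]:.3f}; at t = {grid[19]:.4f}: {bounds[19]:.3f}")
```

Because the bound holds at all grid thresholds at once, the threshold can be picked after looking at `bounds`, for example the largest t whose bound stays below a target level. Treating every test point as null is a deliberately crude choice made to keep the sketch short; it is valid but loses tightness when many test points are non-null.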
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to establish finite-sample, distribution-free upper bounds on the false discovery proportion (FDP) that hold simultaneously over all rejection thresholds in conformal inference for multiple testing. Simultaneous validity is obtained by constructing a high-probability envelope around the empirical distribution function of the null conformal p-values via sampling from their joint distribution; the envelope shape can be modulated for tighter control in regions of interest. The framework is applied to derive bounds for outlier detection and conformal selection, with synthetic and real-data experiments demonstrating validity and reduced conservatism relative to existing methods.
Significance. If the sampling step can be carried out while preserving distribution-freeness, the result would meaningfully advance beyond expectation-based procedures such as Benjamini-Hochberg by supplying high-probability FDP control and post-hoc threshold validity. The ability to shape the envelope is a practical feature. The experiments provide supporting evidence, but the overall significance hinges on a clear, assumption-free construction of the joint sampling procedure.
major comments (2)
- [Abstract (simultaneous validity paragraph)] The abstract states that the envelope is formed 'by sampling from their joint distribution', meaning the joint distribution of the null conformal p-values. Under exchangeability the joint law is a function of the unknown data-generating measure; the manuscript must exhibit an auxiliary, distribution-free Monte Carlo procedure (or equivalent construction) that does not require knowledge of this measure. Without an explicit, verifiable algorithm the finite-sample guarantee is conditional rather than unconditional.
- [Envelope construction (method section)] The central claim of simultaneous, everywhere-valid FDP bounds rests on the envelope construction. If the sampling step implicitly conditions on fitted quantities or additional modeling assumptions not stated in the abstract, the distribution-free property fails. The paper should supply the precise sampling algorithm together with a proof that the resulting envelope probability statement remains valid under the conformal exchangeability assumption alone.
minor comments (2)
- Clarify in the main text the precise mechanism by which the envelope shape is modulated and how the modulation parameter is chosen without data-dependent tuning that would invalidate the simultaneous guarantee.
- In the experimental sections, report the number of Monte-Carlo draws used to approximate the envelope and any convergence diagnostics; this information is needed for reproducibility.
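To make concrete why the number of Monte Carlo draws matters, the sketch below shows one standard way a finite number of draws can enter the guarantee. It is an assumption that the paper uses anything like this; the function name `conservative_rank` is hypothetical. If the observed statistic is exchangeable with the B Monte Carlo replicates, it exceeds the r-th smallest replicate with probability at most delta whenever r = ceil((1 - delta)(B + 1)), so the rank depends on B and very small B cannot support a small delta.

```python
import math

def conservative_rank(delta, n_draws):
    """1-based order statistic of n_draws replicates to use as a (1 - delta) threshold."""
    # round() guards against floating-point noise such as 90.00000000000001
    rank = math.ceil(round((1 - delta) * (n_draws + 1), 9))
    if rank > n_draws:
        raise ValueError("n_draws is too small to support this delta")
    return rank

for n_draws in (99, 999, 9999):
    r = conservative_rank(0.1, n_draws)
    print(n_draws, r, f"{r / n_draws:.4f}")
```

The printed ratio approaches 1 - delta as the number of draws grows, which is one reason reporting that number helps readers judge how conservative the calibrated envelope is.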
Simulated Author's Rebuttal
We thank the referee for the careful reading and for highlighting the need for an explicit, verifiable description of the sampling procedure. The joint distribution of the null conformal p-values is in fact distribution-free under exchangeability alone (via the uniform permutation model on scores), so the Monte Carlo envelope construction preserves the unconditional finite-sample guarantee. We will revise the manuscript to make the algorithm and its validity proof fully explicit.
- Referee: [Abstract (simultaneous validity paragraph)] The abstract states that the envelope is formed 'by sampling from their joint distribution', meaning the joint distribution of the null conformal p-values. Under exchangeability the joint law is a function of the unknown data-generating measure; the manuscript must exhibit an auxiliary, distribution-free Monte Carlo procedure (or equivalent construction) that does not require knowledge of this measure. Without an explicit, verifiable algorithm the finite-sample guarantee is conditional rather than unconditional.
  Authors: We respectfully note that the premise is incorrect: under the exchangeability assumption the scores are exchangeable, so their relative ordering is uniformly distributed over all permutations independently of the data-generating measure. The conformal p-values are deterministic functions of these ranks; hence their joint law is known and distribution-free. The auxiliary sampling procedure draws Monte Carlo replicates by generating random permutations, computing the induced p-value vectors, and forming the envelope from these replicates. This requires no knowledge of the underlying measure. We will revise the abstract for clarity and add an explicit algorithm box plus a short proof subsection in the methods.
  Revision: yes
- Referee: [Envelope construction (method section)] The central claim of simultaneous, everywhere-valid FDP bounds rests on the envelope construction. If the sampling step implicitly conditions on fitted quantities or additional modeling assumptions not stated in the abstract, the distribution-free property fails. The paper should supply the precise sampling algorithm together with a proof that the resulting envelope probability statement remains valid under the conformal exchangeability assumption alone.
  Authors: The sampling algorithm is the permutation Monte Carlo described above; it conditions on nothing beyond the exchangeability of the calibration and test scores and does not use any fitted model quantities. Because the permutation distribution is exactly the law of the ranks under exchangeability, the high-probability envelope statement holds unconditionally. We will insert the precise algorithm (including pseudocode) and the accompanying validity argument into the methods section of the revised manuscript.
  Revision: yes
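The permutation sampling described in the rebuttal can be sketched in a few lines of Python. This is a minimal illustration, not the authors' algorithm: the function name `sample_null_pvalues` is hypothetical, scores are assumed exchangeable and tie-free, every test point is taken to be null, and the p-value is assumed to be (1 + number of calibration scores at least as large as the test score) / (n + 1). Since only ranks matter under these assumptions, a uniform random permutation stands in for the scores.

```python
import numpy as np

def sample_null_pvalues(n_cal, n_test, rng):
    """Draw one vector of conformal p-values when every test point is null."""
    ranks = rng.permutation(n_cal + n_test)   # uniform random ranks replace the scores
    cal = np.sort(ranks[:n_cal])              # first n_cal entries play the calibration set
    test = ranks[n_cal:]
    # Number of calibration ranks that are >= each test rank (no ties by construction).
    n_geq = n_cal - np.searchsorted(cal, test, side="left")
    return (1 + n_geq) / (n_cal + 1)

rng = np.random.default_rng(0)
draws = np.stack([sample_null_pvalues(50, 20, rng) for _ in range(2000)])
print(draws.shape, draws.mean())
```

Each row of `draws` is one replicate of the joint null p-value vector; no data and no fitted model enter the function, which is the sense in which the sampling step is distribution-free. The p-values within a row are dependent because they share one calibration set, which is why the joint law, rather than the marginal uniform law, has to be sampled.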
Circularity Check
No significant circularity; derivation is constructive and self-contained
The paper constructs simultaneous FDP bounds via a high-probability envelope on the empirical CDF of null conformal p-values, obtained by sampling their joint distribution. This is presented as a direct statistical procedure relying on exchangeability properties standard in conformal inference, without any reduction of the claimed bounds to fitted parameters, self-definitions, or load-bearing self-citations within the provided text. No equations or steps equate the output bounds to the inputs by construction. The sampling step is an input assumption of the framework rather than a derived claim that collapses into itself. This is the common case of an independent methodological contribution.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: null conformal p-values admit sampling from their joint distribution.