
arxiv: 2509.03297 · v2 · submitted 2025-09-03 · 📊 stat.ME · stat.ML

Feedback-Enhanced Online Multiple Testing with Applications to Conformal Selection

Pith reviewed 2026-05-18 19:31 UTC · model grok-4.3

keywords: online multiple testing · false discovery rate · feedback · alpha investing · conformal selection · sequential decision making

The pith

GAIF uses revealed hypothesis outcomes to dynamically adjust testing thresholds while preserving finite-sample FDR control.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces GAIF, a feedback-enhanced version of generalized alpha-investing for online multiple testing. In this setting decisions on hypotheses are made one after another, and the true status of each hypothesis becomes known afterward, either right away or after some delay. GAIF feeds those revealed outcomes back into the threshold-setting rule so that future decisions can be made more aggressively without breaking the finite-sample false discovery rate guarantee. The same feedback idea is then applied to online conformal selection: independent conformal p-values are built, and a feedback-driven rule picks the strongest scoring model or function at each step to raise power.

Core claim

GAIF is a generalized alpha-investing procedure that receives the true label of each tested hypothesis after the rejection decision has been issued. It uses the observed label to update the remaining alpha budget for all future tests, thereby producing data-dependent thresholds that still guarantee finite-sample FDR or marginal FDR control. When the same feedback loop is attached to a stream of conformal p-values, a model-selection step chooses the score function that has performed best on the already-revealed labels, which increases the number of discoveries while the FDR bound remains intact.

What carries the argument

GAIF, the feedback-enhanced generalized alpha-investing rule that updates the alpha investment level after each revealed outcome and then sets the next rejection threshold from the updated budget.
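The loop described above (set a threshold from the current budget, decide, then fold the revealed label back into the budget) can be illustrated with a toy tracker. This is only a sketch: the class name, the default parameter values, and in particular the refund rule in `feedback` are assumptions made for illustration, not the paper's GAIF update, and no FDR guarantee is claimed for it.

```python
from dataclasses import dataclass


@dataclass
class ToyFeedbackInvestor:
    """Toy alpha-wealth tracker with a feedback step (illustrative only)."""

    wealth: float = 0.05      # initial alpha-wealth (assumed value)
    spend_frac: float = 0.1   # share of current wealth bet on each test (assumption)
    reward: float = 0.05      # wealth earned per rejection (assumption)

    def decide(self, p_value):
        """Spend part of the wealth on one test; return (reject, amount_spent)."""
        spent = self.spend_frac * self.wealth
        reject = p_value <= spent
        self.wealth += (self.reward if reject else 0.0) - spent
        return reject, spent

    def feedback(self, spent, is_null):
        """Hypothetical refund rule: return the stake once the label shows a true non-null."""
        if not is_null:
            self.wealth += spent


inv = ToyFeedbackInvestor()
reject, spent = inv.decide(p_value=0.001)   # threshold is 0.1 * 0.05 = 0.005
inv.feedback(spent, is_null=False)          # label revealed after the decision
next_threshold = inv.spend_frac * inv.wealth
```

The point of the sketch is the ordering: the label arrives only after `decide` has returned, and it changes the threshold of later tests, never the decision already issued.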

If this is right

  • Sequential testing procedures can now incorporate delayed outcome information without sacrificing finite-sample FDR control.
  • Conformal selection gains an automatic way to switch between candidate score functions using only the revealed labels.
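  • The feedback mechanism is formulated at the level of generalized alpha-investing, so it appears to cover rules in that family such as LORD++ and SAFFRON; the paper notes that feedback cannot be used directly to improve e-LOND.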
  • Power gains are largest when the revealed outcomes are informative about the remaining hypotheses.
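Switching between candidate score functions using only revealed labels might look like the sketch below. The selection criterion (a rank-based AUC on the revealed pairs) and the names `auc` and `select_score_function` are assumptions chosen for illustration; the paper's actual criterion is not reproduced here, and the sketch says nothing about whether the resulting p-values stay valid.

```python
def auc(scores, labels):
    """Rank-based AUC: chance that a label-1 point outscores a label-0 point (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.5  # no information yet, so treat as uninformative
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_score_function(candidates, revealed_x, revealed_y):
    """Return the name of the candidate with the highest AUC on revealed data."""
    return max(
        candidates,
        key=lambda name: auc([candidates[name](x) for x in revealed_x], revealed_y),
    )


# Two hypothetical candidates scored on four revealed points.
candidates = {"identity": lambda x: x, "negated": lambda x: -x}
revealed_x = [0.1, 0.2, 0.8, 0.9]
revealed_y = [0, 0, 1, 1]
best = select_score_function(candidates, revealed_x, revealed_y)
```

Only already-revealed pairs enter the choice, which is the property the referee report later asks the authors to formalize.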

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be paired with bandit-style allocation rules that decide which hypotheses to test next on the basis of past feedback.
  • In streaming data settings the procedure might be run with a sliding window on the revealed labels to adapt to distribution drift.
  • Because conformal p-values are constructed to be independent of the selection rule, the feedback-driven model choice remains valid under the same marginal FDR bound.

Load-bearing premise

After each decision the true state of the hypothesis is revealed and can be fed back into the threshold rule without destroying the finite-sample FDR guarantee.

What would settle it

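Run GAIF on streams of hypotheses whose true labels are generated adversarially after each decision, and repeat over many independent replications. Because FDR is the expectation of the false discovery proportion, a single run that exceeds the nominal level is not by itself a violation; the guarantee fails only if the false discovery proportion, averaged over replications, exceeds the nominal FDR level at some finite horizon.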
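A check of this kind has to average the false discovery proportion over replications, since FDR is an expectation. The harness below is a minimal sketch of that averaging step. The stream generator and the stand-in `alpha_spending` procedure are placeholders (assumptions), not GAIF and not an adversarial label mechanism; either could be swapped for the real thing.

```python
import random


def fdp(rejected, is_null):
    """False discovery proportion: false rejections / max(total rejections, 1)."""
    n_rejected = sum(rejected)
    n_false = sum(1 for r, h0 in zip(rejected, is_null) if r and h0)
    return n_false / max(n_rejected, 1)


def estimate_fdr(procedure, make_stream, n_reps=200, seed=0):
    """Monte Carlo FDR estimate: the mean FDP over independent replications."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        p_values, is_null = make_stream(rng)
        total += fdp(procedure(p_values), is_null)
    return total / n_reps


def toy_stream(rng, horizon=50, null_prob=0.5):
    """Placeholder generator: uniform p-values for nulls, small ones for non-nulls."""
    is_null = [rng.random() < null_prob for _ in range(horizon)]
    p_values = [rng.random() if h0 else 1e-4 * rng.random() for h0 in is_null]
    return p_values, is_null


def alpha_spending(p_values, alpha=0.1):
    """Stand-in procedure (not GAIF): reject when p_t <= alpha * 2^(-t), t = 1, 2, ..."""
    return [p <= alpha * 2.0 ** -(t + 1) for t, p in enumerate(p_values)]


fdr_hat = estimate_fdr(alpha_spending, toy_stream)
```

Comparing `fdr_hat` with the nominal level, together with its Monte Carlo standard error, is what separates a genuine violation from ordinary run-to-run variation.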

Figures

Figures reproduced from arXiv: 2509.03297 by Changliang Zou, Haojie Ren, Lin Lu, Yuyang Huo, Zhaojun Wang.

Figure 1. Testing thresholds $\{\alpha_t\}$ over time $t$ for various procedures applied to Gaussian observations. It is clear that our methods yield larger thresholds after improving the gap via feedback, with $\alpha_t^{\mathrm{SF}} > \alpha_t^{\mathrm{SAFFRON}}$ and $\alpha_t^{\mathrm{LF}} > \alpha_t^{\mathrm{LORD++}}$ on average. This illustrates that the GAIF framework leverages alpha-wealth more effectively through feedback, thereby achieving higher power than the traditional GA…
Figure 2. Results for Scenario I and Scenario II. Line charts of FDR and Power at stopping time with varying non-null proportion $\pi_1$ from 0.1 to 0.8 after 500 replications. The black dashed lines denote the FDR level $\alpha = 0.1$. Shaded areas show ±1 standard error. [Plot residue: panels FDR and Power against $\pi_1$; legend methods SFdep, SF, LFdep, LF, SAFFRONdep, SAFFRON, LORDdep, LORD++, L…]
Figure 3. Results for Scenario III: FDR and power at stopping time across 500 replications with non-null proportion $\pi_1$ ranging from 0.1 to 0.8. The black dashed line indicates the target FDR level $\alpha = 0.1$. Shaded areas show ±1 standard error.
Text adjoining the figure in the source (Scenario IV): Data is generated as $X \mid Y = 0 \sim N_4(\mu_1, I_4)$ and $X \mid Y = 1 \sim N_4(\mu_2, I_4)$, where $\mu_1 = (1, 0, 0, 0)^\top$ and $\mu_2 = (0, 0, -2, -2)^\top$. The target region is $A = \{1\}$. We set t…
Figure 4. Online FDR and power at the stopping time $T$ under Scenario IV across varying non-null proportions $\pi_1 \in [0.1, 0.8]$. All methods control the FDR below the nominal level $\alpha$, with SF aligning most closely with the target level among all competitors. In terms of power, as expected, SF consistently achieves the highest power, while LF also performs competitively and attains higher power than SAFFRON a…
Figure 5. Results for Scenario V (sine pattern shifts): the values of FDR($t$) and Power($t$) across different times $t$. The black dashed lines denote the FDR level $\alpha = 0.1$. Shaded areas show ±1 standard error.
Text adjoining the figure in the source (Section 5, Real Data Applications): In this section, we evaluate our proposed methods on four real-world datasets, illustrating their practical benefits in diverse online decision-making tasks. Task 1: Online Candidate Scre…
Figure 6. Results for real-data applications: the values of FDR(δ t) and Power(δ t) over time $t$ for six benchmarks. The black dashed lines indicate the FDR level $\alpha = 0.3$. Shaded areas show ±1 standard error.
Figure 7. The FDR and Power for SAFFRON at stopping time 600 under different values of $\lambda$ for target FDR level $\alpha = 0.1$. The red lines denote the results for the feedback variant of the method.
Text adjoining the figure in the source (Section D, Extensions of GAIF based on e-values): Although feedback cannot be directly used to improve e-LOND (Xu and Ramdas, 2024), it can enhance e-LORD and e-SAFFRON (Zhang et al., 2025) through a feedback-driven …
Figure 8. Results for Scenario VI: values of FDR($T$) and Power($T$) at stopping time $T$ across different non-null proportions $\pi_1$. The black dashed line denotes the FDR level $\alpha = 0.2$.
Text adjoining the figure in the source: The results for Scenarios IV and VI under different training algorithms (RF, SVM, and NN) with varying initial calibration sizes are presented in …
Figure 9. Results for Scenario IV: FDR($T$) and Power($T$) vs. initial calibration size $n$ ($\pi_1 = 0.5$, $\alpha = 0.2$). [Plot residue: columns RF, SVM, NN; rows FDR and Power against $n$; legend methods SF, LF, SFS, LFS, SAFFRON, LORD++, LOND]
Figure 10. Results for Scenario VI: FDR($T$) and Power($T$) vs. initial calibration size $n$ ($\pi_1 = 0.5$, $\alpha = 0.2$).
Text adjoining the figure in the source: The results with model selection for Scenarios IV and VI are shown below.
Figure 11. Results for Scenario IV: the values of FDR($T$) and Power($T$) at stopping time $T$ across different non-null proportions $\pi_1$. The black dashed lines denote the FDR level $\alpha = 0.1$. [Plot residue: panels FDR and Power against $\pi_1$; legend methods Opt-LF, Ran-LF, Opt-LFS, Ran-LFS, LORD++]
Figure 12. Results for Scenario VI: the values of FDR($T$) and Power($T$) at stopping time $T$ across different non-null proportions $\pi_1$. The black dashed lines denote the FDR level $\alpha = 0.1$.
Figure 13. Results for Scenario III (local dependence): Line charts of mFDR and FDR at stopping time with varying non-null proportion $\pi_1$ from 0.1 to 0.8. The black dashed lines denote the target FDR level $\alpha = 0.1$. [Plot residue: panels mFDR and FDR against $\pi_1$, titled Scenario IV and Scenario VI]
Figure 14. Results for Scenario IV and Scenario VI: Line charts of mFDR and FDR at stopping time with varying non-null proportion $\pi_1$ from 0.1 to 0.8 after 500 replications. The black dashed lines denote the target FDR level $\alpha = 0.2$.
Original abstract

We study online multiple testing with feedback, where decisions are made sequentially and the true state of the hypothesis is revealed after the decision has been made, either instantly or with a delay. We propose GAIF, a feedback-enhanced generalized alpha-investing framework that dynamically adjusts thresholds using revealed outcomes, ensuring finite-sample false discovery rate (FDR)/marginal FDR control. Extending GAIF to online conformal testing, we construct independent conformal $p$-values and introduce a feedback-driven model selection criterion to identify the best model/score, thereby improving statistical power. We demonstrate the effectiveness of our methods through numerical simulations and real-data applications.
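For readers unfamiliar with conformal p-values, the standard split-conformal construction can be sketched as follows. This is the textbook formula under the assumed convention that a larger score is more extreme; the function name is hypothetical, and the paper's specific construction that makes the online p-values independent is not reproduced here.

```python
def conformal_p_value(calib_scores, test_score):
    """Split-conformal p-value: (1 + #{i : s_i >= s_test}) / (n + 1).

    Assumes larger scores are more extreme and that the calibration
    scores are exchangeable with the test score under the null.
    """
    n = len(calib_scores)
    n_at_least = sum(1 for s in calib_scores if s >= test_score)
    return (1 + n_at_least) / (n + 1)


calib = [0.1, 0.4, 0.7, 0.9]
p_mid = conformal_p_value(calib, 0.8)    # one calibration score is >= 0.8
p_high = conformal_p_value(calib, 1.0)   # no calibration score is >= 1.0
p_low = conformal_p_value(calib, 0.0)    # every calibration score is >= 0.0
```

The `+ 1` in numerator and denominator counts the test point itself, so the smallest attainable p-value with `n` calibration points is `1 / (n + 1)`.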

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes GAIF, a feedback-enhanced generalized alpha-investing procedure for online multiple testing in which the true state of each hypothesis is revealed after the decision (instantly or with delay). It claims that GAIF dynamically updates thresholds using these revelations while preserving finite-sample FDR and marginal FDR control. The method is extended to online conformal testing by constructing independent conformal p-values and introducing a feedback-driven model selection step that improves power. Numerical simulations and real-data examples are provided to illustrate gains over non-feedback baselines.

Significance. If the finite-sample control result holds under delayed feedback, the work would meaningfully extend alpha-investing ideas to realistic sequential settings with post-decision revelations, offering a practical route to higher power in conformal selection and related online testing problems. The conformal extension and empirical demonstrations are clear strengths.

major comments (2)
  1. [§4.1, Theorem 2] §4.1, Theorem 2 (FDR control under delay): the supermartingale argument for the alpha-wealth process is stated for the immediate-revelation filtration; the extension to delayed feedback requires an explicit re-derivation showing that the wealth increment remains a supermartingale when the revelation for hypothesis i arrives after decisions for some j > i have already been made. Without this step the finite-sample bound does not automatically carry over.
  2. [§5.3] §5.3, conformal p-value construction: the independence claim for the conformal p-values under the feedback-driven model selection criterion is not accompanied by a precise statement of the filtration or conditioning that prevents the selection step from introducing dependence between the p-value for the current hypothesis and the feedback used to choose the score function.
minor comments (2)
  1. [§3] Notation for the delayed revelation time τ_i is introduced in §3 but used inconsistently in the algorithm pseudocode; a single consistent definition would improve readability.
  2. [Figure 3] Figure 3 caption does not specify the delay distribution used in the simulation; adding this detail would allow readers to reproduce the power curves.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. We address each major comment below and indicate the revisions we will incorporate.

Point-by-point responses
  1. Referee: [§4.1, Theorem 2] §4.1, Theorem 2 (FDR control under delay): the supermartingale argument for the alpha-wealth process is stated for the immediate-revelation filtration; the extension to delayed feedback requires an explicit re-derivation showing that the wealth increment remains a supermartingale when the revelation for hypothesis i arrives after decisions for some j > i have already been made. Without this step the finite-sample bound does not automatically carry over.

    Authors: We agree that the supermartingale argument as currently written is developed explicitly under immediate revelation. In the revision we will add a dedicated subsection that re-derives the supermartingale property under delayed feedback. The argument will use a filtration that includes all decisions made before the delayed revelation arrives and will verify that the wealth increment remains a supermartingale with respect to this filtration, thereby preserving the finite-sample FDR bound. revision: yes

  2. Referee: [§5.3] §5.3, conformal p-value construction: the independence claim for the conformal p-values under the feedback-driven model selection criterion is not accompanied by a precise statement of the filtration or conditioning that prevents the selection step from introducing dependence between the p-value for the current hypothesis and the feedback used to choose the score function.

    Authors: We will expand Section 5.3 with an explicit statement of the filtration and the conditioning argument. The model-selection step is measurable with respect to the sigma-field generated by all previous revelations and decisions; the conformal p-value for the current hypothesis is then constructed from a score function chosen conditionally on that sigma-field. Because the conformal scores remain exchangeable under the null conditionally on the selected model, the resulting p-value is independent of the selection step and valid for the online procedure. revision: yes
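The delayed-feedback situation discussed in the first exchange, where the label for hypothesis i may arrive after decisions for some j > i, can be made concrete with a small bookkeeping sketch. The class name and interface are hypothetical; the reveal time plays the role of the report's τ_i, and the sketch only tracks which labels are available at each step, not how a procedure should use them.

```python
import heapq


class DelayedFeedbackBuffer:
    """Holds labels until their reveal time; releases those with tau_i <= now."""

    def __init__(self):
        self._pending = []  # min-heap of (reveal_time, index, is_null)

    def submit(self, index, reveal_time, is_null):
        """Register the label of hypothesis `index`, visible from `reveal_time` on."""
        heapq.heappush(self._pending, (reveal_time, index, is_null))

    def release(self, now):
        """Pop and return (index, is_null) for every label with reveal_time <= now."""
        released = []
        while self._pending and self._pending[0][0] <= now:
            _, index, is_null = heapq.heappop(self._pending)
            released.append((index, is_null))
        return released


buf = DelayedFeedbackBuffer()
buf.submit(index=1, reveal_time=3, is_null=True)    # label for test 1 arrives late
buf.submit(index=2, reveal_time=2, is_null=False)   # label for test 2 arrives first
at_1 = buf.release(now=1)
at_2 = buf.release(now=2)
at_5 = buf.release(now=5)
```

In this example the label of hypothesis 2 becomes available before that of hypothesis 1, which is exactly the out-of-order case any filtration for delayed feedback has to accommodate.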

Circularity Check

0 steps flagged

No significant circularity: GAIF extends alpha-investing, and its FDR control under feedback is derived independently rather than assumed.

Full rationale

The paper presents GAIF as an extension of generalized alpha-investing that incorporates revealed outcomes (instant or delayed) to dynamically adjust thresholds while preserving finite-sample FDR/mFDR control. This control is derived from the supermartingale property of the alpha-wealth process under the feedback filtration, building on but not reducing to prior alpha-investing results. No step equates the claimed guarantee to a fitted parameter or self-citation by construction; the feedback update rule and conformal p-value construction introduce new elements that are explicitly re-derived for the delayed case. The framework remains self-contained against external benchmarks for FDR control.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework relies on standard definitions of FDR and marginal FDR plus the assumption that feedback provides accurate ground truth. No new free parameters or invented entities are explicitly introduced in the abstract.
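For reference, one common convention for these error rates in the online testing literature is written out below. The symbols are assumptions introduced here: $V(T)$ is the number of false rejections and $R(T)$ the total number of rejections up to time $T$. The paper's own notation may differ.

```latex
\[
\mathrm{FDP}(T) = \frac{V(T)}{R(T) \vee 1}, \qquad
\mathrm{FDR}(T) = \mathbb{E}\bigl[\mathrm{FDP}(T)\bigr], \qquad
\mathrm{mFDR}(T) = \frac{\mathbb{E}[V(T)]}{\mathbb{E}[R(T) \vee 1]} .
\]
```

FDR averages the ratio, whereas mFDR takes the ratio of the averages, so the two can differ for the same procedure.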

axioms (1)
  • Domain assumption: Revealed outcomes after each decision are accurate and can be used for threshold updates without breaking finite-sample FDR control.
    This is the core setting that enables the feedback mechanism described in the abstract.

pith-pipeline@v0.9.0 · 5636 in / 1072 out tokens · 28257 ms · 2026-05-18T19:31:33.275028+00:00 · methodology



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Beyond Fixed False Discovery Rates: Post-Hoc Conformal Selection with E-Variables

    cs.LG · 2026-04 · unverdicted · novelty 7.0

    Post-hoc conformal selection creates a path of selection sets with estimated false discovery proportions, enabling data-driven adaptive FDR control with average reliability guarantees via e-variables and e-BH.

Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · cited by 1 Pith paper · 2 internal anchors

  1. [1]

    Generalized α-investing: definitions, optimality results and application to public databases

    Ehud Aharoni and Saharon Rosset. Generalized α-investing: definitions, optimality results and application to public databases. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(4):771–794, 2014

  2. [2]

    Theoretical Foundations of Conformal Prediction

    Anastasios N Angelopoulos, Rina Foygel Barber, and Stephen Bates. Theoretical foundations of conformal prediction. arXiv preprint arXiv:2411.11824, 2024

  3. [3]

    Optimized conformal selection: Powerful selective inference after conformity score optimization

    Tian Bai and Ying Jin. Optimized conformal selection: Powerful selective inference after conformity score optimization. arXiv preprint arXiv:2411.17983, 2024

  4. [4]

    Testing for outliers with conformal p-values

    Stephen Bates, Emmanuel Candès, Lihua Lei, Yaniv Romano, and Matteo Sesia. Testing for outliers with conformal p-values. The Annals of Statistics, 51(1):149–178, 2023

  5. [5]

    Adult income investigation

    Barry Becker and Ronny Kohavi. Adult income investigation. UCI Machine Learning Repository, https://archive-beta.ics.uci.edu/dataset/2/adult, 1996

  6. [6]

    Controlling the false discovery rate: a practical and powerful approach to multiple testing

    Yoav Benjamini and Yosef Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 57(1):289–300, 1995

  7. [7]

    Airfoil Self-Noise

    Thomas Brooks, D Pope, and Michael Marcolini. Airfoil Self-Noise. UCI Machine Learning Repository, https://archive.ics.uci.edu/dataset/291/airfoil+self+noise, 2014

  8. [8]

    On-line consistent ranking on e-recruitment: seeking the truth behind a well-formed cv

    Evanthia Faliagka, Lazaros Iliadis, Ioannis Karydis, Maria Rigou, Spyros Sioutas, Athanasios Tsakalidis, and Giannis Tzimas. On-line consistent ranking on e-recruitment: seeking the truth behind a well-formed cv. Artificial Intelligence Review, 42(3):515–528, 2014

  9. [9]

    Online generalizations of the e-BH and BH procedure

    Lasse Fischer, Ziyu Xu, and Aaditya Ramdas. Online generalizations of the e-BH and BH procedure. arXiv preprint arXiv:2407.20683, 2024

  10. [10]

    Online false discovery rate control for LORD++ and SAFFRON under positive, local dependence

    Aaron Fisher. Online false discovery rate control for LORD++ and SAFFRON under positive, local dependence. Biometrical Journal, 66(1):2300177, 2024

  11. [11]

    α-investing: a procedure for sequential control of expected false discoveries

    Dean P Foster and Robert A Stine. α-investing: a procedure for sequential control of expected false discoveries. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(2):429–444, 2008

  12. [12]

    Structure-adaptive sequential testing for online false discovery rate control

    Bowen Gang, Wenguang Sun, and Weinan Wang. Structure-adaptive sequential testing for online false discovery rate control. Journal of the American Statistical Association, pages 1–14, 2021

  13. [13]

    Conformal online model aggregation

    Matteo Gasparin and Aaditya Ramdas. Conformal online model aggregation. arXiv preprint arXiv:2403.15527, 2024

  14. [14]

    Adaptive conformal inference under distribution shift

    Isaac Gibbs and Emmanuel Candès. Adaptive conformal inference under distribution shift. Advances in Neural Information Processing Systems, 34:1660–1672, 2021

  15. [15]

    Conformal inference for online prediction with arbitrary distribution shifts

    Isaac Gibbs and Emmanuel J Candès. Conformal inference for online prediction with arbitrary distribution shifts. Journal of Machine Learning Research, 25(162):1–36, 2024

  16. [16]

    Conformal alignment: Knowing when to trust foundation models with guarantees

    Yu Gui, Ying Jin, and Zhimei Ren. Conformal alignment: Knowing when to trust foundation models with guarantees. Advances in Neural Information Processing Systems, 37:73884–73919, 2024

  17. [17]

    ACS: An interactive framework for conformal selection

    Yu Gui, Ying Jin, Yash Nair, and Zhimei Ren. ACS: An interactive framework for conformal selection. arXiv preprint arXiv:2507.15825, 2025

  18. [18]

    A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

    Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 43(2):1–55, 2025

  19. [19]

    Online selective conformal inference: adaptive scores, convergence rate and optimality

    Pierre Humbert, Ulysse Gazin, Ruth Heller, and Etienne Roquain. Online selective conformal inference: adaptive scores, convergence rate and optimality. arXiv preprint arXiv:2508.10336, 2025

  20. [20]

    Real-time selection under general constraints via predictive inference

    Yuyang Huo, Lin Lu, Haojie Ren, and Changliang Zou. Real-time selection under general constraints via predictive inference. Advances in Neural Information Processing Systems, 37:61267–61305, 2024

  21. [21]

    On Online Control of False Discovery Rate

    Adel Javanmard and Andrea Montanari. On online control of false discovery rate. arXiv preprint arXiv:1502.06197, 2015

  22. [22]

    Online rules for control of false discovery rate and false discovery exceedance

    Adel Javanmard and Andrea Montanari. Online rules for control of false discovery rate and false discovery exceedance. The Annals of Statistics, 46(2):526–554, 2018

  23. [23]

    Model-free selective inference under covariate shift via weighted conformal p-values

    Ying Jin and Emmanuel J Candès. Model-free selective inference under covariate shift via weighted conformal p-values. arXiv preprint arXiv:2307.09291, 2023a

  24. [24]

    Selection by prediction with conformal p-values

    Ying Jin and Emmanuel J Candès. Selection by prediction with conformal p-values. Journal of Machine Learning Research, 24(244):1–41, 2023b

  25. [25]

    Candidate selection dataset

    Kaggle. Candidate selection dataset. https://www.kaggle.com/datasets/tarunchilkur/client, 2020

  26. [26]

    Diabetes health indicators dataset

    Kaggle. Diabetes health indicators dataset. https://www.kaggle.com/datasets/alexteboul/diabetes-health-indicators-dataset, 2021

  27. [27]

    FDR control for online anomaly detection

    Etienne Krönert, Alain Célisse, and Dalila Hattab. FDR control for online anomaly detection. arXiv preprint arXiv:2312.01969, 2023

  28. [28]

    Adaptive novelty detection with false discovery rate guarantee

    Ariane Marandon, Lihua Lei, David Mary, and Etienne Roquain. Adaptive novelty detection with false discovery rate guarantee. The Annals of Statistics, 52(1):157–183, 2024

  29. [29]

    Diversifying conformal selections

    Yash Nair, Ying Jin, James Yang, and Emmanuel Candès. Diversifying conformal selections. arXiv preprint arXiv:2506.16229, 2025

  30. [30]

    WATCH: Adaptive monitoring for AI deployments via weighted-conformal martingales

    Drew Prinster, Xing Han, Anqi Liu, and Suchi Saria. WATCH: Adaptive monitoring for AI deployments via weighted-conformal martingales. In Forty-second International Conference on Machine Learning, 2025

  31. [31]

    Online control of the false discovery rate with decaying memory

    Aaditya Ramdas, Fanny Yang, Martin J Wainwright, and Michael I Jordan. Online control of the false discovery rate with decaying memory. Advances in Neural Information Processing Systems, 30:5655–5664, 2017

  32. [32]

    SAFFRON: an adaptive algorithm for online control of the false discovery rate

    Aaditya Ramdas, Tijana Zrnic, Martin Wainwright, and Michael Jordan. SAFFRON: an adaptive algorithm for online control of the false discovery rate. In International Conference on Machine Learning, pages 4286–4294. PMLR, 2018

  33. [33]

    A unified treatment of multiple testing with prior knowledge using the p-filter

    Aaditya K Ramdas, Rina F Barber, Martin J Wainwright, and Michael I Jordan. A unified treatment of multiple testing with prior knowledge using the p-filter. The Annals of Statistics, 47(5):2790–2821, 2019

  34. [34]

    Online false discovery rate control for anomaly detection in time series

    Quentin Rebjock, Baris Kurt, Tim Januschowski, and Laurent Callot. Online false discovery rate control for anomaly detection in time series. Advances in Neural Information Processing Systems, 34:26487–26498, 2021

  35. [35]

    Online error rate control for platform trials

    David S Robertson, James MS Wason, Franz König, Martin Posch, and Thomas Jaki. Online error rate control for platform trials. Statistics in Medicine, 42(14):2475–2495, 2023

  36. [36]

    Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach

    John D Storey, Jonathan E Taylor, and David Siegmund. Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. Journal of the Royal Statistical Society Series B: Statistical Methodology, 66(1):187–205, 2004

  37. [37]

    ADDIS: an adaptive discarding algorithm for online FDR control with conservative nulls

    Jinjin Tian and Aaditya Ramdas. ADDIS: an adaptive discarding algorithm for online FDR control with conservative nulls. Advances in Neural Information Processing Systems, 32:9388–9396, 2019

  38. [38]

    Testing randomness online

    Vladimir Vovk. Testing randomness online. Statistical Science, 36(4):595–611, 2021

  39. [39]

    Testing exchangeability on-line

    Vladimir Vovk, Ilia Nouretdinov, and Alexander Gammerman. Testing exchangeability on-line. In Proceedings of the 20th International Conference on Machine Learning, pages 768–775, 2003

  40. [40]

    Algorithmic learning in a random world

    Vladimir Vovk, Alexander Gammerman, and Glenn Shafer. Algorithmic learning in a random world. New York: Springer, 2005

  41. [41]

    Conformalized multiple testing after data-dependent selection

    Xiaoning Wang, Yuyang Huo, Liuhua Peng, and Changliang Zou. Conformalized multiple testing after data-dependent selection. Advances in Neural Information Processing Systems, 37:58574–58609, 2024

  42. [42]

    Optimal subsampling via predictive inference

    Xiaoyang Wu, Yuyang Huo, Haojie Ren, and Changliang Zou. Optimal subsampling via predictive inference. Journal of the American Statistical Association, 119(548):2844–2856, 2024

  43. [43]

    Conditional testing based on localized conformal p-values

    Xiaoyang Wu, Lin Lu, Zhaojun Wang, and Changliang Zou. Conditional testing based on localized conformal p-values. In The Thirteenth International Conference on Learning Representations, 2025

  44. [44]

    Online multiple testing with e-values

    Ziyu Xu and Aaditya Ramdas. Online multiple testing with e-values. In International Conference on Artificial Intelligence and Statistics, pages 3997–4005. PMLR, 2024

  45. [45]

    Automs: automatic model selection for novelty detection with error rate control

    Yifan Zhang, Haiyan Jiang, Haojie Ren, Changliang Zou, and Dejing Dou. Automs: automatic model selection for novelty detection with error rate control. In Advances in Neural Information Processing Systems, 2022

  46. [46]

    e-GAI: e-value-based generalized alpha-investing for online false discovery rate control

    Yifan Zhang, Zijian Wei, Haojie Ren, and Changliang Zou. e-GAI: e-value-based generalized alpha-investing for online false discovery rate control. In Forty-second International Conference on Machine Learning, 2025

  47. [47]

    Tijana Zrnic, Aaditya Ramdas, and Michael I. Jordan. Asynchronous online testing of multiple hypotheses. Journal of Machine Learning Research, 22(33):1–39, 2021