A Uniform Improvement of the Benjamini-Hochberg Procedure via e-Closure

Jelle Goeman

arxiv: 2606.01854 · v2 · pith:ITL5SBHAnew · submitted 2026-06-01 · 📊 stat.ME

Pith reviewed 2026-06-28 13:34 UTC · model grok-4.3

keywords: false discovery rate · multiple testing · closed testing · Benjamini-Hochberg · e-closure · uniform improvement · PRDS

The pith

Closed BH is a uniform improvement on the Benjamini-Hochberg procedure that can reject additional hypotheses while still controlling the false discovery rate.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs closed BH by applying the e-Closure principle to the classic Benjamini-Hochberg method. The resulting procedure controls the false discovery rate under the same positive regression dependency on a subset (PRDS) condition as the original method, and also under a weaker minimal sufficient condition on the test statistics. Because it is a uniform improvement, closed BH never rejects fewer hypotheses than Benjamini-Hochberg but can reject substantially more, with the gain largest when many null hypotheses are false.

Core claim

Closed BH is a uniform improvement of the Benjamini-Hochberg (BH) procedure. It is valid under the positive regression dependency on a subset (PRDS) assumption as well as under an alternative weaker minimal sufficient condition. The procedure is obtained via the e-Closure principle and never rejects fewer hypotheses than BH while it may reject more, especially when the number of false null hypotheses is large.
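For orientation, the baseline that closed BH is compared against is the classic BH step-up rule: sort the p-values, find the largest rank k with p_(k) <= k * alpha / m, and reject the k hypotheses with the smallest p-values. The following is a minimal Python sketch of that baseline only. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions; this is not the closed BH algorithm, which is not reproduced here.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the sorted indices of hypotheses rejected by the BH step-up rule."""
    m = len(pvalues)
    # Order hypothesis indices by increasing p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_star = rank
    # Reject the k_star hypotheses with the smallest p-values.
    return sorted(order[:k_star])


# Illustrative p-values (made up for this sketch).
pvals = [0.001, 0.008, 0.039, 0.041, 0.0415, 0.74]
print(benjamini_hochberg(pvals, alpha=0.05))  # [0, 1, 2, 3, 4]
```

The example shows the step-up character of the rule: ranks 3 and 4 miss their own thresholds, but rank 5 passes, so all five smallest p-values are rejected.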

What carries the argument

The e-Closure principle, which converts an existing multiple-testing method into a closed testing procedure that preserves or strengthens its error control.

If this is right

Closed BH controls the false discovery rate under PRDS, the same condition required by the original Benjamini-Hochberg procedure.
Closed BH also controls the false discovery rate under a weaker minimal sufficient condition on the test statistics.
Closed BH rejects at least as many hypotheses as Benjamini-Hochberg for every data set and can reject additional hypotheses when many nulls are false.
The method is implemented in the eClosure R package.
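The FDR-control statements above can be probed empirically. The sketch below estimates the realized FDR of plain BH by Monte Carlo for independent one-sided z-tests, a setting in which PRDS holds. Everything about the design (m = 20, 10 true nulls, effect size 3, alpha = 0.1, 2000 replications) is an assumption chosen for illustration and is not the paper's simulation setup; the same harness could be pointed at a closed BH implementation if one is available.

```python
import math
import random


def bh_reject(pvalues, alpha):
    """Indices rejected by the BH step-up rule at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_star = rank
    return set(order[:k_star])


def one_sided_pvalue(z):
    """Upper-tail p-value of a standard normal test statistic."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def simulate_fdr(m=20, m0=10, mu=3.0, alpha=0.1, reps=2000, seed=1):
    """Monte Carlo estimate of FDR = E[V / max(R, 1)] for BH with independent z-scores."""
    rng = random.Random(seed)
    total_fdp = 0.0
    for _ in range(reps):
        # Hypotheses 0..m0-1 are true nulls (mean 0); the rest have mean mu.
        z = [rng.gauss(0.0 if i < m0 else mu, 1.0) for i in range(m)]
        rejected = bh_reject([one_sided_pvalue(v) for v in z], alpha)
        false_rejections = sum(1 for i in rejected if i < m0)
        total_fdp += false_rejections / max(len(rejected), 1)
    return total_fdp / reps


print(simulate_fdr())
```

Under independence with continuous p-values, BH is known to have FDR equal to alpha * m0 / m, which is 0.05 in this design, so the estimate should land near that value and below the nominal 0.1.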

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The weaker minimal sufficient condition may allow closed BH to be used in dependence structures where standard Benjamini-Hochberg theory does not apply.
Because the gain occurs mainly when many alternatives are true, closed BH may be especially useful in high-dimensional screening settings with dense signals.
The same e-Closure construction could in principle be applied to other false-discovery-rate procedures to obtain uniform improvements.

Load-bearing premise

The e-Closure principle applied to Benjamini-Hochberg yields a closed testing procedure that controls the false discovery rate under PRDS or the stated weaker condition on the test statistics.
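For readers who want the quantities behind this premise spelled out, the standard textbook definitions are given below. They are not quoted from the paper: the notation (V for false rejections, R for total rejections, I_0 for the index set of true nulls, D for an increasing set) is assumed here, and the paper's weaker minimal sufficient condition is not reproduced because the abstract does not state it.

```latex
% False discovery rate of a procedure with R rejections, V of them false:
\mathrm{FDR} \;=\; \mathbb{E}\!\left[\frac{V}{\max(R,\,1)}\right] \;\le\; \alpha .

% BH step-up rule on ordered p-values p_{(1)} \le \dots \le p_{(m)}:
k^{*} \;=\; \max\Bigl\{\, k \in \{1,\dots,m\} : p_{(k)} \le \tfrac{k\,\alpha}{m} \Bigr\},
\qquad \text{reject } H_{(1)},\dots,H_{(k^{*})} \ (\text{no rejections if the set is empty}).

% PRDS on the set of true nulls I_0: for every increasing set D \subseteq [0,1]^m
% and every i \in I_0,
x \;\longmapsto\; \mathbb{P}\bigl( (p_1,\dots,p_m) \in D \,\bigm|\, p_i = x \bigr)
\quad \text{is nondecreasing in } x .
```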

What would settle it

A data set or simulation in which closed BH, run at the same nominal level, rejects strictly fewer hypotheses than the standard Benjamini-Hochberg procedure would refute the uniform-improvement claim.
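A search for such a data set can be automated. The sketch below is a hypothetical harness: `find_counterexample` takes any function that returns a rejection count and looks for random p-value vectors on which it rejects fewer hypotheses than BH. Closed BH itself is not implemented here, so Bonferroni is used as a stand-in candidate that is known not to dominate BH; all names and the p-value generator are assumptions made for illustration.

```python
import random


def bh_count(pvalues, alpha):
    """Number of rejections made by the BH step-up rule."""
    m = len(pvalues)
    ranked = sorted(pvalues)
    return max((k for k in range(1, m + 1) if ranked[k - 1] <= k * alpha / m), default=0)


def bonferroni_count(pvalues, alpha):
    """Number of rejections made by Bonferroni (stand-in candidate, NOT closed BH)."""
    m = len(pvalues)
    return sum(1 for p in pvalues if p <= alpha / m)


def find_counterexample(candidate_count, alpha=0.05, m=10, trials=5000, seed=0):
    """Return a p-value vector on which the candidate rejects fewer than BH, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Mix of uniform p-values and p-values pushed towards zero (u ** 4).
        pvalues = [rng.random() ** rng.choice([1, 4]) for _ in range(m)]
        if candidate_count(pvalues, alpha) < bh_count(pvalues, alpha):
            return pvalues
    return None


print(find_counterexample(bh_count) is None)          # BH never rejects fewer than itself
print(find_counterexample(bonferroni_count) is None)  # Bonferroni is not a uniform improvement
```

A random search of this kind can only falsify the claim, never prove it; finding no counterexample for a closed BH implementation would be consistent with the theorem but would not replace it.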

Figures

Figures reproduced from arXiv: 2606.01854 by Jelle Goeman.

**Figure 1.** E(eN ) for closed BH (“cBH”) and BH for the simulation settings of Section 8. Here, target is the parameter t.

**Figure 2.** Realized FDR of closed BH (“cBH”) versus MABH and BH for the simulation settings of ...

**Figure 3.** Average power of closed BH (“cBH”) versus MABH and BH for the simulation settings of ...

**Figure 4.** Adjusted p-values with closed BH versus BH for the 6 data sets of Table 1. The open dots ...
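The BH side of the adjusted p-value comparison in Figure 4 follows a standard recipe: the adjusted p-value of the hypothesis at rank i is the minimum over ranks j >= i of m * p_(j) / j, capped at 1. A minimal Python sketch is given below; the function name `bh_adjusted_pvalues` and the example input are assumptions, and the closed BH adjusted p-values come from the paper's own algorithm, which is not reproduced here.

```python
def bh_adjusted_pvalues(pvalues):
    """BH adjusted p-values, returned in the original order of the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted p-values at 1
    # Walk from the largest p-value down, taking a running minimum so that
    # adjusted p-values are monotone in the raw p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative input (made up for this sketch).
print(bh_adjusted_pvalues([0.01, 0.04, 0.03, 0.005]))
```

A hypothesis is rejected by BH at level alpha exactly when its adjusted p-value is at most alpha, which is what makes a scatter plot of two procedures' adjusted p-values a direct comparison of their rejections at every level.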

Original abstract

This paper presents closed BH, a uniform improvement of the False Discovery Rate controlling method of Benjamini and Hochberg (BH). Closed BH is valid under the same assumption of Positive Regression Dependency on a Subset (PRDS) as BH, but also under an alternative and weaker minimal sufficient condition. As a uniform improvement, closed BH never rejects fewer hypotheses than BH, but it may reject quite a few more. An increase in power is observed especially when the number of false null hypotheses is large. The novel method is constructed using the e-Closure principle, a recently derived general principle for multiple testing. The method is implemented in the eClosure package in R.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

Closed BH uses e-Closure to deliver a uniform power gain over standard BH under PRDS or a weaker condition.

The main thing here is that Goeman builds closed BH from the e-Closure principle and shows it rejects at least as many hypotheses as the original Benjamini-Hochberg procedure while controlling FDR under the usual PRDS assumption or a weaker minimal sufficient condition on the test statistics. The gain shows up most when many nulls are false, which matches what one would expect in dense-signal settings.

The construction itself is the clear novelty. It turns the general e-Closure idea into a concrete, implementable procedure that dominates BH pointwise in rejections. The R package is a practical plus and lets people check the behavior directly.

The soft spot is the reliance on e-Closure delivering the closed testing property without extra cost; the abstract states the result but the full paper needs to spell out the steps so readers can see exactly where the weaker condition enters and whether it is easy to use in applications. No obvious circularity or hidden fitting appears in the description.

This is for statisticians who work on multiple testing methods or apply them in high-throughput data. Readers who already know BH and want to see whether a uniform improvement is possible will get value from the argument. The claim is sharp enough and the area important enough that the paper deserves a serious referee rather than a desk reject.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces closed BH, a procedure for controlling the false discovery rate (FDR) that is constructed via the e-Closure principle. It is claimed to be a uniform improvement over the classical Benjamini-Hochberg (BH) step-up procedure: it never rejects fewer hypotheses than BH while sometimes rejecting substantially more, with the gain largest when many nulls are false. Validity is asserted under the standard PRDS condition on the test statistics and also under a weaker minimal sufficient condition; the method is implemented in the eClosure R package.

Significance. If the validity and uniform-improvement claims hold, the result supplies an FDR-controlling procedure that is never less powerful than BH and sometimes strictly more powerful, while remaining computationally simple and requiring no additional tuning parameters. The construction via e-Closure illustrates how a general multiple-testing principle can be specialized to recover and improve a classical method, which may encourage similar applications elsewhere.

major comments (2)

[§3, Theorem 1] The proof that closed BH controls FDR under the weaker minimal sufficient condition appears to rely on the e-Closure principle preserving the closed-testing structure; however, the argument that the resulting p-value thresholds remain valid for the intersection hypotheses is only sketched and requires an explicit verification that the e-values satisfy the necessary monotonicity.
[§4.2, Algorithm 1] The uniform improvement property (never rejecting fewer than BH) is stated to follow directly from the ordering of the e-Closure thresholds, but the proof does not address the case in which some e-values are exactly equal to the critical value; a short additional argument or counter-example check is needed.

minor comments (3)

[Abstract / Introduction] The abstract and introduction cite the e-Closure principle as “recently derived” but do not give the precise reference; adding the citation in the first paragraph would improve traceability.
[Figure 2] Figure 2 caption states that power gains are “especially” visible for large numbers of false nulls, yet the simulation design fixes the proportion of false nulls at 0.2; either the caption or the figure legend should be clarified.
[Notation throughout] The symbol m is used both for the total number of hypotheses and for the number of true nulls in different paragraphs; a consistent symbol (e.g., m_0 for true nulls) would reduce ambiguity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation and the constructive comments, which help strengthen the presentation of the proofs. We address each major comment below and will revise the manuscript to incorporate the requested clarifications.

Referee: [§3, Theorem 1] The proof that closed BH controls FDR under the weaker minimal sufficient condition appears to rely on the e-Closure principle preserving the closed-testing structure; however, the argument that the resulting p-value thresholds remain valid for the intersection hypotheses is only sketched and requires an explicit verification that the e-values satisfy the necessary monotonicity.

Authors: We agree that the argument in the proof of Theorem 1 is presented as a sketch and would benefit from an explicit verification. In the revised manuscript we will expand this part to include a direct check that the e-values constructed via e-Closure satisfy the monotonicity condition required for the intersection hypotheses to remain valid under the minimal sufficient condition.

revision: yes

Referee: [§4.2, Algorithm 1] The uniform improvement property (never rejecting fewer than BH) is stated to follow directly from the ordering of the e-Closure thresholds, but the proof does not address the case in which some e-values are exactly equal to the critical value; a short additional argument or counter-example check is needed.

Authors: We acknowledge that the current proof of the uniform improvement property does not explicitly treat the boundary case of equality between an e-value and the critical threshold. We will add a short supplementary argument showing that, when equality occurs, the rejection set of closed BH remains at least as large as that of BH (by adopting the same tie-breaking convention as the original BH procedure).

revision: yes
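The boundary case discussed in this exchange is easy to make concrete for BH itself, whose usual convention is the non-strict comparison p_(k) <= k * alpha / m. The sketch below uses exact rational arithmetic so that equality with the threshold is not blurred by floating-point rounding. The `inclusive` switch and the two-hypothesis example are illustrative assumptions; the sketch concerns p-values in plain BH and says nothing about how the paper handles ties among e-values.

```python
from fractions import Fraction


def bh_count(pvalues, alpha, inclusive=True):
    """BH rejection count; inclusive=True uses p_(k) <= k*alpha/m, False uses <."""
    m = len(pvalues)
    ranked = sorted(pvalues)
    k_star = 0
    for k in range(1, m + 1):
        threshold = k * alpha / m
        passes = ranked[k - 1] <= threshold if inclusive else ranked[k - 1] < threshold
        if passes:
            k_star = k
    return k_star


alpha = Fraction(1, 20)                       # alpha = 0.05
pvalues = [Fraction(1, 40), Fraction(9, 10)]  # smallest p-value sits exactly on alpha/m

print(bh_count(pvalues, alpha, inclusive=True))   # 1
print(bh_count(pvalues, alpha, inclusive=False))  # 0
```

The two conventions disagree only on data that hit a threshold exactly, which is why a dominance proof needs both procedures to resolve ties the same way.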

Circularity Check

0 steps flagged

No significant circularity identified

The paper constructs closed BH via the e-Closure principle, presented as an external recently derived general principle. The claims of uniform improvement over BH, validity under PRDS or a weaker minimal sufficient condition, and power gains do not reduce by construction to fitted parameters, self-definitions, or unverified self-citations. The derivation chain is self-contained against the stated external principle and assumptions, with no quoted equations or steps exhibiting equivalence to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the e-Closure principle and the PRDS/weaker condition; no free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption: The e-Closure principle applies to construct a valid closed testing version of BH.
The novel method is constructed using the e-Closure principle.
domain assumption: Test statistics satisfy PRDS or the alternative weaker minimal sufficient condition.
Validity of FDR control is stated under these conditions.

Reference graph

Works this paper leans on

38 extracted references · 2 linked inside Pith

[1] Benjamini, Yoav and Hochberg, Yosef. Journal of the Royal Statistical Society: Series B (Methodological), 1995.
[2] Benjamini, Yoav and Hochberg, Yosef. Journal of Educational and Behavioral Statistics, 2000.
[3] The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, 2001.
[4] Storey, John D. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2002.
[5] Storey, John D. and Taylor, Jonathan E. and Siegmund, David. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2004.
[6] Benjamini, Yoav and Krieger, Abba M. and Yekutieli, Daniel. Biometrika, 2006.
[7] Blanchard, Gilles and Roquain, Etienne. Electronic Journal of Statistics, 2008.
[8] Blanchard, Gilles and Roquain, Etienne. Journal of Machine Learning Research.
[9] Sarkar, Sanat K. Journal of Statistical Planning and Inference, 2008.
[10] Gavrilov, Yulia and Benjamini, Yoav and Sarkar, Sanat K. The Annals of Statistics, 2009.
[11] Finner, Helmut and Dickhaus, Thorsten and Roters, Markus. The Annals of Statistics, 2009.
[12] Heesen, Philipp and Janssen, Arnold. Journal of Statistical Planning and Inference, 2016.
[13] MacDonald, Peter and Liang, Kun and Janssen, Arnold. Electronic Journal of Statistics, 2019.
[14] Solari, Aldo and Goeman, Jelle J. Biometrical Journal, 2017.
[15] Gao, Zijun. arXiv preprint arXiv:2310.06357.
[16] Gao, Zijun and Roquain, Etienne. arXiv preprint arXiv:2603.17984.
[17] Ignatiadis, Nikolaos and Wang, Ruodu and Ramdas, Aaditya. Biometrika, 2024.
[18] Ignatiadis, Nikolaos and Wang, Ruodu and Ramdas, Aaditya. arXiv preprint arXiv:2603.21424.
[19] Liang, Kun and Nettleton, Dan. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2012.
[20] Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517.
[21] False discovery rate control with e-values. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2022.
[22] The FDR-linking theorem. arXiv preprint arXiv:1812.08965.
[23] An improved Bonferroni procedure for multiple tests of significance. Biometrika, 1986.
[24] A unified treatment of multiple testing with prior knowledge using the p-filter. The Annals of Statistics, 2019.
[25] A sharper Bonferroni procedure for multiple tests of significance. Biometrika, 1988.
[26] Holm, Sture. Scandinavian Journal of Statistics.
[27] Multiple hypothesis testing in genomics. Statistics in Medicine, 2014.
[28] Computer age statistical inference, student edition: algorithms, evidence, and data science. 2021.
[29] Conditional calibration for false discovery rate control under dependence. The Annals of Statistics, 2022.
[30] On the error control of invariant causal prediction. arXiv preprint arXiv:2401.03834.
[31] On the false discovery rate and expected type I errors. Biometrical Journal, 2001.
[32] Filtering the rejection set while preserving false discovery rate control. Journal of the American Statistical Association, 2023.
[33] Inflated false discovery rate due to volcano plots: problem and solutions. Briefings in Bioinformatics, 2021.
[34] On closed testing procedures with special reference to ordered analysis of variance. Biometrika, 1976.
[35] Vovk, Vladimir and Wang, Ruodu. The Annals of Statistics.
[36] Gr\". Safe testing. 2024.
[37] Ramdas, Aaditya and Wang, Ruodu. Foundations and Trends in Statistics.
[38] On methods controlling the false discovery rate. Sankhy. 2008.
