Multiple testing
Pith reviewed 2026-06-26 03:16 UTC · model grok-4.3
The pith
This text introduces multiple hypothesis testing by covering error criteria and testing procedures with R package references.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The text provides an introduction to multiple hypothesis testing. It covers various error criteria and testing procedures, and includes references to relevant R packages.
What carries the argument
Multiple testing procedures that control error rates such as family-wise error rate or false discovery rate when many hypotheses are tested at once.
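For concreteness, the two error rates named above are usually written as follows. The notation is an assumption introduced here for illustration and does not come from the page: V is the number of true null hypotheses that are rejected (false rejections) and R is the total number of rejections.

```latex
\mathrm{FWER} = \Pr(V \ge 1),
\qquad
\mathrm{FDR} = \mathbb{E}\!\left[ \frac{V}{\max(R, 1)} \right]
```

A procedure controls the FWER (respectively the FDR) at level \alpha if the corresponding quantity is at most \alpha. Since V / \max(R, 1) never exceeds the indicator of V \ge 1, the FDR is at most the FWER, so FWER control is the stricter requirement.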
If this is right
- Users gain the ability to select appropriate error control when performing many simultaneous tests.
- Practical implementation is supported by the referenced R packages.
- The material supports teaching of multiple testing concepts at an advanced level.
Where Pith is reading between the lines
- The notes could serve as a foundation for researchers entering fields that require high-dimensional testing.
- They highlight the need to match error criteria to the scientific goal of the analysis.
- Similar lecture notes might be adapted for other statistical topics with software examples.
Load-bearing premise
The descriptions of error criteria and testing procedures accurately reflect established methods in the statistical literature.
What would settle it
A demonstration that one of the described procedures fails to control the stated error rate under the conditions given in the text.
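A minimal sketch of what such a check could look like, assuming the simplest possible setting: all null hypotheses are true and the p-values are independent and uniform on [0, 1]. The function name `estimate_fwer` and every parameter value are hypothetical choices made for this illustration and are not taken from the text. A simulation like this can only flag a failure in the setting it simulates; it cannot establish control in general.

```python
import random


def estimate_fwer(threshold, m, n_sim, seed):
    """Monte Carlo estimate of P(at least one false rejection).

    Assumption: all m null hypotheses are true and the p-values are
    independent Uniform(0, 1). A hypothesis is rejected when its
    p-value is at most `threshold`.
    """
    rng = random.Random(seed)
    runs_with_false_rejection = 0
    for _ in range(n_sim):
        pvals = [rng.random() for _ in range(m)]
        # Any rejection is a false rejection because every null is true.
        if min(pvals) <= threshold:
            runs_with_false_rejection += 1
    return runs_with_false_rejection / n_sim


m, alpha, n_sim = 20, 0.05, 20000

# Testing each hypothesis at level alpha, with no correction.
fwer_uncorrected = estimate_fwer(alpha, m, n_sim, seed=1)

# Bonferroni: testing each hypothesis at level alpha / m.
fwer_bonferroni = estimate_fwer(alpha / m, m, n_sim, seed=1)

print(f"uncorrected FWER estimate: {fwer_uncorrected:.3f}")
print(f"Bonferroni FWER estimate:  {fwer_bonferroni:.3f}")
```

In this setting the exact values are 1 - (1 - 0.05)^20 for the uncorrected analysis and 1 - (1 - 0.05/20)^20 for Bonferroni, so the estimates should land near 0.64 and just under 0.05 respectively, up to Monte Carlo error.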
Original abstract
This text provides an introduction to multiple hypothesis testing. It covers various error criteria and testing procedures, and includes references to relevant R packages. An earlier version of this text served as the lecture notes for a PhD-level course on multiple testing.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is an expository introduction to multiple hypothesis testing. It covers error criteria (FWER, FDR and variants), standard procedures (Bonferroni, Holm, Benjamini-Hochberg and related step-up/step-down methods), and points readers to R packages for implementation. The text originated as PhD-level lecture notes.
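To make the three named procedures concrete, here is a minimal pure-Python sketch of their adjusted p-values in the standard textbook form. It is an illustration only, not code from the manuscript or from the R packages it references, and the function names are hypothetical. A hypothesis is rejected at level alpha when its adjusted p-value is at most alpha.

```python
def bonferroni(pvals):
    """Bonferroni adjusted p-values: min(1, m * p)."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def holm(pvals):
    """Holm step-down adjusted p-values (controls the FWER)."""
    m = len(pvals)
    ascending = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(ascending):  # rank 0 is the smallest p-value
        # The multiplier shrinks from m down to 1; the running maximum
        # keeps the adjusted values monotone in the original p-values.
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up adjusted p-values (targets the FDR)."""
    m = len(pvals)
    descending = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(descending):  # k = 0 is the largest p-value
        rank = m - k  # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]
alpha = 0.05
for name, method in [("Bonferroni", bonferroni),
                     ("Holm", holm),
                     ("Benjamini-Hochberg", benjamini_hochberg)]:
    adj = method(pvals)
    rejected = [i for i, p in enumerate(adj) if p <= alpha]
    print(name, [round(p, 4) for p in adj], "rejected indices:", rejected)
```

On this toy input, Bonferroni and Holm each reject two hypotheses, while Benjamini-Hochberg rejects all four, which reflects the usual trade-off: FDR control is a weaker guarantee than FWER control and typically yields more rejections.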
Significance. If the descriptions match the established literature, the manuscript could function as a compact teaching aid for graduate students. Because it advances no new methods, proofs, or empirical results, its contribution to the research literature in statistical methodology is minimal.
Minor comments (2)
- Add a table of contents or explicit section numbering to improve usability as standalone lecture notes.
- Include version numbers or last-update dates for the cited R packages (e.g., multtest, qvalue) so readers can reproduce the examples.
Simulated Author's Rebuttal
We thank the referee for reviewing our manuscript. We agree that it is an expository introduction based on PhD lecture notes, covering established error criteria and procedures along with R package references, without introducing new methods or results.
Point-by-point responses
- Referee: If the descriptions match the established literature, the manuscript could function as a compact teaching aid for graduate students. Because it advances no new methods, proofs, or empirical results, its contribution to the research literature in statistical methodology is minimal.
  Authors: We concur that the manuscript does not advance new methodology, proofs, or empirical findings, as its scope is limited to summarizing standard approaches and directing readers to implementations. This aligns with its origin as lecture notes intended for instructional use rather than original research. We maintain that such consolidated expository resources can still offer pedagogical value for students and practitioners seeking an accessible overview.
  Revision: no
Circularity Check
No circularity: purely expository introduction with no derivations or predictions
The manuscript is an expository introduction to multiple hypothesis testing methods drawn from the established statistical literature. It covers error criteria, procedures, and R packages but contains no derivations, predictions, fitted parameters, or novel claims. The reader's weakest assumption (accurate reflection of standard methods) is external to the paper and does not create internal circularity. No load-bearing steps reduce to self-definition, self-citation chains, or fitted inputs by construction.