High Dimensional Gaussian and Bootstrap Approximations in Generalized Linear Models
Pith reviewed 2026-05-16 13:56 UTC · model grok-4.3
The pith
Bootstrap approximations remain valid for the Lasso-penalized GLM estimator over convex sets and Euclidean balls when dimension grows exponentially with sample size under suitable sparsity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Gaussian approximation to the high-dimensional GLM estimator fails when d grows exponentially with n. The bootstrap approximations over Borel convex sets and Euclidean balls nevertheless remain valid for the relevant part of the Lasso-penalized estimator, provided log d = o(n^{2τ/3}), the number of nonzero parameters is o(n^{1/3-4τ/3}), and the Lasso penalty satisfies λ_n ∼ n^{1/2+τ} for some τ ∈ (0, 1/4).
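To make the trade-off in τ concrete, the following minimal sketch evaluates the three n-dependent scales that appear in the conditions. The function name `regime_two_scales` and its dictionary keys are illustrative choices, not notation from the paper. Because the conditions are little-o statements, these numbers only indicate the scales against which log d and the sparsity level are compared; they are not thresholds that a finite sample can pass or fail.

```python
def regime_two_scales(n: int, tau: float) -> dict:
    """Evaluate the n-dependent comparison scales in the Regime II conditions.

    Hypothetical helper: the keys below are illustrative names only.
    """
    if not 0.0 < tau < 0.25:
        raise ValueError("tau must lie in the open interval (0, 1/4)")
    return {
        # log d must be of smaller order than n^(2*tau/3)
        "log_d_scale": n ** (2.0 * tau / 3.0),
        # the number of nonzero coefficients must be of smaller order than n^(1/3 - 4*tau/3)
        "sparsity_scale": n ** (1.0 / 3.0 - 4.0 * tau / 3.0),
        # the Lasso penalty lambda_n is of the order n^(1/2 + tau)
        "lambda_scale": n ** (0.5 + tau),
    }


# Example: a larger tau allows faster growth of log d but demands more sparsity.
small_tau = regime_two_scales(n=10_000, tau=0.05)
large_tau = regime_two_scales(n=10_000, tau=0.20)
```

Comparing `small_tau` with `large_tau` shows the tension built into the conditions: raising τ enlarges the admissible scale for log d while shrinking the admissible sparsity scale.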
What carries the argument
Lasso-penalized GLM estimator together with residual or data bootstrap procedures that approximate its distribution uniformly over collections of convex sets and Euclidean balls.
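As a rough illustration of the two ingredients, the sketch below fits an L1-penalized logistic regression by proximal gradient descent (ISTA) and then computes a bootstrap quantile of a Euclidean-ball statistic by resampling (x_i, y_i) pairs. Several things here are assumptions and not taken from the paper: the logistic model, the unnormalized (summed) loss, the plain pairs bootstrap, the statistic sqrt(n)·‖β* − β̂‖₂, and all function names. In particular, this is a generic stand-in and not the authors' bootstrap procedure, and no validity is claimed for it.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal map of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_logistic(X, y, lam, n_iter=500):
    """ISTA for sum_i [log(1 + exp(x_i'b)) - y_i * x_i'b] + lam * ||b||_1."""
    step = 4.0 / np.linalg.norm(X, 2) ** 2  # 1/L with L = ||X||_2^2 / 4
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)  # guard exp against overflow
        mu = 1.0 / (1.0 + np.exp(-eta))       # fitted means under the logit link
        grad = X.T @ (mu - y)                 # gradient of the unpenalized loss
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

def pairs_bootstrap_radius(X, y, lam, level=0.9, n_boot=50, seed=0):
    """Bootstrap quantile of sqrt(n) * ||beta_star - beta_hat||_2 (ball statistic)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    beta_hat = lasso_logistic(X, y, lam)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample (x_i, y_i) pairs with replacement
        beta_star = lasso_logistic(X[idx], y[idx], lam)
        stats[b] = np.sqrt(n) * np.linalg.norm(beta_star - beta_hat)
    return beta_hat, float(np.quantile(stats, level))

# Hypothetical toy data: n = 200, d = 20, three nonzero coefficients.
rng = np.random.default_rng(1)
n, d, tau = 200, 20, 0.1
X = rng.standard_normal((n, d))
beta_true = np.zeros(d)
beta_true[:3] = 1.5
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)
lam = n ** (0.5 + tau)                        # penalty on the n^(1/2 + tau) scale
beta_hat, radius = pairs_bootstrap_radius(X, y, lam)
```

The returned `radius` defines a ball around `beta_hat` in the sqrt(n) scaling. Whether such a ball has the nominal coverage is exactly the kind of question the paper's theory addresses for its own procedures.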
If this is right
- Reliable confidence regions can be constructed for GLM parameters in high-dimensional medical and survey data without assuming normality.
- The same bootstrap procedures apply uniformly to both moderate and ultra-high-dimensional regimes under the respective conditions.
- Inference remains valid for the sparse part of the estimator even when the full Gaussian limit does not exist.
- Simulations reported by the authors show strong finite-sample performance of the bootstrap methods in both regimes.
Where Pith is reading between the lines
- The methods could be tested on other link functions or penalties beyond the Lasso to check robustness.
- If the design concentration conditions weaken, one might need stronger resampling variants to restore coverage.
- These approximations open the door to valid simultaneous inference on many GLM coefficients when classical central-limit results are unavailable.
Load-bearing premise
The link function is sufficiently smooth and the design matrix satisfies moment and concentration conditions that keep remainder terms small at the stated rates.
What would settle it
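A simulation or real dataset in which the growth and sparsity conditions plausibly hold, yet bootstrap coverage for convex sets or balls falls well below the nominal level, would count against the validity claim. A coverage failure when log d exceeds the allowed growth rate or the sparsity bound is violated would not refute it, but would indicate that the conditions are close to necessary.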
Original abstract
Generalized Linear Model (or GLM) extends the ordinary linear regression by linking the mean of the response variable to covariates through appropriate link functions. GLM is widely used in the analysis of datasets arising from diverse fields including medical sciences, clinical trials, population surveys and risk analysis. In this paper, we investigate the Gaussian and Bootstrap approximations of GLM under two separate high dimensional regimes: (I) when the dimension $d$ grows slower than $n$ and (II) when $d$ grows exponentially with $n$. Under regime (I), we essentially show that the Gaussian approximation holds over the collection of Borel convex sets when $d = o\big(n^{2/5}\big)$ and over the collection of Euclidean balls when $d = o\big(n^{1/2}\big)$. We further devise two high dimensional Bootstrap methods which are valid over the collections of Borel convex sets and Euclidean balls under the same dimension growth rates. Then we move to regime (II) where we invoke sparsity to GLM through Lasso. We show that the high dimensional Gaussian approximation fails under regime (II). However, the Bootstrap approximations over convex sets and Euclidean balls are valid for the relevant part of the GLM estimator provided $\log d = o\big(n^{2\tau/3}\big)$ and the number of non-zero regression parameters is $o\big(n^{1/3- 4\tau/3}\big)$, when the Lasso penalty $\lambda_n \sim n^{1/2 + \tau}$, for some $\tau \in (0, 1/4)$. Simulation studies confirm the strong finite-sample performance of our proposed Bootstrap methods under both regime (I) and (II). We also implement our methods on real datasets.
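One way to read "approximation over a collection of sets" is as uniform closeness of probabilities over that collection. The display below is a sketch under assumed notation: $T_n$ for the centered and scaled estimator, $Z$ for a centered Gaussian vector with matching covariance, $T_n^*$ for the bootstrap analogue with conditional law $P_*$, $\mathcal{C}$ for the Borel convex subsets of $\mathbb{R}^d$, and $\mathcal{B}$ for the Euclidean balls. None of these symbols, nor the exact normalization, is taken from the paper.

```latex
% Gaussian approximation over a class \mathcal{A} \in \{\mathcal{C}, \mathcal{B}\}
\rho_n(\mathcal{A}) \;=\; \sup_{A \in \mathcal{A}}
  \bigl| P(T_n \in A) - P(Z \in A) \bigr| \;\longrightarrow\; 0 ,

% Bootstrap approximation over the same class
\rho_n^{*}(\mathcal{A}) \;=\; \sup_{A \in \mathcal{A}}
  \bigl| P_{*}(T_n^{*} \in A) - P(T_n \in A) \bigr|
  \;\longrightarrow\; 0 \quad \text{in probability},

% Balls are convex, so the ball criterion is the weaker one
\mathcal{B} \subset \mathcal{C}
  \;\Longrightarrow\; \rho_n(\mathcal{B}) \le \rho_n(\mathcal{C}) .
```

The last line is consistent with the abstract's rates: the smaller class of balls tolerates the faster growth $d = o(n^{1/2})$, while the full class of convex sets requires $d = o(n^{2/5})$.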
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies Gaussian and bootstrap approximations to the MLE (and its Lasso-regularized version) in generalized linear models. In the moderate high-dimensional regime (d = o(n)), it proves that the Gaussian approximation to the centered and scaled estimator is valid uniformly over Borel convex sets when d = o(n^{2/5}) and over Euclidean balls when d = o(n^{1/2}); corresponding bootstrap procedures are shown to be valid at the same rates. In the ultra-high-dimensional regime, the plain Gaussian approximation fails, but bootstrap approximations remain valid over the same classes of sets for the relevant (de-biased) component of the Lasso estimator, provided log d = o(n^{2τ/3}) and the sparsity level s = o(n^{1/3-4τ/3}) when the penalty satisfies λ_n ∼ n^{1/2+τ} for τ ∈ (0,1/4). The results are illustrated by simulations and real-data examples.
Significance. If the stated rates and bootstrap validity hold, the work supplies theoretically justified inference procedures for GLMs precisely where the Gaussian approximation breaks down because of the Lasso bias term or dimensionality. The explicit separation between the two regimes and the demonstration that bootstrap can capture the penalty-induced bias under the given sparsity conditions are the main contributions; they directly address a practical need in high-dimensional medical and risk-analysis applications.
major comments (2)
- [§4] §4 (Regime II): the statement that the Gaussian approximation fails while the bootstrap succeeds is central, yet the precise mechanism by which the bootstrap absorbs the λ_n-induced bias term is not quantified in the abstract or the high-level description; an explicit comparison of the remainder terms (e.g., the order of the bias that the bootstrap captures versus the Gaussian remainder) is needed to substantiate the claim.
- [Assumptions preceding Theorem 3.1] Assumption set for Regime I (link-function smoothness and design concentration): the rates d = o(n^{2/5}) and d = o(n^{1/2}) are derived under C^3 smoothness of the link and uniform restricted-eigenvalue-type conditions; these assumptions must be stated with explicit constants or moment bounds, because they directly determine whether the claimed dimension thresholds are attainable for common link functions such as logit or log.
minor comments (2)
- [Simulation studies] The simulation section should report the exact values of n, d, s, and the chosen link functions together with the empirical coverage probabilities for both convex-set and ball approximations, so that the finite-sample behavior can be compared directly with the theoretical thresholds.
- [Notation] Notation for the two regimes should be unified; the symbol τ appears only in Regime II while the moderate-dimensional rates are written with explicit powers of n, which makes cross-referencing slightly cumbersome.
Simulated Author's Rebuttal
We thank the referee for the careful reading, positive recommendation, and constructive comments on our manuscript. We address each major comment below and have revised the paper accordingly to improve clarity and substantiation.
Point-by-point responses
- Referee: [§4] §4 (Regime II): the statement that the Gaussian approximation fails while the bootstrap succeeds is central, yet the precise mechanism by which the bootstrap absorbs the λ_n-induced bias term is not quantified in the abstract or the high-level description; an explicit comparison of the remainder terms (e.g., the order of the bias that the bootstrap captures versus the Gaussian remainder) is needed to substantiate the claim.
  Authors: We agree that an explicit comparison strengthens the central claim. In the revised manuscript we have added a new paragraph in Section 4.3 that quantifies the orders: the Gaussian approximation to the de-biased Lasso estimator carries an extra bias remainder of order λ_n √s (which fails to vanish under the ultra-high-dimensional regime), whereas the bootstrap distribution, by resampling the penalized objective, automatically incorporates this term and yields a valid approximation whose remainder is o_p(1) under the stated conditions log d = o(n^{2τ/3}) and s = o(n^{1/3-4τ/3}). We also updated the abstract and the high-level summary in the introduction to reference this order comparison.
  Revision: yes
- Referee: [Assumptions preceding Theorem 3.1] Assumption set for Regime I (link-function smoothness and design concentration): the rates d = o(n^{2/5}) and d = o(n^{1/2}) are derived under C^3 smoothness of the link and uniform restricted-eigenvalue-type conditions; these assumptions must be stated with explicit constants or moment bounds, because they directly determine whether the claimed dimension thresholds are attainable for common link functions such as logit or log.
  Authors: We thank the referee for this observation. The original assumptions already impose C^3 smoothness with bounded third derivative and uniform restricted-eigenvalue conditions on the design, but we have revised the statement preceding Theorem 3.1 to include explicit constants: the third derivative of the link is bounded by M (with M = 1/4 for the logit link and M = 1 for the log link) and the design satisfies E[|X_{ij}|^4] ≤ C_0 together with a uniform lower bound on the restricted eigenvalues. These explicit bounds confirm that the stated dimension thresholds are attainable for the logit and log links under standard sub-Gaussian designs.
  Revision: yes
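The constant quoted for the logit case can be sanity-checked numerically. The sketch below assumes that "the link" refers to the logistic mean function σ(t) = 1/(1 + e^{-t}), which is one possible reading and is not specified in the rebuttal. It evaluates the first three derivatives of σ on a grid using closed forms; the function name and grid are illustrative choices.

```python
import numpy as np

def logistic_derivative_bounds(lo=-20.0, hi=20.0, num=400_001):
    """Largest absolute values of sigma', sigma'', sigma''' on a grid.

    Assumes sigma(t) = 1 / (1 + exp(-t)); uses the closed forms
    sigma' = v, sigma'' = v * (1 - 2 * sigma), sigma''' = v * (1 - 6 * v),
    where v = sigma * (1 - sigma).
    """
    t = np.linspace(lo, hi, num)
    s = 1.0 / (1.0 + np.exp(-t))
    v = s * (1.0 - s)
    return {
        "first": float(np.max(np.abs(v))),
        "second": float(np.max(np.abs(v * (1.0 - 2.0 * s)))),
        "third": float(np.max(np.abs(v * (1.0 - 6.0 * v)))),
    }

bounds = logistic_derivative_bounds()
```

Under this reading, all three maxima stay at or below 1/4, so M = 1/4 is a valid though not tight bound for the third derivative, whose maximum absolute value is 1/8 at t = 0. The sketch does not cover the log link: the corresponding mean function e^t has unbounded derivatives on the whole real line, so a bound such as M = 1 would need a restriction on the range of the linear predictor.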
Circularity Check
No significant circularity; derivations rely on external concentration and bootstrap theory
Full rationale
The paper establishes Gaussian and bootstrap approximation rates for GLMs in two regimes using standard linearization of the score function, moment bounds, and uniform restricted eigenvalue conditions on the design. The stated dimension and sparsity thresholds (d = o(n^{2/5}), log d = o(n^{2τ/3}), s = o(n^{1/3-4τ/3})) follow directly from controlling remainder terms under C^3 link smoothness and sub-exponential tails, without any self-definitional closure, fitted-parameter renaming, or load-bearing self-citation. The failure of plain Gaussian approximation under Lasso is shown by explicit bias terms that the bootstrap captures, all derived from the same external inequalities rather than from quantities defined inside the paper.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The GLM link function is sufficiently smooth, and the covariates satisfy concentration inequalities that support the high-dimensional remainder bounds.
Paper authors
Mayukh Choudhury and Debraj Das, Department of Mathematics, Indian Institute of Technology Bombay, Mumbai 400076, India. Email addresses: 214090002@iitb.ac.in, debrajdas@math.iitb.ac.in