pith. machine review for the scientific record.

arxiv: 2502.06719 · v3 · submitted 2025-02-10 · 📊 stat.ML · cs.LG · math.OC · math.PR · math.ST · stat.TH

Recognition: unknown

Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent

Authors on Pith: no claims yet
classification 📊 stat.ML · cs.LG · math.OC · math.PR · math.ST · stat.TH
keywords approximation · bootstrap · descent · gaussian · gradient · multiplier · non-asymptotic · stochastic
original abstract

In this paper, we establish the non-asymptotic validity of the multiplier bootstrap procedure for constructing confidence sets using the Stochastic Gradient Descent (SGD) algorithm. Under appropriate regularity conditions, our approach avoids the need to approximate the limiting covariance of Polyak-Ruppert SGD iterates, which allows us to derive approximation rates in convex distance of order up to $1/\sqrt{n}$. Notably, this rate can be faster than the one attainable via the Polyak-Juditsky central limit theorem. To our knowledge, this provides the first fully non-asymptotic bound on the accuracy of bootstrap approximations for SGD algorithms. Our analysis builds on Gaussian approximation results for nonlinear statistics of independent random variables.
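For intuition, here is a minimal sketch of the kind of multiplier bootstrap the abstract describes: alongside the main SGD run, B perturbed trajectories reuse each stochastic gradient scaled by an i.i.d. mean-one, variance-one multiplier weight, and the spread of their Polyak-Ruppert averages around the main average yields a confidence interval. The least-squares model, Exp(1) weights, step-size exponent alpha = 0.7, and all variable names below are illustrative assumptions, not the paper's exact setup.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n, B = 5, 20_000, 200              # dimension, iterations, bootstrap copies
    theta_star = rng.normal(size=d)       # ground truth for the synthetic stream

    theta = np.zeros(d)                   # main SGD iterate
    boot = np.zeros((B, d))               # B multiplier-perturbed trajectories
    avg = np.zeros(d)                     # Polyak-Ruppert average of theta
    boot_avg = np.zeros((B, d))           # averages of the perturbed runs

    for t in range(1, n + 1):
        x = rng.normal(size=d)            # fresh observation (x, y)
        y = x @ theta_star + rng.normal()
        gamma = 0.5 * t ** -0.7           # step size gamma_0 * t^{-alpha}, alpha = 0.7

        theta -= gamma * x * (x @ theta - y)                 # least-squares gradient step
        w = rng.exponential(1.0, size=B)                     # Exp(1): mean 1, variance 1
        boot -= gamma * (w * (boot @ x - y))[:, None] * x    # same data point, weighted gradient

        avg += (theta - avg) / t          # running Polyak-Ruppert averages
        boot_avg += (boot - boot_avg) / t

    # Bootstrap pivot: quantiles of (boot_avg - avg) stand in for those of (avg - theta_star).
    q_lo, q_hi = np.quantile(boot_avg[:, 0] - avg[0], [0.025, 0.975])
    print("95% CI for first coordinate:", avg[0] - q_hi, avg[0] - q_lo)

No estimate of the limiting covariance matrix appears anywhere in this recipe; the paper's contribution is a fully non-asymptotic guarantee, in convex distance at rate up to $1/\sqrt{n}$, for the quantiles such a procedure produces.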

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. When Does Dynamic Preconditioning Preserve the Polyak-Ruppert CLT? A Stabilization Threshold

    math.ST · 2026-04 · unverdicted · novelty 7.0

    Dynamic preconditioning preserves the Polyak-Ruppert CLT for averaged SGD if the preconditioner stabilizes at rate $\beta > (\alpha + 1)/2$ (see the sketch after this list).

  2. Gaussian Approximation for Asynchronous Q-learning

    stat.ML · 2026-04 · unverdicted · novelty 7.0

    Derived rates of order up to $n^{-1/6} \log^4(nSA)$ for the high-dimensional CLT of averaged asynchronous Q-learning iterates, plus a general martingale-difference CLT.

  3. Refining Covariance Matrix Estimation in Stochastic Gradient Descent Through Bias Reduction

    stat.ML · 2026-04 · unverdicted · novelty 6.0

    A novel bias-reduced online covariance estimator for SGD achieves convergence rate $n^{(\alpha - 1)/2} \sqrt{\log n}$ without second-order derivatives.
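The stabilization threshold in the first citation can be made concrete. Assuming the usual polynomial step size, and with the symbols $\gamma_n$, $P_n$, $\Sigma$ being our notation rather than necessarily that paper's, the claim reads: for preconditioned updates

    $\theta_{n+1} = \theta_n - \gamma_n P_n \, g_n(\theta_n)$, with $\gamma_n = \gamma_0 n^{-\alpha}$, $\alpha \in (1/2, 1)$,

if the preconditioner stabilizes as $\|P_n - P_\infty\| = O(n^{-\beta})$ with $\beta > (\alpha + 1)/2$, then the Polyak-Ruppert CLT $\sqrt{n}\,(\bar\theta_n - \theta^\star) \Rightarrow \mathcal{N}(0, \Sigma)$ is preserved. For example, $\alpha = 0.7$ requires the preconditioner to stabilize faster than $n^{-0.85}$.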