arxiv: 2604.17239 · v1 · submitted 2026-04-19 · 🧮 math.ST · econ.EM · stat.TH

Bootstrap consistency for general double/debiased machine learning estimators

Pith reviewed 2026-05-10 06:09 UTC · model grok-4.3

classification 🧮 math.ST · econ.EM · stat.TH
keywords bootstrap consistency · double machine learning · debiased machine learning · Neyman orthogonality · cross-fitting · asymptotic normality · resampling

The pith

Bootstrap methods are valid for double/debiased machine learning estimators under exactly the conditions already required for their asymptotic normality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that resampling schemes such as Efron's bootstrap produce a distribution that matches the sampling distribution of a DML estimator. This holds without any extra assumptions beyond those already needed for the estimator itself to be asymptotically normal. The result matters because bootstrap is often used in practice for DML inference, yet bootstrap can fail for other root-n consistent estimators and had lacked general justification in the DML setting. A reader who accepts the claim gains a theoretical basis to use bootstrap standard errors and intervals directly from the same orthogonal scores and cross-fitting already in place.
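
To make "resampling schemes such as Efron's bootstrap" concrete, the sketch below draws two standard exchangeable weight vectors: multinomial counts, which correspond to Efron's bootstrap, and rescaled exponential weights, which correspond to the Bayesian bootstrap. The function names and the toy weighted-mean statistic are illustrative assumptions and are not taken from the paper.

```python
import random
import statistics

def efron_weights(n, rng):
    """Multinomial(n; 1/n, ..., 1/n) counts: how often each index is resampled."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts

def bayesian_bootstrap_weights(n, rng):
    """n times a Dirichlet(1, ..., 1) vector, built from i.i.d. exponentials."""
    e = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(e)
    return [n * v / total for v in e]

def weighted_mean(values, weights):
    """Toy statistic (an assumption for illustration): a weighted sample mean."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

rng = random.Random(0)
data = [rng.gauss(1.0, 2.0) for _ in range(200)]
n = len(data)

# Recompute the statistic under fresh weights, holding the data fixed.
efron_draws = [weighted_mean(data, efron_weights(n, rng)) for _ in range(500)]
bayes_draws = [weighted_mean(data, bayesian_bootstrap_weights(n, rng)) for _ in range(500)]

efron_sd = statistics.stdev(efron_draws)
bayes_sd = statistics.stdev(bayes_draws)
plug_in_se = statistics.stdev(data) / n ** 0.5
print(efron_sd, bayes_sd, plug_in_se)
```

Both weight vectors sum to n and are exchangeable across observations, so in this toy case the two spreads should sit close to the usual standard error of the mean.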

Core claim

Under exactly the same conditions required for the validity of DML itself, the bootstrap law converges conditionally weakly to the sampling law of the original estimator, and this holds for general exchangeably weighted resampling schemes with Efron's bootstrap as a special case.
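
One common way to write "converges conditionally weakly" is through the bounded-Lipschitz metric. The display below is a sketch in assumed notation, not the paper's own: $\hat\theta_n$ is the DML estimator, $\hat\theta_n^{*}$ its bootstrap version, $\theta_0$ the target, $W_1,\dots,W_n$ the data, and $\mathrm{BL}_1$ the set of functions bounded by 1 with Lipschitz constant at most 1. Whether the paper states convergence in probability or almost surely is not specified in this summary.

```latex
\sup_{h \in \mathrm{BL}_1}
\Big|\,
  \mathbb{E}^{*}\big[\, h\big(\sqrt{n}\,(\hat\theta_n^{*} - \hat\theta_n)\big) \,\big|\, W_1,\dots,W_n \big]
  \;-\;
  \mathbb{E}\big[\, h\big(\sqrt{n}\,(\hat\theta_n - \theta_0)\big) \big]
\,\Big|
\;\xrightarrow{\;P\;}\; 0 .
```

For general exchangeable weights, results of this type typically include a rescaling constant determined by the variance of the weights; that constant equals one for Efron's multinomial weights.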

What carries the argument

Neyman-orthogonal scores with cross-fitting, which remove the need for Donsker-type conditions and allow the bootstrap to track the estimator's limiting distribution.
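
A minimal sketch of cross-fitting with an orthogonal score, assuming a partially linear model $Y = \theta D + g(X) + \varepsilon$, $D = m(X) + v$. The least-squares line stands in for an arbitrary machine-learning nuisance learner, and all function names and the simulated design are hypothetical choices made for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares line; a stand-in for any ML nuisance learner."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def cross_fit_residuals(x, d, y, n_folds=2):
    """Residualize D and Y on X, predicting each fold from the other folds."""
    n = len(x)
    d_res, y_res = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        m_hat = fit_line([x[i] for i in train], [d[i] for i in train])  # E[D | X]
        l_hat = fit_line([x[i] for i in train], [y[i] for i in train])  # E[Y | X]
        for i in test:
            d_res[i] = d[i] - m_hat(x[i])
            y_res[i] = y[i] - l_hat(x[i])
    return d_res, y_res

def dml_theta(d_res, y_res):
    """Root of the orthogonal score sum((y_res - theta * d_res) * d_res) = 0."""
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)

def simulate_plm(n, theta, rng):
    """Hypothetical design with linear nuisance functions."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 1.5 * xi + rng.gauss(0, 1) for xi, di in zip(x, d)]
    return x, d, y

rng = random.Random(0)
x, d, y = simulate_plm(2000, theta=0.5, rng=rng)
d_res, y_res = cross_fit_residuals(x, d, y, n_folds=2)
theta_hat = dml_theta(d_res, y_res)
print(theta_hat)
```

The point of the fold loop is that no observation is scored with a nuisance fit that used that same observation, which is what lets the analysis avoid Donsker-type complexity restrictions on the learner.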

Load-bearing premise

The DML estimator must satisfy the Neyman-orthogonality and cross-fitting rate conditions that already make it asymptotically normal.

What would settle it

A data-generating process in which the DML estimator is asymptotically normal yet the conditional distribution of the bootstrap version fails to converge to the same limit.
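
A simulation cannot settle the question, but it can probe a candidate data-generating process. The sketch below compares the Monte Carlo spread of a cross-fitted estimator with the spread of its Efron-bootstrap replicates on a single sample. Everything here is an assumption made for illustration: the partially linear design, the linear nuisance learner, and the choice to recompute the whole estimator (including nuisance fits) on each resample, which may differ from the paper's procedure.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares line; a stand-in for any ML nuisance learner."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def dml_plm(sample, n_folds=2):
    """Cross-fitted partialling-out estimate of theta from (x, d, y) triples."""
    num = den = 0.0
    for k in range(n_folds):
        train = [obs for i, obs in enumerate(sample) if i % n_folds != k]
        test = [obs for i, obs in enumerate(sample) if i % n_folds == k]
        m_hat = fit_line([o[0] for o in train], [o[1] for o in train])
        l_hat = fit_line([o[0] for o in train], [o[2] for o in train])
        for x, d, y in test:
            v = d - m_hat(x)
            num += v * (y - l_hat(x))
            den += v * v
    return num / den

def simulate(n, theta, rng):
    """Hypothetical design: Y = theta*D + 1.5*X + noise, D = 0.8*X + noise."""
    sample = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        d = 0.8 * x + rng.gauss(0, 1)
        y = theta * d + 1.5 * x + rng.gauss(0, 1)
        sample.append((x, d, y))
    return sample

def sampling_sd(n, theta, reps, rng):
    """Spread of the estimator across independent samples."""
    return statistics.stdev([dml_plm(simulate(n, theta, rng)) for _ in range(reps)])

def bootstrap_sd(sample, draws, rng):
    """Spread of the estimator across Efron resamples of one fixed sample."""
    n = len(sample)
    stats = []
    for _ in range(draws):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        stats.append(dml_plm(resample))
    return statistics.stdev(stats)

rng = random.Random(0)
n, theta = 200, 0.5
sd_sampling = sampling_sd(n, theta, reps=300, rng=rng)
sd_boot = bootstrap_sd(simulate(n, theta, rng), draws=300, rng=rng)
ratio = sd_boot / sd_sampling
print(sd_sampling, sd_boot, ratio)
```

A ratio that stays far from one as n grows, in a design where the estimator itself is well behaved, would be the kind of evidence worth investigating further. A ratio near one in a single design does not confirm the theorem.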

Original abstract

Double/debiased machine learning (DML) provides a general framework for inference with high-dimensional or otherwise complex nuisance parameters by combining Neyman-orthogonal scores with cross-fitting, thereby circumventing classical Donsker-type conditions in many modern machine-learning settings. Despite its strong empirical performance, bootstrap inference for DML estimators has received little theoretical justification. This is particularly noteworthy since bootstrap methods are suggested ad used for inference on DML estimators, even though bootstrap procedures can fail for estimators that are root-$n$ consistent and asymptotically normal. This paper fills this gap by establishing bootstrap validity for DML estimators under general exchangeably weighted resampling schemes, with Efron's bootstrap as a special case. Under exactly the same conditions required for the validity of DML itself, we prove that the bootstrap law converges conditionally weakly to the sampling law of the original estimator.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The paper establishes bootstrap consistency for general double/debiased machine learning (DML) estimators. It proves that, for exchangeably weighted resampling schemes (with Efron's bootstrap as a special case), the bootstrap law converges conditionally weakly to the sampling distribution of the DML estimator, under precisely the same Neyman-orthogonality, cross-fitting, and rate conditions already required for the asymptotic normality of the DML estimator itself.

Significance. If the result holds, this supplies the missing theoretical justification for bootstrap inference with DML estimators, which are widely used in practice for high-dimensional and complex nuisance settings. The paper merits credit for deriving the result as a direct extension without introducing extra assumptions beyond those for DML validity, and for covering general exchangeably weighted schemes rather than a single bootstrap variant.

minor comments (1)
  1. [Abstract] The phrase 'suggested ad used' is a typographical error and should read 'suggested and used'.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and accurate summary of our manuscript on bootstrap consistency for general DML estimators. We appreciate the recommendation for minor revision and the recognition that the result holds under precisely the conditions already required for DML validity itself.

Circularity Check

0 steps flagged

No significant circularity

The paper presents a direct mathematical proof that the bootstrap law converges conditionally weakly to the sampling law of the DML estimator, under precisely the same Neyman-orthogonality, cross-fitting, and rate conditions already required for DML asymptotic normality. No load-bearing step reduces the target bootstrap consistency result to a fitted parameter, a self-citation chain, or an ansatz smuggled from prior work by the same authors. The central claim is an independent theorem establishing validity for general exchangeably weighted resampling schemes, with no evidence that any equation or prediction is equivalent to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard DML assumptions (Neyman orthogonality, cross-fitting, nuisance rate conditions) plus standard weak-convergence arguments for exchangeable bootstrap weights; no new free parameters or invented entities are introduced.

axioms (2)
  • domain assumption: Neyman orthogonality of the score function
    Invoked to ensure the estimator remains root-n consistent even when nuisance parameters are estimated at slower rates.
  • domain assumption: Cross-fitting to break dependence between nuisance estimation and score evaluation
    Required for the asymptotic linearity that the bootstrap proof builds upon.
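
For reference, the two assumptions can be written out in the standard DML formulation. The notation is assumed here rather than taken from the paper: $\psi$ is the score, $W$ an observation, $\theta_0$ the target, $\eta_0$ the true nuisance, $I_1,\dots,I_K$ the folds, and $\hat\eta_{-k}$ the nuisance estimate fitted without fold $I_k$. The pooled estimating equation shown last is one common variant.

```latex
\begin{align*}
  % Moment condition identifying the target parameter
  \mathbb{E}\big[\psi(W;\theta_0,\eta_0)\big] &= 0, \\
  % Neyman orthogonality: first-order insensitivity to the nuisance
  \frac{\partial}{\partial r}\,
  \mathbb{E}\big[\psi\big(W;\theta_0,\eta_0 + r(\eta-\eta_0)\big)\big]\Big|_{r=0}
  &= 0 \quad \text{for all admissible } \eta, \\
  % Cross-fitted estimating equation defining \hat\theta
  \frac{1}{n}\sum_{k=1}^{K}\sum_{i \in I_k}
  \psi\big(W_i;\hat\theta,\hat\eta_{-k}\big) &= 0 .
\end{align*}
```

In the partially linear model, a score satisfying the orthogonality condition is $\psi = \big(Y - \ell(X) - \theta\,(D - m(X))\big)\,(D - m(X))$ with $\ell(X) = \mathbb{E}[Y \mid X]$ and $m(X) = \mathbb{E}[D \mid X]$.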

pith-pipeline@v0.9.0 · 5436 in / 1237 out tokens · 31637 ms · 2026-05-10T06:09:05.336231+00:00 · methodology
