Error Bounds for Importance Sampling with Estimated Proposal Distributions
Pith reviewed 2026-05-20 04:12 UTC · model grok-4.3
The pith
Error bounds for importance sampling with estimated proposals separate the Monte Carlo error from the proposal approximation error.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We address this gap by deriving non-asymptotic error bounds for standard, defensive, and self-normalized importance sampling estimators with random proposals. Our results separate the Monte Carlo error, scaling as n^{-1/2}, from the proposal approximation error measured through the mean integrated absolute and squared errors (MIAE and MISE) of the kernel density estimate. To obtain explicit convergence rates in (N,n), we establish MIAE and MISE bounds for KDEs constructed from geometrically ergodic Markov chains in stationary and non-stationary regimes. Combining these results yields quantitative guarantees for importance sampling with KDE-based proposals.
What carries the argument
The argument rests on non-asymptotic error bounds that decompose the total error into a Monte Carlo component of order n^{-1/2} and a proposal approximation component, given by the mean integrated absolute or squared error (MIAE or MISE) of the kernel density estimate built from the auxiliary Markov chain samples.
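For orientation, the objects named above can be written out under standard textbook conventions. The notation is an assumption made here and may differ from the paper's: $p$ is the target density, $f$ the integrand, $\hat q$ the KDE proposal, $q_0$ a fixed heavy-tailed safeguard density, $\alpha$ the defensive weight, and $\pi$ the density that $\hat q$ estimates. The paper's exact bounds and constants are not reproduced.

```latex
% Standard and self-normalized estimators; X_1, ..., X_n are drawn from \hat q
% conditionally on the auxiliary sample.
\hat I_n^{\mathrm{IS}} = \frac{1}{n} \sum_{i=1}^{n} w_i f(X_i),
\qquad
\hat I_n^{\mathrm{SN}} = \frac{\sum_{i=1}^{n} w_i f(X_i)}{\sum_{i=1}^{n} w_i},
\qquad
w_i = \frac{p(X_i)}{\hat q(X_i)}.

% Defensive mixture proposal (one common convention, assumed here).
\hat q_\alpha = (1 - \alpha)\, \hat q + \alpha\, q_0, \qquad \alpha \in (0, 1).

% Error measures for the estimated proposal.
\mathrm{MIAE}(\hat q) = \mathbb{E} \int \lvert \hat q(x) - \pi(x) \rvert \, dx,
\qquad
\mathrm{MISE}(\hat q) = \mathbb{E} \int \bigl( \hat q(x) - \pi(x) \bigr)^2 \, dx.
```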
Load-bearing premise
The auxiliary samples for constructing the proposal estimate are produced by a geometrically ergodic Markov chain.
What would settle it
A simulation in which the observed error fails to separate into an n^{-1/2} term and an MIAE or MISE term of the kernel density estimate would contradict the derived bounds.
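One way to probe the first half of this criterion is sketched below. It checks only the n^{-1/2} Monte Carlo scaling, with the proposal held fixed, so it says nothing about the MIAE or MISE term. Everything in it is an illustrative assumption rather than the paper's experiment: the standard normal target, the integrand x^2, and the fixed Gaussian proposal `PROPOSAL_SD = 1.5` standing in for a frozen estimate.

```python
import math
import random

PROPOSAL_SD = 1.5  # fixed, heavier-tailed Gaussian standing in for a frozen proposal


def target_pdf(x):
    # Standard normal target (illustrative assumption).
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def proposal_pdf(x):
    z = x / PROPOSAL_SD
    return math.exp(-0.5 * z * z) / (PROPOSAL_SD * math.sqrt(2.0 * math.pi))


def is_estimate(n, rng):
    # Standard importance sampling estimate of E[X^2], which equals 1 under the target.
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, PROPOSAL_SD)
        total += target_pdf(x) / proposal_pdf(x) * x * x
    return total / n


def rmse(n, reps, rng):
    # Root mean squared error over independent replications.
    return math.sqrt(sum((is_estimate(n, rng) - 1.0) ** 2 for _ in range(reps)) / reps)


rng = random.Random(1)
rmse_n = rmse(100, 400, rng)
rmse_4n = rmse(400, 400, rng)
ratio = rmse_n / rmse_4n  # close to 2 if the error scales as n^{-1/2}
print(f"RMSE(n=100)={rmse_n:.4f}  RMSE(n=400)={rmse_4n:.4f}  ratio={ratio:.2f}")
```

Quadrupling n should roughly halve the RMSE. A fuller check would also vary the auxiliary sample size N and compare the residual error with an estimate of the KDE's integrated error.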
Original abstract
Importance sampling with data-driven proposal distributions is widely used in practice. A common workflow first generates an auxiliary sample of size $N$ from an approximation of the target distribution, constructs a density estimate $\hat q$ such as a kernel density estimator (KDE), and then draws $n$ importance samples from this learned proposal. Despite its practical relevance, the theoretical properties of this hierarchical procedure remain poorly understood, since classical importance sampling theory assumes a fixed proposal. We address this gap by deriving non-asymptotic error bounds for standard, defensive, and self-normalized importance sampling estimators with random proposals. Our results separate the Monte Carlo error, scaling as $n^{-1/2}$, from the proposal approximation error measured through the mean integrated absolute and squared errors (MIAE and MISE) of $\hat q$. To obtain explicit convergence rates in $(N,n)$, we establish MIAE and MISE bounds for KDEs constructed from geometrically ergodic Markov chains in stationary and non-stationary regimes. Combining these results yields quantitative guarantees for importance sampling with KDE-based proposals. Our theory provides practical guidance for selecting defensive mixture weights in a nonparametric importance sampling framework.
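The two-stage workflow described in the abstract can be sketched in one dimension as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the standard normal target, the random-walk Metropolis chain used as the auxiliary sampler, the normal-reference bandwidth, the integrand x^2, and all function names are choices made here.

```python
import math
import random


def target_pdf(x):
    # Standard normal target; an illustrative assumption, not the paper's example.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def rw_metropolis(n_steps, step, rng, x0=0.0):
    # Random-walk Metropolis chain targeting target_pdf (auxiliary sample of size N).
    chain, x = [], x0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        if rng.random() < target_pdf(y) / target_pdf(x):
            x = y
        chain.append(x)
    return chain


def normal_reference_bandwidth(points):
    # Rule-of-thumb bandwidth 1.06 * sd * N^(-1/5); a heuristic choice.
    m = len(points)
    mean = sum(points) / m
    sd = math.sqrt(sum((p - mean) ** 2 for p in points) / (m - 1))
    return 1.06 * sd * m ** (-0.2)


def kde_pdf(x, points, h):
    # Gaussian kernel density estimate evaluated at x.
    c = 1.0 / (len(points) * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-0.5 * ((x - p) / h) ** 2) for p in points)


def kde_sample(points, h, rng):
    # Draw from the KDE: pick a data point, then add Gaussian kernel noise.
    return rng.choice(points) + rng.gauss(0.0, h)


def f(x):
    return x * x  # integrand; E[f(X)] = 1 under the standard normal target


rng = random.Random(0)
N, n = 500, 1000
aux = rw_metropolis(N, step=2.4, rng=rng)         # stage 1: auxiliary sample
h = normal_reference_bandwidth(aux)               # stage 2: learn the proposal
xs = [kde_sample(aux, h, rng) for _ in range(n)]  # stage 3: importance samples
ws = [target_pdf(x) / kde_pdf(x, aux, h) for x in xs]
standard = sum(w * f(x) for w, x in zip(ws, xs)) / n
self_normalized = sum(w * f(x) for w, x in zip(ws, xs)) / sum(ws)
print(f"standard IS: {standard:.3f}, self-normalized IS: {self_normalized:.3f}")
```

Both estimates should land near 1. A Gaussian KDE has lighter tails than the target far from the data, so the weights are unbounded in principle, which is one motivation for the defensive variant mentioned in the abstract.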
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper derives non-asymptotic error bounds for standard, defensive, and self-normalized importance sampling estimators that employ a random proposal distribution constructed via kernel density estimation from an auxiliary sample of size N. The auxiliary sample is drawn from a geometrically ergodic Markov chain, and the bounds separate the Monte Carlo error term of order n^{-1/2} from the proposal approximation error measured by the mean integrated absolute error (MIAE) and mean integrated squared error (MISE) of the KDE. Explicit convergence rates in (N,n) are obtained for both stationary and non-stationary regimes, yielding quantitative guarantees and guidance on defensive mixture weights.
Significance. If the central bounds hold, the manuscript fills a notable gap in the theory of importance sampling by providing rigorous non-asymptotic analysis for the common practical workflow of using data-driven proposals. The clean separation of Monte Carlo and approximation errors, together with explicit MIAE/MISE rates for KDEs under geometric ergodicity, supplies practical guidance on sample-size allocation and defensive weighting that is currently missing from the literature. The work correctly invokes standard mixing arguments and kernel-density results to obtain its rates.
minor comments (3)
- [§3.2] The statement of the defensive estimator could be accompanied by a short remark clarifying how the mixture weight interacts with the random proposal, to avoid potential reader confusion with the standard IS case.
- [Theorem 4.3] The dependence of the leading constants on the geometric ergodicity rate is left implicit; a brief remark on how these constants scale with the mixing parameter would strengthen the practical utility of the rate statements.
- [Figure 1] The caption should explicitly state the values of N and n used in the simulation so that the plotted error curves can be directly compared to the derived bounds.
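The interaction raised in the §3.2 comment can be illustrated with a minimal sketch. It assumes one common convention for the defensive mixture, q_alpha = (1 - alpha) * q_hat + alpha * q_0, which may differ from the paper's definition. The safeguard density q_0 = N(0, `safe_sd`^2), the i.i.d. stand-in for the auxiliary sample, the fixed bandwidth, and all names are hypothetical choices made for the example.

```python
import math
import random


def normal_pdf(x, sd=1.0):
    z = x / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def kde_pdf(x, points, h):
    # Gaussian KDE built from a (here hypothetical) auxiliary sample.
    return sum(normal_pdf(x - p, h) for p in points) / len(points)


def defensive_pdf(x, points, h, alpha, safe_sd):
    # q_alpha = (1 - alpha) * q_hat + alpha * q_0, with q_0 = N(0, safe_sd^2).
    return (1.0 - alpha) * kde_pdf(x, points, h) + alpha * normal_pdf(x, safe_sd)


def defensive_sample(points, h, alpha, safe_sd, rng):
    # With probability alpha draw from q_0, otherwise from the KDE.
    if rng.random() < alpha:
        return rng.gauss(0.0, safe_sd)
    return rng.choice(points) + rng.gauss(0.0, h)


rng = random.Random(2)
alpha, safe_sd, h = 0.1, 2.0, 0.3
aux = [rng.gauss(0.0, 1.0) for _ in range(200)]  # i.i.d. stand-in for the auxiliary sample
xs = [defensive_sample(aux, h, alpha, safe_sd, rng) for _ in range(2000)]
ws = [normal_pdf(x) / defensive_pdf(x, aux, h, alpha, safe_sd) for x in xs]
estimate = sum(w * x * x for w, x in zip(ws, xs)) / len(xs)  # target value is 1

# Since q_alpha >= alpha * q_0 and p / q_0 <= safe_sd for this Gaussian pair,
# every weight is bounded by safe_sd / alpha, whatever the KDE looks like.
weight_bound = safe_sd / alpha
print(f"estimate={estimate:.3f}, max weight={max(ws):.2f}, bound={weight_bound:.1f}")
```

The bound holds for every realization of the random proposal, which is the practical difference from the standard IS case: a larger alpha tightens the weight bound but moves the sampling distribution further from the learned proposal.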
Simulated Author's Rebuttal
We thank the referee for their supportive summary of the manuscript and for recommending minor revision. The assessment correctly identifies the separation of Monte Carlo and proposal approximation errors as a central contribution.
Circularity Check
No significant circularity; derivation self-contained
full rationale
The paper derives non-asymptotic error bounds for importance sampling estimators with random KDE proposals by separating the Monte Carlo term (scaling as n^{-1/2}) from the proposal error measured in MIAE/MISE. These bounds are obtained by conditioning on an auxiliary sample from a geometrically ergodic Markov chain and invoking standard mixing and KDE convergence results for both stationary and non-stationary regimes. No step reduces by the paper's own equations to a fitted parameter, self-defined quantity, or load-bearing self-citation chain; the central claims rest on external, verifiable assumptions about ergodicity and kernel estimation that are not redefined within the work. The argument structure remains independent once the stated assumptions are granted.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The auxiliary Markov chain is geometrically ergodic.
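A Gaussian AR(1) process with coefficient of modulus below one is a textbook example of a geometrically ergodic chain, and it gives a quick way to watch the integrated squared error of a KDE shrink as the chain length N grows. The sketch below is an illustration only: the AR(1) model, the normal-reference bandwidth, the stationary start, and the integration grid are assumptions made here, and the resulting numbers are not the paper's rates.

```python
import math
import random

RHO = 0.5                                     # AR(1) coefficient, |RHO| < 1
STAT_SD = 1.0 / math.sqrt(1.0 - RHO ** 2)     # stationary standard deviation
GRID_STEP = 0.2
GRID = [-6.0 + GRID_STEP * i for i in range(61)]  # integration grid on [-6, 6]


def normal_pdf(x, sd):
    z = x / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def ar1_chain(n_steps, rng, x0=None):
    # X_{t+1} = RHO * X_t + eps_t with eps_t ~ N(0, 1); x0=None starts at stationarity.
    x = rng.gauss(0.0, STAT_SD) if x0 is None else x0
    chain = []
    for _ in range(n_steps):
        x = RHO * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def integrated_sq_error(points):
    # Riemann-sum approximation of the integral of (KDE - stationary density)^2.
    m = len(points)
    h = 1.06 * STAT_SD * m ** (-0.2)  # normal-reference bandwidth (heuristic)
    total = 0.0
    for x in GRID:
        kde = sum(normal_pdf(x - p, h) for p in points) / m
        total += (kde - normal_pdf(x, STAT_SD)) ** 2
    return total * GRID_STEP


def mise(n_steps, reps, rng):
    # Monte Carlo average of the integrated squared error over independent chains.
    return sum(integrated_sq_error(ar1_chain(n_steps, rng)) for _ in range(reps)) / reps


rng = random.Random(3)
mise_small = mise(100, 6, rng)
mise_large = mise(1600, 6, rng)
print(f"MISE estimate: N=100 -> {mise_small:.5f}, N=1600 -> {mise_large:.5f}")
```

Passing a fixed `x0` far from zero to `ar1_chain` gives a non-stationary start, which could be used to explore the second regime the review mentions.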
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We address this gap by deriving non-asymptotic error bounds for standard, defensive, and self-normalized importance sampling estimators with random proposals. Our results separate the Monte Carlo error, scaling as n^{-1/2}, from the proposal approximation error measured through the mean integrated absolute and squared errors (MIAE and MISE) of $\hat q$.
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: To obtain explicit convergence rates in (N,n), we establish MIAE and MISE bounds for KDEs constructed from geometrically ergodic Markov chains in stationary and non-stationary regimes.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.