Normal approximations in nonparametric empirical Bayes
Pith reviewed 2026-06-28 19:47 UTC · model grok-4.3
The pith
The denoising regret of nonparametric empirical Bayes estimators is controlled by the exact-normality rate plus the average marginal CLT error.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The denoising regret of the nonparametric maximum likelihood estimator (NPMLE) and related sieve methods is controlled by the rate attained under exact normality, plus a term reflecting the quality of the CLT approximation. The CLT need only hold marginally for each coordinate, and moreover only on average, without needing high-dimensional normal approximations. We identify two asymptotic regimes in which the normal approximation is adequate and the empirical Bayesian prior remains informative, and we show that our guarantees are robust to dependence and to variance estimation.
What carries the argument
A regret decomposition that isolates the contribution of the normality assumption from the error in the average marginal central limit theorem approximation.
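As a reading aid, the decomposition can be written schematically as below. The symbols are placeholder notation assumed here for illustration and are not taken from the paper: $\mathrm{Regret}_n$ is the denoising regret, $r_n^{\mathrm{Gauss}}$ is the rate available under exact normality, and $\overline{\Delta}_n$ stands for a term driven by the coordinate-wise CLT errors after averaging over $i = 1, \dots, n$. The exact form of each term, including constants and any logarithmic factors, is not stated in the abstract.

```latex
\mathrm{Regret}_n(\hat{\pi}_n)
  \;\lesssim\;
  \underbrace{r_n^{\mathrm{Gauss}}}_{\text{rate under exact normality}}
  \;+\;
  \underbrace{\overline{\Delta}_n}_{\text{average marginal CLT error}}
```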
Load-bearing premise
The central limit theorem holds marginally for each coordinate on average, so that the regret can be split into a normality term and a separate approximation term.
What would settle it
An example where the average marginal central limit theorem errors are small yet the NPMLE regret exceeds the exact-normality regret by substantially more than those errors.
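A small simulation can probe this kind of comparison. The sketch below is not the paper's procedure. It assumes a two-point prior, a fixed-grid EM approximation to the NPMLE, noise built as a standardized sum of `m` exponential draws, and a working definition of regret as the mean squared error gap to the normal-model Bayes rule under the true prior. All function names and parameter values are hypothetical.

```python
import numpy as np

def npmle_weights(x, sigma, grid, iters=200):
    """EM for mixing weights on a fixed grid under a N(theta, sigma^2) working model."""
    # Normalizing constants of the normal density cancel in the posterior, so omit them.
    lik = np.exp(-0.5 * ((x[:, None] - grid[None, :]) / sigma) ** 2)
    w = np.full(grid.size, 1.0 / grid.size)
    for _ in range(iters):
        post = lik * w
        post /= post.sum(axis=1, keepdims=True)  # posterior over grid points, per observation
        w = post.mean(axis=0)                    # EM update of the mixing weights
    return w, lik

def posterior_mean(lik, w, support):
    """Posterior mean of theta given each observation, for a discrete prior (support, w)."""
    post = lik * w
    return (post @ support) / post.sum(axis=1)

def regret(n=1000, m=30, sigma=1.0, exact_normal=False, seed=0):
    """MSE of the plug-in rule minus MSE of the normal-model rule with the true prior."""
    rng = np.random.default_rng(seed)
    atoms = np.array([-2.0, 2.0])                # assumed two-point prior, equal weights
    theta = rng.choice(atoms, size=n)
    if exact_normal:
        z = rng.standard_normal(n)
    else:
        # Standardized sum of m Exp(1) draws: mean 0, variance 1, but skewed.
        z = (rng.gamma(shape=m, scale=1.0, size=n) - m) / np.sqrt(m)
    x = theta + sigma * z
    grid = np.linspace(x.min(), x.max(), 200)
    w, lik = npmle_weights(x, sigma, grid)
    eb = posterior_mean(lik, w, grid)
    lik0 = np.exp(-0.5 * ((x[:, None] - atoms[None, :]) / sigma) ** 2)
    oracle = posterior_mean(lik0, np.array([0.5, 0.5]), atoms)
    return np.mean((eb - theta) ** 2) - np.mean((oracle - theta) ** 2)

regret_normal = regret(exact_normal=True)
regret_clt = regret(exact_normal=False)
print(f"regret, exact normal noise: {regret_normal:.4f}")
print(f"regret, approximately normal noise: {regret_clt:.4f}")
```

Varying `m` changes how close the noise is to normal, so the gap between the two printed values can be tracked against a chosen measure of marginal CLT error. A single seed is only suggestive; averaging over seeds would be needed before drawing any conclusion.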
Original abstract
Empirical Bayes analyses routinely model noisy measurements of latent parameters as normal, justifying this by an informal appeal to the central limit theorem (CLT). This paper puts this heuristic appeal on firmer analytical grounds. We show that the denoising regret of the nonparametric maximum likelihood estimator (NPMLE) and related sieve methods is controlled by the rate attained under exact normality, plus a term reflecting the quality of the CLT approximation. The CLT need only hold marginally for each coordinate, and moreover only on average, without needing high-dimensional normal approximations. We identify two asymptotic regimes in which the normal approximation is adequate and the empirical Bayesian prior remains informative, and we show that our guarantees are robust to dependence and to variance estimation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to put the heuristic appeal to the central limit theorem (CLT) in empirical Bayes analyses on firmer analytical grounds. It shows that the denoising regret of the nonparametric maximum likelihood estimator (NPMLE) and related sieve methods is controlled by the rate attained under exact normality, plus a term reflecting the quality of the CLT approximation. The CLT need only hold marginally for each coordinate and on average, without high-dimensional normal approximations. Two asymptotic regimes are identified where the normal approximation is adequate and the empirical Bayesian prior remains informative. Guarantees are robust to dependence and variance estimation.
Significance. If the results hold, this work supplies a rigorous justification for the routine use of normal models in nonparametric empirical Bayes settings under marginal CLT conditions. It could expand the scope of EB methods to approximately normal data while preserving the nonparametric prior's informativeness in identified regimes. The robustness claims to dependence and variance estimation are potentially valuable for applied work in mathematical statistics.
major comments (1)
- [Abstract] The central claim that denoising regret is controlled by the exact-normal rate plus a marginal-CLT term cannot be assessed from the abstract alone: the full derivation, the precise definition of the two asymptotic regimes, and the separation of the normality error from the approximation term are not provided.
Simulated Author's Rebuttal
We thank the referee for their comments on our manuscript. We respond to the major comment as follows.
- Referee: [Abstract] The central claim that denoising regret is controlled by the exact-normal rate plus a marginal-CLT term cannot be assessed, as the full derivation, the precise definition of the two asymptotic regimes, and the separation of the normality error from the approximation term are not provided.
Authors: The abstract is a high-level summary of the paper's contributions and is not intended to contain the full technical details. The complete derivation of the bound on the denoising regret of the NPMLE and sieve methods, the precise definitions of the two asymptotic regimes in which the normal approximation is adequate while the prior remains informative, and the explicit separation of the exact-normality rate from the marginal CLT approximation error are all provided in the main body of the manuscript. These elements allow the central claim to be assessed from the full paper.
Revision: no
Circularity Check
No circularity detectable from available text
Only the abstract is provided, which states that denoising regret is controlled by the exact-normality rate plus a CLT-approximation term, with the CLT required only marginally and on average. No equations, derivations, or self-citations appear in the text. Without any load-bearing steps, fitted parameters, or self-referential definitions visible, the claim cannot be shown to reduce to its inputs by construction. The result is therefore treated as resting on external CLT assumptions rather than internal circularity.
Reference graph
Works this paper leans on
-
[1]
Abadie, A., Agarwal, A., Imbens, G., Jia, S., McQueen, J., and Stepaniants, S. (2023). Estimating the value of evidence-based decision making. arXiv preprint, arXiv:2306.13681. Angrist, J. D., Hull, P. D., Pathak, P. A., and Walters, C. R. (2017). Leveraging lotteries for school value-added: Testing and estimation. The Quarterly Journal of Economics, 132(...
-
[2]
Invidious comparisons: Ranking and selection as compound decisions
Dedecker, J. and Michel, B. (2013). Minimax rates of convergence for Wasserstein deconvolution with supersmooth errors in any dimension. J. Multivariate Anal., 122:278–291. Deng, A., Li, Y., Lu, J., and Ramamurthy, V. (2021). On post-selection inference in A/B testing. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, ...
-
[3]
Confidence intervals for nonparametric empirical Bayes analysis
Ignatiadis, N. and Wager, S. (2022). Confidence intervals for nonparametric empirical Bayes analysis (with discussion and a rejoinder by the authors). Journal of the American Statistical Association, 117(539):1149–1166. Imbens, G. (2022). Comment on: “Confidence intervals for nonparametric empirical Bayes analysis” by Ignatiadis and Wager. Journal of the ...
2022
-
[4]
Empirical Bayes Estimation and Inference via Smooth Nonparametric Maximum Likelihood
Jiang, W. and Zhang, C.-H. (2009). General maximum likelihood empirical Bayes estimation of normal means. The Annals of Statistics, 37(4):1647–1684. Kaji, T. (2026). The Hellinger bounds on the Kullback–Leibler divergence and the Bernstein norm. The Japanese Economic Review, pages 1–22. Karmakar, R., Heller, R., and Rosset, S. (2025). Inference with approxi...
-
[5]
Poverty Targeting with Imperfect Information
Cambridge University Press. Walters, C. R. (2024). Empirical Bayes methods in labor economics. Technical report, National Bureau of Economic Research. Chen, Deb, and Ignatiadis/Normal approximations in nonparametric empirical Bayes19 Wernerfelt, N., Tuchman, A., Shapiro, B. T., and Moakler, R. (2025). Estimating the value of offsite tracking data to adver...
-
[6]
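In particular, $\|X'_{n,t}\|_\infty \le \frac{1}{2t\sqrt{\log n}}\,\|\psi'\|_\infty$ and $\|X''_{n,t}\|_\infty \le \frac{1}{4t^2\log n}\,\|\psi''\|_\infty$
As $X_{n,t}(x)$ is identically 1 in a neighborhood of $x = 0$, we also have that $X_{n,t}$ is infinitely differentiable. In particular, $\|X'_{n,t}\|_\infty \le \frac{1}{2t\sqrt{\log n}}\,\|\psi'\|_\infty$ and $\|X''_{n,t}\|_\infty \le \frac{1}{4t^2\log n}\,\|\psi''\|_\infty$. Our proof strategy involves truncating all the $X_i$'s within the interval $S_{n,t}$. To wit, note that as $\hat{\pi}_n$ satisfies (2.2), we have: $\mathbb{P}\big(\frac{1}{n}\sum_{i=1}^{n}\mathrm{Hel}^2(f_{\hat{\pi}_n,\cdot}, f_{\pi^\star,\cdot}) \ge t$...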
2020
-
[7]
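Note that for any $f_{\pi,\cdot}(\cdot) \in \mathcal{F}_\ell(t)$, by the definition of a covering set, we have the existence of some $j$ such that $\|f_{\pi,\cdot} - f_{\pi_j,\cdot}\|_{\infty,\tilde{S}_{n,t}} \le \eta$
that there exists a constant $C$ depending on $M$, $k$, $K$, $s$, and $t$ such that $\log N \le C\log^2 n$. (A.3) For any $\ell \ge 0$, let $J_\ell \subseteq \{j : 1 \le j \le N\}$ be the subset of all $j$ for which there exist $f_{\pi_{0,j},\cdot} \in \mathcal{F}_{\mathrm{Gauss},\sigma^{(n)}}$ satisfying $\|f_{\pi_{0,j},\cdot} - f_{\pi_j,\cdot}\|_{\infty,\tilde{S}_{n,t}} \le \eta$ and $\frac{1}{n}\sum_{i=1}^{n}\mathrm{Hel}^2(f_{\pi_{0,j},\sigma_i}, f_{\pi^\star,\sigma_i}) \ge 2^{2\ell}t^2\delta_n^2$. (A.4) By (A.3), we note that $\sup_{\ell \ge 0}\log|J_\ell| \le C\log^2 n$. Note that for any $f_{\pi,\cdot}(\cdot$...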
2018
-
[8]
As a result, we have for any $n$ large enough $\max_{1 \le i \le n}\max_{1 \le j \le N}\|A^{(n)}_{i,j}\|_\infty \lesssim t^2\log n$. (A.7) Bound on the mean. We note the following identity for any $j \in J_\ell$: $\frac{1}{n}\sum_{i=1}^{n}\mathbb{E}A^{(n)}_{i,j}(X_i) = \frac{1}{n}\sum_{i=1}^{n}\int_{x \in \tilde{S}_{n,t}}\log\frac{f_{\pi_{0,j},\sigma_i}(x)}{f_{\pi^\star,\sigma_i}(x)}\,d\mu_i(x) = -\frac{1}{n}\sum_{i=1}^{n}\int_{x \in \tilde{S}^c_{n,t}}\log f_{\pi_0}$...
2013
-
[9]
Bounds for the usual Kullback-Leibler variation have been studied extensively (see Wong and Shen (1995); Kaji (2026))
to the usual $k$-th order Kullback-Leibler variation given by $V_k(p_1\|p_2) = \int p_1\left(\log\frac{p_1}{p_2}\right)^k$. Bounds for the usual Kullback-Leibler variation have been studied extensively (see Wong and Shen (1995); Kaji (2026)). Our subsequent bounds can be viewed as extensions of these existing results to the reweighted setting. Our first result studies a bound on $V_k(p_1\|p_2)$...
1995
-
[10]
Related bounds appear in Kaji (2026, Theorem 2 and Proposition
2026
-
[11]
We present a version tailored to the setting of our paper
and Wong and Shen (1995, Theorem 5); the proof techniques are similar. We present a version tailored to the setting of our paper. Lemma A.6. Suppose there exists some $\delta \in (0,1)$ such that $M_\delta := \int p_1\left(\frac{p_1}{p_2}\right)^{\delta} < \infty$. Then we have $V_k(p_1\|p_2) \le 10\,\delta^{-k}\,\mathrm{Hel}^2(p_1,p_2)\left[C_{k,\delta} + \left(\log\frac{M_\delta}{5\,\mathrm{Hel}^2(p_1,p_2)}\right)^{k}\right]$ for some constant $C_{k,\delta} > 0$, provided $k \ge 2$. In our next result, we provide a bo...
1995
-
[12]
(A.30) Let us bound the terms inside the square root
Applying Lemma A.8, we obtain EX∼µ iKπ,i(X)≤ √ 2 Hel(qi, p2,i) sZ qi log2 p1,i p2,i + Z p2,i log2 p1,i p2,i −Hel 2(p1,i, p2,i). (A.30) Let us bound the terms inside the square root. From Lemma A.7 (already applied above) and Lemma A.6 withk= 2: Z qi log2 p1,i p2,i ≲g T (Hel2 qi, p2,i +g T Hel2(p1,i, p2,i) , Z p2,i log2 p1,i p2,i ≲g T Hel2(p1,i, p2,i) , wh...
2020
-
[13]
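Factoring the Gaussian kernel, $\phi_\sigma(x-\theta) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{x^2}{2\sigma^2}\right)\exp\left(\frac{x\theta}{\sigma^2}\right)\exp\left(-\frac{\theta^2}{2\sigma^2}\right)$, so that $f_{\pi,\sigma}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/(2\sigma^2)}f_M(x/\sigma^2)$, where $f_M(t) := \int e^{t\theta}\,d\tilde{\pi}(\theta)$
density (with $\phi$ denoting the standard normal with a slight notational abuse). Factoring the Gaussian kernel, $\phi_\sigma(x-\theta) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{x^2}{2\sigma^2}\right)\exp\left(\frac{x\theta}{\sigma^2}\right)\exp\left(-\frac{\theta^2}{2\sigma^2}\right)$, so that $f_{\pi,\sigma}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/(2\sigma^2)}f_M(x/\sigma^2)$, where $f_M(t) := \int e^{t\theta}\,d\tilde{\pi}(\theta)$. Here $f_M(t)$ can be viewed as the moment-generating function of the tilted measure $d\tilde{\pi}(\theta) \propto e^{-\theta^2/(2\sigma^2)}\,d\pi(\theta)$ up to some nor...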
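The factorization in this excerpt can be checked numerically for a discrete prior. The sketch below assumes a three-atom prior with arbitrary illustrative values and uses the unnormalized tilted measure, so that the identity holds exactly rather than up to a normalizing constant. The function names are hypothetical.

```python
import numpy as np

def mixture_density_direct(x, atoms, weights, sigma):
    """f_{pi,sigma}(x) = sum_j w_j * phi_sigma(x - theta_j) for a discrete prior."""
    z = (x[:, None] - atoms[None, :]) / sigma
    phi = np.exp(-0.5 * z**2) / (np.sqrt(2 * np.pi) * sigma)
    return phi @ weights

def mixture_density_factored(x, atoms, weights, sigma):
    """Same density written as a Gaussian envelope times an MGF-type factor."""
    # Unnormalized tilted measure: weights multiplied by exp(-theta^2 / (2 sigma^2)).
    tilted = weights * np.exp(-atoms**2 / (2 * sigma**2))
    t = x / sigma**2
    f_M = np.exp(t[:, None] * atoms[None, :]) @ tilted  # sum of exp(t * theta) over tilted mass
    envelope = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return envelope * f_M

x = np.linspace(-4.0, 4.0, 9)
atoms = np.array([-1.5, 0.0, 2.0])     # illustrative prior atoms (assumed)
weights = np.array([0.3, 0.5, 0.2])    # illustrative prior weights (assumed)
sigma = 0.8

direct = mixture_density_direct(x, atoms, weights, sigma)
factored = mixture_density_factored(x, atoms, weights, sigma)
print(np.max(np.abs(direct - factored)))
```

The two evaluations agree up to floating-point error. The factored form can overflow for large `x / sigma**2`, so it is useful for analysis rather than as a numerically stable way to evaluate the density.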
1995
-
[14]
1 n nX i=1 f ′ i(Xi)2 # →0 (C.9) asn→ ∞. The last limit follows from the fact thatn −1∥Σn∥op →0 and E
Bound onW n.The bound onW n follows from Bobkov (2018, Theorem 1.3). Proof of Corollary 4.1.We only need to verify Assumptions 3.1 and 3.2. The only non-trivial condition to check is that for allf i :R→Rsuch that∥f ′′ i ∥∞ ≤1 for all 1≤i≤n, we have 1 n nX i=1 fi(Xi)−Ef i(Xi) p − →0.(C.7) Chen, Deb, and Ignatiadis/Normal approximations in nonparametric emp...
2018
-
[15]
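Next we note that $\log(x) \le 2(\sqrt{x} - 1)$ for $x \ge 0$
$\cdots\log^2\frac{p_1}{p_2}\Big)^{1/2} + \int p_2\log\frac{p_1}{p_2} = \sqrt{2}\,\mathrm{Hel}(q, p_2)\sqrt{V_2(q; p_1, p_2) + V_2(p_2\|p_1)} + \int p_2\log\frac{p_1}{p_2}$. Next we note that $\log(x) \le 2(\sqrt{x} - 1)$ for $x \ge 0$. As a result, $\int p_2\log\frac{p_1}{p_2} \le 2\int p_2\left(\sqrt{\frac{p_1}{p_2}} - 1\right) = -2\int\left(1 - \sqrt{p_1 p_2}\right) = -2\,\mathrm{Hel}^2(p_1, p_2)$. This completes the proof. Appendix E: Proof of Auxiliary results from Section B Proof of Lemma B.1. Assume without loss of generality that...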
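The last step of this excerpt, which bounds $\int p_2 \log(p_1/p_2)$ by $-2\,\mathrm{Hel}^2(p_1, p_2)$, can be sanity-checked numerically. The sketch below assumes two normal densities with arbitrarily chosen parameters and reads $\int (1 - \sqrt{p_1 p_2})$ as $1 - \int \sqrt{p_1 p_2}$, which matches the excerpt because $p_2$ integrates to one. Integrals are approximated by a Riemann sum on a fine grid.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Pointwise inequality log(u) <= 2 (sqrt(u) - 1), checked on a wide positive grid.
u = np.logspace(-8, 8, 1001)
pointwise_ok = bool(np.all(np.log(u) <= 2 * (np.sqrt(u) - 1) + 1e-12))

# Integrated version for two illustrative normal densities (parameters are assumptions).
x = np.linspace(-15.0, 15.0, 60001)
dx = x[1] - x[0]
p1 = normal_pdf(x, 0.0, 1.0)
p2 = normal_pdf(x, 1.0, 1.5)

lhs = np.sum(p2 * np.log(p1 / p2)) * dx        # integral of p2 log(p1/p2), i.e. -KL(p2 || p1)
hel2 = 1.0 - np.sum(np.sqrt(p1 * p2)) * dx     # Hel^2 = 1 - integral of sqrt(p1 p2)

print(pointwise_ok, lhs, -2 * hel2)
```

For this pair of densities the left-hand side equals minus the Kullback-Leibler divergence of $p_2$ from $p_1$, so the check is an instance of the familiar bound of twice the squared Hellinger distance by the KL divergence under this normalization of $\mathrm{Hel}^2$.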
2013
-
[16]
The normality assumption for $z_f$ can be justified by an asymptotic approximation with a growing number of jobs sampled for each firm
The last inequality follows from Lemma B.2. This completes the proof. Appendix F: Examples of heuristic appeals to the CLT in empirical Bayes F.1. Papers in economics • Gu and Shen (2018): “In this case, the statistic $S_i$ follows a normal mixture distribution only asymptotically under the null and alternative (cf. Cao and Kosorok, 2011), and therefore the m...
2018