arxiv: 2605.31599 · v1 · pith:U6G26WHWnew · submitted 2026-05-29 · 🧮 math.ST · stat.TH

Normal approximations in nonparametric empirical Bayes

Pith reviewed 2026-06-28 19:47 UTC · model grok-4.3

classification 🧮 math.ST stat.TH
keywords empirical Bayes · nonparametric MLE · central limit theorem · denoising regret · normal approximation · sieve methods · asymptotic regimes

The pith

The denoising regret of nonparametric empirical Bayes estimators is controlled by the exact-normality rate plus the average marginal CLT error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Empirical Bayes analyses often model noisy measurements as normal by appealing to the central limit theorem. This paper shows that the regret of the nonparametric maximum likelihood estimator and related sieve methods stays close to the rate achieved under exact normality, up to a term measuring the quality of the CLT approximation. The central limit theorem needs to hold only marginally for each coordinate, and only on average, rather than jointly in high dimensions. The paper identifies two asymptotic regimes where the approximation remains good and the estimated prior stays informative, and it shows that the guarantees are robust to dependence between observations and to estimated variances. A sympathetic reader would care because the result justifies a common modeling choice under substantially weaker conditions than joint high-dimensional normality.

Core claim

The denoising regret of the nonparametric maximum likelihood estimator (NPMLE) and related sieve methods is controlled by the rate attained under exact normality, plus a term reflecting the quality of the CLT approximation. The CLT need only hold marginally for each coordinate, and moreover only on average, without needing high-dimensional normal approximations. We identify two asymptotic regimes in which the normal approximation is adequate and the empirical Bayesian prior remains informative, and we show that our guarantees are robust to dependence and to variance estimation.

What carries the argument

A regret decomposition that isolates the contribution of the normality assumption from the error in the average marginal central limit theorem approximation.
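One schematic way to write such a decomposition is given below. It is a sketch only: the abstract does not state the bound, so the rate symbol $r_n^{\mathrm{Gauss}}$, the discrepancy $d$, and the exact way the second term enters are assumptions made for illustration. The symbols $\hat\pi_n$ (estimated prior), $\mu_i$ (true law of observation $i$), and $\sigma_i$ (noise scale) follow notation that appears in the paper's extracted proof fragments.

```latex
\[
\underbrace{\mathrm{Regret}_n(\hat\pi_n)}_{\text{denoising regret}}
\;\lesssim\;
\underbrace{r_n^{\mathrm{Gauss}}}_{\text{rate under exact normality}}
\;+\;
\underbrace{\frac{1}{n}\sum_{i=1}^{n}
  d\bigl(\mu_i,\ \mathcal{N}(\theta_i,\sigma_i^2)\bigr)}_{\text{average marginal CLT error}}
\]
```

The point of the schematic is structural: the second term averages one-dimensional discrepancies over coordinates, so no joint normal approximation of the whole vector $(X_1,\dots,X_n)$ is required.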

Load-bearing premise

The central limit theorem holds marginally for each coordinate on average, so that the regret can be split into a normality term and a separate approximation term.
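To make "marginally for each coordinate, on average" concrete, the following Monte Carlo sketch estimates a per-coordinate distance to normality and then averages it. Everything specific here is an assumption chosen for illustration and not taken from the paper: the Kolmogorov distance as the error metric, exponential raw draws, and the hypothetical `sample_sizes` in which a few coordinates are based on a single draw.

```python
import math
import random
from statistics import NormalDist

def kolmogorov_to_normal(draws):
    """Kolmogorov distance between the empirical CDF of `draws` and N(0, 1)."""
    std_normal = NormalDist()
    xs = sorted(draws)
    n = len(xs)
    dist = 0.0
    for k, x in enumerate(xs):
        cdf = std_normal.cdf(x)
        # The empirical CDF jumps from k/n to (k+1)/n at x; check both sides.
        dist = max(dist, abs((k + 1) / n - cdf), abs(k / n - cdf))
    return dist

def standardized_mean(m, rng):
    """sqrt(m) * (mean of m Exp(1) draws - 1): mean 0, variance 1, skewed for small m."""
    # The sum of m independent Exp(1) draws is Gamma(shape=m, scale=1).
    total = rng.gammavariate(m, 1.0)
    return math.sqrt(m) * (total / m - 1.0)

rng = random.Random(0)
reps = 2000                              # Monte Carlo replications per coordinate
sample_sizes = [1] * 5 + [200] * 45      # hypothetical: 5 poorly approximated coordinates

errors = [
    kolmogorov_to_normal([standardized_mean(m, rng) for _ in range(reps)])
    for m in sample_sizes
]
avg_error = sum(errors) / len(errors)
print(f"worst coordinate: {max(errors):.3f}, average over coordinates: {avg_error:.3f}")
```

In this toy setup a handful of coordinates are far from normal, yet the average error stays small because most coordinates are well approximated. That is the sense in which an average condition is weaker than a uniform one.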

What would settle it

An example where the average marginal central limit theorem errors are small yet the NPMLE regret exceeds the exact-normality regret by substantially more than those errors.

Original abstract

Empirical Bayes analyses routinely model noisy measurements of latent parameters as normal, justifying this by an informal appeal to the central limit theorem (CLT). This paper puts this heuristic appeal on firmer analytical grounds. We show that the denoising regret of the nonparametric maximum likelihood estimator (NPMLE) and related sieve methods is controlled by the rate attained under exact normality, plus a term reflecting the quality of the CLT approximation. The CLT need only hold marginally for each coordinate, and moreover only on average, without needing high-dimensional normal approximations. We identify two asymptotic regimes in which the normal approximation is adequate and the empirical Bayesian prior remains informative, and we show that our guarantees are robust to dependence and to variance estimation.
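The setting in the abstract can be illustrated with a small simulation. The sketch below is not the paper's procedure. It fits a grid-based NPMLE by EM under a working normal likelihood while the actual noise is only approximately normal. The two-point prior, the exponential raw draws, the grid size, the iteration count, and the regret proxy (mean squared gap to the posterior mean computed with the true prior under the same normal working model) are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 2000, 30                      # hypothetical sizes: n units, m raw draws per unit

# Latent parameters from a two-point prior (an illustrative choice).
true_atoms = np.array([-2.0, 2.0])
theta = rng.choice(true_atoms, size=n)

# Approximately normal noise: standardized mean of m centered Exp(1) draws.
# It has mean 0 and variance 1 but is skewed, so normality holds only via the CLT.
noise = np.sqrt(m) * (rng.exponential(1.0, size=(n, m)).mean(axis=1) - 1.0)
x = theta + noise
sigma = np.ones(n)

def normal_lik(x, sigma, atoms):
    """Matrix of N(atom_k, sigma_i^2) densities evaluated at x_i."""
    z = (x[:, None] - atoms[None, :]) / sigma[:, None]
    return np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma[:, None])

def posterior_mean(lik, weights, atoms):
    """Posterior mean of theta_i under a discrete prior on `atoms`."""
    post = lik * weights
    post /= post.sum(axis=1, keepdims=True)
    return post @ atoms

# Grid NPMLE: maximize the working normal mixture likelihood over weights by EM.
grid = np.linspace(x.min(), x.max(), 100)
lik = normal_lik(x, sigma, grid)
w = np.full(grid.size, 1.0 / grid.size)
for _ in range(300):
    post = lik * w
    post /= post.sum(axis=1, keepdims=True)
    w = post.mean(axis=0)            # EM update of the mixing weights

theta_eb = posterior_mean(lik, w, grid)
# Oracle comparison: same normal working model, but with the true prior.
theta_oracle = posterior_mean(
    normal_lik(x, sigma, true_atoms), np.array([0.5, 0.5]), true_atoms
)

regret = np.mean((theta_eb - theta_oracle) ** 2)   # regret proxy (assumed definition)
mse_eb = np.mean((theta_eb - theta) ** 2)
mse_raw = np.mean((x - theta) ** 2)
print(f"regret proxy: {regret:.4f}, EB MSE: {mse_eb:.4f}, raw MSE: {mse_raw:.4f}")
```

Varying `m` changes how good the normal approximation is, which gives a rough empirical feel for the tradeoff the abstract describes between the exact-normality rate and the CLT approximation term.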

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript claims to put the heuristic appeal to the central limit theorem (CLT) in empirical Bayes analyses on firmer analytical grounds. It shows that the denoising regret of the nonparametric maximum likelihood estimator (NPMLE) and related sieve methods is controlled by the rate attained under exact normality, plus a term reflecting the quality of the CLT approximation. The CLT need only hold marginally for each coordinate and on average, without high-dimensional normal approximations. Two asymptotic regimes are identified where the normal approximation is adequate and the empirical Bayesian prior remains informative. Guarantees are robust to dependence and variance estimation.

Significance. If the results hold, this work supplies a rigorous justification for the routine use of normal models in nonparametric empirical Bayes settings under marginal CLT conditions. It could extend the scope of EB methods to approximately normal data while preserving the informativeness of the nonparametric prior in the identified regimes. The claimed robustness to dependence and to variance estimation is potentially valuable for applied work.

major comments (1)
  1. [Abstract] The central claim that denoising regret is controlled by the exact-normal rate plus a marginal-CLT term cannot be assessed, because the full derivation, the precise definition of the two asymptotic regimes, and the separation of the normality error from the approximation term are not provided.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their comments on our manuscript. We respond to the major comment as follows.

Point-by-point responses
  1. Referee: [Abstract] The central claim that denoising regret is controlled by the exact-normal rate plus a marginal-CLT term cannot be assessed, because the full derivation, the precise definition of the two asymptotic regimes, and the separation of the normality error from the approximation term are not provided.

    Authors: The abstract is a high-level summary of the paper's contributions and is not intended to contain the full technical details. The complete derivation of the bound on the denoising regret of the NPMLE and sieve methods, the precise definitions of the two asymptotic regimes in which the normal approximation is adequate while keeping the prior informative, and the explicit separation of the exact normality rate from the marginal CLT approximation error are all provided in the main body of the manuscript. These elements allow the central claim to be assessed based on the full paper.

    Revision: no

Circularity Check

0 steps flagged

No circularity detectable from available text

Full rationale

Only the abstract is provided, which states that denoising regret is controlled by the exact-normality rate plus a CLT-approximation term, with the CLT required only marginally and on average. No equations, derivations, or self-citations appear in the text. Without any load-bearing steps, fitted parameters, or self-referential definitions visible, the claim cannot be shown to reduce to its inputs by construction. The result is therefore treated as resting on external CLT assumptions rather than internal circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

An abstract-only review supplies no explicit free parameters, axioms, or invented entities; the full paper would be needed to audit any hidden modeling choices or regime-specific assumptions.

pith-pipeline@v0.9.1-grok · 5614 in / 1132 out tokens · 19371 ms · 2026-06-28T19:47:55.249271+00:00 · methodology
