Efficient Proposal-Test-Release for Minimax Optimal Estimation
Pith reviewed 2026-05-07 14:57 UTC · model grok-4.3
The pith
Efficient PTR replaces the exact insensitive set with a simpler subset and the exact Hellinger distance with a Lipschitz lower bound, yielding tractable DP mechanisms that attain minimax rates for classification and regression.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce efficient PTR (ePTR), which replaces the exact insensitive set with a simpler subset and the exact Hellinger distance with a Lipschitz-based lower bound. This flexibility enables substantially simpler DP mechanisms that achieve rate-optimal accuracy in many settings. To illustrate, we study basic estimators for Bayes classification, linear regression, and nonparametric regression. We show that each can have high sensitivity on atypical datasets, yet admits intuitive ePTR-based designs. Empirically and theoretically, we compare ePTR against popular DP baselines and demonstrate improved accuracy while maintaining privacy guarantees.
What carries the argument
The ePTR procedure, which proposes a tractable insensitive subset and substitutes a Lipschitz lower bound for the true Hellinger distance to decide the scale of added noise.
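The mechanism described above can be sketched generically. This is our own hedged illustration, not the paper's algorithm: the helper names (`eptr_release`, `lipschitz_lower_bound`, `local_sensitivity`) and the toy bounded-data example are ours, and the distance here is a simple counting proxy rather than the paper's Hellinger distance.

```python
import numpy as np

rng = np.random.default_rng(0)

def eptr_release(data, estimator, lipschitz_lower_bound, lipschitz_const,
                 threshold, local_sensitivity, eps):
    """Hypothetical ePTR-style release (a sketch, not the paper's mechanism).

    `lipschitz_lower_bound(data)` is assumed to lower-bound the dataset's
    distance to the high-sensitivity region and to be
    `lipschitz_const`-Lipschitz under single-record changes, so the test
    itself can be privatized with Laplace noise.
    """
    # Test: privately check that the lower bound clears the threshold.
    noisy_gap = lipschitz_lower_bound(data) + rng.laplace(scale=lipschitz_const / eps)
    if noisy_gap <= threshold:
        return None  # refuse to release (PTR's "bottom" output)
    # Release: a passed test certifies low sensitivity, so a small
    # Laplace perturbation of the point estimate suffices.
    return estimator(data) + rng.laplace(scale=local_sensitivity / eps)

# Toy usage: a mean whose sensitivity is low whenever all points lie in [0, 1].
# The lower bound counts in-range points (1-Lipschitz under record changes).
n = 200
data = rng.uniform(0.0, 1.0, size=n)
out = eptr_release(
    data,
    estimator=np.mean,
    lipschitz_lower_bound=lambda d: float(np.sum((d >= 0.0) & (d <= 1.0))),
    lipschitz_const=1.0,
    threshold=0.95 * n,
    local_sensitivity=1.0 / n,
    eps=1.0,
)
print(out)  # on typical data the test passes and a lightly-noised mean is released
```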
If this is right
- Bayes classification admits an ePTR estimator whose excess risk matches the non-private minimax rate.
- Linear regression under ePTR achieves the same convergence rate as ordinary least squares despite high sensitivity on atypical data.
- Nonparametric regression with ePTR recovers the optimal rate without requiring exact sensitivity calculations.
- In each case the added privacy noise is smaller than that required by standard Laplace or Gaussian mechanisms.
- The same relaxation applies to any estimator whose sensitivity is large only on a small fraction of datasets.
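The noise-reduction claim in the list above can be made concrete with a toy calculation of our own (not from the paper): for a mean over `n` points whose domain is a priori only `[0, B]`, the vanilla Laplace mechanism must calibrate to the global sensitivity `B/n`, whereas a PTR-style release on a typical dataset concentrated in `[0, 1]` can calibrate to the local sensitivity `1/n`.

```python
# Hedged numeric illustration; the constants are arbitrary.
n, B, eps = 1000, 100.0, 1.0
global_scale = B / n / eps    # Laplace noise scale under global sensitivity
local_scale = 1.0 / n / eps   # noise scale after a successful PTR-style test
print(global_scale / local_scale)  # 100x less noise in this toy setting
```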
Where Pith is reading between the lines
- The approach may extend to other M-estimators or high-dimensional procedures where exact insensitive sets are unavailable.
- If the Lipschitz constant can be replaced by a data-dependent bound, even less noise may be needed in some regimes.
- Practitioners could apply ePTR to existing regression or classification pipelines with only modest code changes.
- Testing the method on real datasets with natural outliers would reveal whether the theoretical rate gains appear in practice.
Load-bearing premise
The simpler insensitive subset and the Lipschitz lower bound must be tight enough that the resulting noise still satisfies differential privacy and preserves the minimax rate.
What would settle it
For the linear-regression estimator, compute the worst-case excess risk over n samples and check whether it exceeds the non-private rate by more than a constant factor; any such excess falsifies the optimality claim.
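A Monte Carlo version of this check can be sketched as follows. This is our own toy simulation, not the paper's experiment: the noise scale `d / (n * eps)` is a stand-in for whatever an ePTR-calibrated mechanism would add, and a ratio far above a constant would be evidence against the optimality claim.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, eps, reps = 500, 3, 1.0, 200
beta_true = np.ones(d)
noise_scale = d / (n * eps)  # hypothetical ePTR-calibrated noise level

def trial():
    # Draw one regression dataset and fit OLS with and without DP noise.
    X = rng.normal(size=(n, d))
    y = X @ beta_true + rng.normal(size=n)
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    beta_dp = beta_ols + rng.laplace(scale=noise_scale, size=d)
    return np.sum((beta_ols - beta_true) ** 2), np.sum((beta_dp - beta_true) ** 2)

risks = np.array([trial() for _ in range(reps)])
ratio = risks[:, 1].mean() / risks[:, 0].mean()
print(ratio)  # near 1 => the privacy noise does not dominate the non-private rate
```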
Original abstract
Differential privacy (DP) is a rigorous framework that protects the participation of individuals in a dataset by limiting information leakage from released estimators. This creates a challenging setting for statisticians: DP must hold uniformly over all possible datasets, whereas statistical practice often downweights atypical or rare outcomes. The conceptual challenge is especially pronounced in sensitivity analysis, the key quantity governing the magnitude of DP noise and, consequently, estimator accuracy, because many estimators, including ordinary least squares for linear regression, exhibit markedly higher sensitivity on atypical datasets. Propose-Test-Release (PTR) is designed to address such cases, but its classical implementation requires computing the exact insensitive set and the dataset's Hellinger distance to that set, both of which are typically intractable. We introduce efficient PTR (ePTR), which replaces the exact insensitive set with a simpler subset and the exact Hellinger distance with a Lipschitz-based lower bound. This flexibility enables substantially simpler DP mechanisms that achieve rate-optimal accuracy in many settings. To illustrate, we study basic estimators for Bayes classification, linear regression, and nonparametric regression. We show that each can have high sensitivity on atypical datasets, yet admits intuitive ePTR-based designs. Empirically and theoretically, we compare ePTR against popular DP baselines and demonstrate improved accuracy while maintaining privacy guarantees.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces efficient Propose-Test-Release (ePTR), which substitutes a simpler insensitive subset for the exact one and a Lipschitz-based lower bound for the exact Hellinger distance to that set. This yields simpler DP mechanisms claimed to achieve minimax-optimal rates for Bayes classification, linear regression, and nonparametric regression estimators, with supporting theory and experiments showing gains over standard DP baselines while preserving privacy.
Significance. If the lower-bound and subset choices are shown to be tight enough, the framework would simplify sensitivity handling for high-sensitivity estimators under DP and deliver practical accuracy improvements without sacrificing theoretical optimality. The explicit treatment of three canonical estimators plus empirical comparisons against baselines constitute a concrete contribution to DP statistical methodology.
major comments (3)
- [§3] §3 (ePTR mechanism): the Lipschitz lower bound on Hellinger distance is used to decide release; the manuscript must prove that any release triggered by the lower bound still guarantees the true distance is large enough to control the privacy loss at the target (ε,δ), otherwise the DP guarantee may fail or force overly conservative noise. The specific Lipschitz function and constant per estimator are load-bearing and require explicit verification.
- [§4.3] §4.3 (nonparametric regression): the claimed minimax rate optimality after applying the simpler subset and Lipschitz bound is not accompanied by a quantitative bound on the rate degradation induced by the approximation; without this, it is unclear whether the resulting mechanism matches the exact-PTR rate or merely matches a weaker baseline.
- [§4.2] §4.2 (linear regression): the insensitive subset chosen for OLS must be shown to preserve the release probability scaling that yields the optimal rate; the current presentation leaves open whether the subset is sufficiently large that the test still triggers with the probability required for the minimax claim.
minor comments (2)
- The abstract states both theoretical optimality and empirical gains but does not report the concrete rates achieved or the precise error-bar methodology used in the figures; adding these would improve clarity.
- Notation for the Lipschitz constant and the simpler subset should be introduced once in §2 and used consistently across the three applications in §4.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The comments highlight important points where the current presentation of the ePTR framework can be strengthened with additional explicit arguments. We address each major comment below and will incorporate the necessary clarifications and proofs in the revised manuscript.
Point-by-point responses
-
Referee: [§3] §3 (ePTR mechanism): the Lipschitz lower bound on Hellinger distance is used to decide release; the manuscript must prove that any release triggered by the lower bound still guarantees the true distance is large enough to control the privacy loss at the target (ε,δ), otherwise the DP guarantee may fail or force overly conservative noise. The specific Lipschitz function and constant per estimator are load-bearing and require explicit verification.
Authors: We agree that an explicit verification is required. In the revision we will insert a new lemma immediately after the definition of the Lipschitz lower bound in Section 3. The lemma will show that whenever the lower bound exceeds the release threshold, the true Hellinger distance to the insensitive set is at least the threshold minus a controlled additive term that can be absorbed into the (ε,δ) budget by a slight adjustment of the noise scale. We will also state and verify the Lipschitz constant explicitly for each of the three estimators. revision: yes
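The promised lemma can be sketched in outline. The notation below is ours, not the paper's: $\ell$ for the Lipschitz lower bound, $d_H(D)$ for the Hellinger distance from $D$ to the high-sensitivity region, $\tau$ for the release threshold, and $L$ for the Lipschitz constant.

```latex
% Hedged sketch only; notation (\ell, d_H, \tau, L) is ours, not the paper's.
% Test:
\hat{\ell} = \ell(D) + \mathrm{Lap}(L/\varepsilon),
\qquad \text{release iff } \hat{\ell} > \tau.
% Privacy of the test: \ell is L-Lipschitz under neighboring datasets, so the
% noisy threshold comparison is \varepsilon-DP by the Laplace mechanism.
% Validity of release: since \ell(D) \le d_H(D) pointwise and the Laplace tail
% gives \Pr\left[\hat{\ell} - \ell(D) > (L/\varepsilon)\log(1/\delta)\right] \le \delta,
% a release implies
d_H(D) \;\ge\; \ell(D) \;>\; \tau - (L/\varepsilon)\log(1/\delta)
% with probability at least 1-\delta; the additive shortfall can be absorbed
% into the (\varepsilon,\delta) budget by slightly enlarging the noise scale.
```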
-
Referee: [§4.3] §4.3 (nonparametric regression): the claimed minimax rate optimality after applying the simpler subset and Lipschitz bound is not accompanied by a quantitative bound on the rate degradation induced by the approximation; without this, it is unclear whether the resulting mechanism matches the exact-PTR rate or merely matches a weaker baseline.
Authors: We will add a new theorem in Section 4.3 that quantifies the approximation error. The theorem will establish that the excess risk introduced by the simpler subset and the Lipschitz lower bound is bounded by a multiplicative constant (independent of sample size n) times the exact-PTR risk. Consequently the minimax rate remains unchanged up to this constant factor. revision: yes
-
Referee: [§4.2] §4.2 (linear regression): the insensitive subset chosen for OLS must be shown to preserve the release probability scaling that yields the optimal rate; the current presentation leaves open whether the subset is sufficiently large that the test still triggers with the probability required for the minimax claim.
Authors: We will strengthen the argument in Section 4.2 by adding a proposition that lower-bounds the probability that the chosen insensitive subset triggers release. The bound will be shown to be of the same order as the probability for the exact insensitive set, thereby preserving the scaling that yields the claimed minimax rate for linear regression. revision: yes
Circularity Check
No significant circularity; derivation relies on external DP theory and explicit bounds
full rationale
The paper introduces ePTR by replacing the exact insensitive set and Hellinger distance with a simpler subset and a Lipschitz lower bound, then derives DP mechanisms and rate-optimality for three estimators via standard sensitivity analysis and concentration inequalities. No step reduces a claimed prediction or optimality result to a fitted quantity defined inside the paper, nor does any load-bearing premise rest on self-citation chains or imported uniqueness theorems. The central claims are supported by direct theoretical analysis of the proposed approximations rather than by construction or renaming of prior results.
Axiom & Free-Parameter Ledger
free parameters (1)
- Lipschitz constant for the distance bound
axioms (1)
- domain assumption: An insensitive set exists for the estimator under consideration
Reference graph
Works this paper leans on
-
[1]
A face is exposed for aol searcher no. 4417749
Michael Barbaro, Tom Zeller, and Saul Hansell. A face is exposed for aol searcher no. 4417749. New York Times, 9(2008):8, 2006
2006
-
[2]
Robust de-anonymization of large sparse datasets
Arvind Narayanan and Vitaly Shmatikov. Robust de-anonymization of large sparse datasets. In 2008 IEEE Symposium on Security and Privacy (sp 2008), pages 111–125. IEEE, 2008
2008
-
[3]
Differential privacy
Cynthia Dwork. Differential privacy. In International colloquium on automata, languages, and programming, pages 1–12. Springer, 2006
2006
-
[4]
The us census bureau adopts differential privacy
John M Abowd. The us census bureau adopts differential privacy. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, pages 2867–2867, 2018
2018
-
[5]
Rappor: Randomized aggregatable privacy-preserving ordinal response
Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, pages 1054–1067, 2014
2014
-
[6]
Deep learning with differential privacy
Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pages 308–318, 2016
2016
-
[7]
Gaussian differential privacy
Jinshuo Dong, Aaron Roth, and Weijie J Su. Gaussian differential privacy. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(1):3–37, 2022
2022
-
[8]
Learning differentially private recurrent language models
H Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang. Learning differentially private recurrent language models. In Proceedings of the 2018 International Conference on Learning Representations, 2018
2018
-
[9]
The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy
T Tony Cai, Yichen Wang, and Linjun Zhang. The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy. The Annals of Statistics, 49(5):2825–2850, 2021
2021
-
[10]
Minimax and adaptive transfer learning for nonparametric classification under distributed differential privacy constraints
Arnab Auddy, T Tony Cai, and Abhinav Chakraborty. Minimax and adaptive transfer learning for nonparametric classification under distributed differential privacy constraints. Journal of the Royal Statistical Society Series B: Statistical Methodology, page qkaf070, 2025
2025
-
[11]
Optimal differentially private pca and estimation for spiked covariance matrices
T Tony Cai, Dong Xia, and Mengyue Zha. Optimal differentially private pca and estimation for spiked covariance matrices. arXiv preprint arXiv:2401.03820, 2024
2024
-
[12]
Ars coniectandi
Jakob Bernoulli. Ars coniectandi. Impensis Thurnisiorum, fratrum, 1713
1713
-
[13]
The algorithmic foundations of differential privacy
Cynthia Dwork, Aaron Roth, et al. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9(3–4):211–407, 2014
2014
-
[14]
Differential privacy and robust statistics
Cynthia Dwork and Jing Lei. Differential privacy and robust statistics. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 371–380, 2009
2009
-
[15]
Propose, test, release: Differentially private estimation with high probability
Victor-Emmanuel Brunel and Marco Avella-Medina. Propose, test, release: Differentially private estimation with high probability. arXiv preprint arXiv:2002.08774, 2020
2020
-
[16]
Differential privacy and robust statistics in high dimensions
Xiyang Liu, Weihao Kong, and Sewoong Oh. Differential privacy and robust statistics in high dimensions. In Conference on Learning Theory, pages 1167–1246. PMLR, 2022
2022
-
[17]
Renyi differential privacy of propose-test-release and applications to private and robust machine learning
Jiachen T Wang, Saeed Mahloujifar, Shouda Wang, Ruoxi Jia, and Prateek Mittal. Renyi differential privacy of propose-test-release and applications to private and robust machine learning. Advances in Neural Information Processing Systems, 35:38719–38732, 2022
2022
-
[18]
Stochastic gradient descent with differentially private updates
Shuang Song, Kamalika Chaudhuri, and Anand D Sarwate. Stochastic gradient descent with differentially private updates. In 2013 IEEE global conference on signal and information processing, pages 245–248. IEEE, 2013
2013
-
[19]
What can we learn privately?
Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. What can we learn privately? SIAM Journal on Computing, 40(3):793–826, 2011
2011
-
[20]
Learning with differential privacy: Stability, learnability and the sufficiency and necessity of erm principle
Yu-Xiang Wang, Jing Lei, and Stephen E Fienberg. Learning with differential privacy: Stability, learnability and the sufficiency and necessity of erm principle. Journal of Machine Learning Research, 17(183):1–40, 2016
2016
-
[21]
Differentially private naive bayes classification
Jaideep Vaidya, Basit Shafiq, Anirban Basu, and Yuan Hong. Differentially private naive bayes classification. In 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), volume 1, pages 571–576. IEEE, 2013
2013
-
[22]
Differentially private naive bayes classifier using smooth sensitivity
Farzad Zafarani and Chris Clifton. Differentially private naive bayes classifier using smooth sensitivity. arXiv preprint arXiv:2003.13955, 2020
2020
-
[23]
Introduction to the non-asymptotic analysis of random matrices
Roman Vershynin. Introduction to the non-asymptotic analysis of random matrices. arXiv preprint arXiv:1011.3027, 2010
2010
-
[24]
Differentially private ordinary least squares
Or Sheffet. Differentially private ordinary least squares. In International Conference on Machine Learning, pages 3105–3114. PMLR, 2017
2017
-
[25]
Functional mechanism: Regression analysis under differential privacy
Jun Zhang, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, and Marianne Winslett. Functional mechanism: Regression analysis under differential privacy. Proceedings of the VLDB Endowment, 5(11), 2012
2012
-
[26]
All of nonparametric statistics
Larry Wasserman. All of nonparametric statistics. Springer, 2006
2006
-
[27]
Nonparametric estimators
Alexandre B Tsybakov. Nonparametric estimators. In Introduction to Nonparametric Estimation, pages 1–76. Springer, 2008
2008
-
[28]
Minimax estimation via wavelet shrinkage
David L Donoho and Iain M Johnstone. Minimax estimation via wavelet shrinkage. The Annals of Statistics, 26(3):879–921, 1998
1998
-
[29]
Optimal federated learning for nonparametric regression with heterogeneous distributed differential privacy constraints
T Tony Cai, Abhinav Chakraborty, and Lasse Vuursteen. Optimal federated learning for nonparametric regression with heterogeneous distributed differential privacy constraints. arXiv preprint arXiv:2406.06755, 2024
2024