Recognition: no theorem link
Limits of Personalizing Differential Privacy Budgets
Pith reviewed 2026-05-14 18:12 UTC · model grok-4.3
The pith
For mean estimation, a simple thresholding operator on privacy budgets captures nearly all the utility gains of full personalization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Personalized budgets come with major limitations, and for mean estimation the dominant factor is not full personalization but choosing the right effective privacy budget through a simple thresholding operator. Compared with this thresholding baseline, the gains from fully personalized mechanisms are limited to constant factors in mixed private and public datasets and in datasets with two levels of privacy requirements, with upper bounds established for arbitrary requirements.
What carries the argument
The thresholding operator, which fixes an effective uniform privacy budget by clipping each individual requirement at a common threshold.
If this is right
- In mixed public and private datasets, full personalization improves utility by at most a constant factor over thresholding.
- In datasets with two levels of privacy requirements, similar constant-factor bounds apply.
- For arbitrary privacy requirements, upper bounds limit the maximal gain from personalization.
- The utility is primarily determined by the choice of effective budget rather than per-user customization.
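The baseline can be sketched concretely. This is a minimal illustration, not the paper's exact mechanism: function names are ours, data are assumed to lie in [0, 1], and the weight rule w_j = min{ε^(j), τ}/s_τ follows the weighting that appears in the paper's appendix.

```python
import numpy as np

def threshold_weights(eps, tau):
    """Clip each personal budget at tau; weights w_j = min(eps_j, tau) / s_tau."""
    clipped = np.minimum(eps, tau)
    s_tau = clipped.sum()
    return clipped / s_tau, s_tau

def threshold_mean(x, eps, tau, rng):
    """Estimate the mean of data in [0, 1] under personalized budgets eps.

    Laplace noise at scale 1/s_tau makes the privacy loss of user j equal to
    w_j / (1/s_tau) = min(eps_j, tau) <= eps_j, so every personal budget holds.
    """
    w, s_tau = threshold_weights(np.asarray(eps, dtype=float), tau)
    return float(np.dot(w, np.asarray(x, dtype=float)) + rng.laplace(scale=1.0 / s_tau))
```

The single scalar τ is the only tuning knob: raising it shrinks the noise scale 1/s_τ but skews the weights, and the paper's claim is that tuning this one scalar already captures nearly all of the gain from full personalization.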
Where Pith is reading between the lines
- Practitioners may achieve most benefits of differential privacy with simpler uniform-budget mechanisms.
- This limitation might extend to other statistical queries beyond mean estimation.
- Future work could explore whether similar thresholding suffices in non-additive noise settings or different data distributions.
Load-bearing premise
The analysis assumes standard additive-noise mechanisms for mean estimation and specific distributions of privacy requirements such as mixed public-private or two-level cases.
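One way to see why the effective budget dominates: under the Laplace mechanism, the injected noise variance of the thresholded estimator is 2/s_τ² with s_τ = Σ_j min{ε^(j), τ}. A toy two-level sketch (the numbers are illustrative, not from the paper):

```python
import numpy as np

def noise_var(eps, tau):
    # injected Laplace noise variance of the thresholded estimator: 2 / s_tau^2
    s_tau = np.minimum(eps, tau).sum()
    return 2.0 / s_tau ** 2

# two-level requirements: 90 strict users (eps = 0.1), 10 lenient ones (eps = 10)
eps = np.array([0.1] * 90 + [10.0] * 10)
uniform_strict = noise_var(eps, 0.1)   # one budget at the strictest level: s_tau = 10
thresholded = noise_var(eps, 10.0)     # tau = 10: s_tau = 9 + 100 = 109
assert thresholded < uniform_strict    # raising tau sharply cuts the noise term
```

This counts only the injected noise: a larger τ also skews the weights and raises the sampling variance, and balancing the two is exactly the effective-budget choice the paper isolates.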
What would settle it
A counterexample where a fully personalized mechanism achieves super-constant-factor improvement in mean estimation utility over the thresholding baseline for standard additive noise would falsify the bounds.
Figures
read the original abstract
A key technical difficulty in differential privacy is selecting a privacy budget that satisfies privacy requirements while maximizing utility. A natural and well-studied workaround is to use personalized privacy budgets, which may differ across agents. In this paper, we show that personalized budgets come with major limitations and that for mean estimation, the dominant factor is not full personalization, but rather choosing the right effective privacy budget. This can be achieved through a simple thresholding operator that we describe. Compared with this thresholding baseline, the gains obtained by fully personalized mechanisms are limited. In particular, we precisely quantify the constant-factor improvement in settings with mixed private and public datasets and in private datasets with two levels of privacy requirements. We also establish upper bounds and identify regimes of maximal gain for arbitrary privacy requirements.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that personalized differential privacy budgets have major limitations for mean estimation. It shows that a simple thresholding operator on the effective privacy budget matches or nearly matches the utility of fully personalized mechanisms under additive noise (Gaussian/Laplace), with only constant-factor gains from full personalization. This is quantified for mixed public-private datasets and two-level privacy requirements, with upper bounds and regimes of maximal gain identified for arbitrary requirements.
Significance. If the bounds hold, the result indicates that choosing an effective privacy budget via thresholding is the dominant factor for utility in mean estimation, rather than full personalization. This could simplify DP deployments in practice while providing concrete constant-factor comparisons that guide when personalization is worthwhile.
major comments (2)
- [§4 (Constant-factor improvements for mixed and two-level cases)] The central claim for mean estimation rests on additive-noise mechanisms and specific (two-level/mixed) privacy-requirement distributions. For arbitrary distributions with heavy tails or correlations between privacy requirements and data values, the noise-scale calculation and resulting utility gap may exceed the reported constant factors (see the reduction to effective epsilon via thresholding).
- [§5 (Upper bounds for arbitrary requirements)] The analysis does not address whether thresholding remains near-optimal once the mechanism is allowed to use data-dependent noise or public-data-assisted estimators; this is a load-bearing assumption for the generality of the upper bounds.
minor comments (1)
- [§3 (Thresholding baseline)] Clarify the exact definition and implementation of the thresholding operator in the main text, including how it interacts with the privacy requirement distribution.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. Our paper focuses on additive-noise mechanisms for mean estimation and demonstrates that a simple thresholding operator on privacy budgets achieves performance within constant factors of fully personalized mechanisms. We address the major comments point by point below.
read point-by-point responses
- Referee: [§4 (Constant-factor improvements for mixed and two-level cases)] The central claim for mean estimation rests on additive-noise mechanisms and specific (two-level/mixed) privacy-requirement distributions. For arbitrary distributions with heavy tails or correlations between privacy requirements and data values, the noise-scale calculation and resulting utility gap may exceed the reported constant factors (see the reduction to effective epsilon via thresholding).
Authors: We agree that the constant-factor results in §4 are derived for the mixed public-private and two-level cases under additive noise. For fully arbitrary distributions, including heavy-tailed privacy requirements or correlations with data values, the gap could be larger than the constants we report. Our upper bounds in §5 provide a general characterization of the maximal gain from personalization, but we will add a clarifying paragraph in the revised manuscript noting that the explicit constant-factor comparisons are specific to the analyzed distributions while the thresholding reduction to an effective epsilon remains valid more broadly.
revision: partial
- Referee: [§5 (Upper bounds for arbitrary requirements)] The analysis does not address whether thresholding remains near-optimal once the mechanism is allowed to use data-dependent noise or public-data-assisted estimators; this is a load-bearing assumption for the generality of the upper bounds.
Authors: The upper bounds in §5 are established specifically for additive-noise mechanisms (Gaussian and Laplace), which is the standard setting for mean estimation under differential privacy. Data-dependent noise allocation or public-data-assisted estimators fall outside this model and would require a separate analysis; our contribution is to show that, within the additive-noise regime, thresholding on the effective privacy budget is near-optimal up to constants. We will explicitly state this scope limitation in the revised introduction and conclusion to avoid overgeneralization.
revision: partial
Circularity Check
No significant circularity; the derivation is self-contained, starting from standard DP definitions.
full rationale
The paper derives its central claims on the limits of personalized DP budgets for mean estimation directly from standard additive-noise mechanisms (Gaussian/Laplace) and explicit privacy-requirement distributions (mixed public-private or two-level). The thresholding operator is introduced as an explicit baseline construction, with constant-factor bounds and upper bounds obtained via direct analysis of noise scales and utility gaps; no step reduces a prediction to a fitted parameter by construction, invokes a self-citation as the sole load-bearing justification, or renames a known result. The derivation remains independent of the target results and relies on externally verifiable DP primitives.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: the standard definition of ε-differential privacy
- domain assumption: mean estimation as the central query, with additive-noise mechanisms
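Spelled out, the first axiom, together with the personalized variant the paper studies:

```latex
% \varepsilon-differential privacy (Dwork, McSherry, Nissim, Smith, 2006):
% a randomized mechanism M is \varepsilon-DP if for all neighboring datasets
% D, D' (differing in one record) and every measurable set S,
\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S].
% Personalized budgets replace the single \varepsilon by per-user budgets
% \varepsilon^{(j)}: the bound must hold with \varepsilon^{(j)} whenever
% D and D' differ in record j.
```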
Reference graph
Works this paper leans on
- [1] Krishna Acharya, Franziska Boenisch, Rakshit Naidu, and Juba Ziani. Personalized differential privacy for ridge regression under output perturbation. Naval Research Logistics (NRL), 73(4):525–537, 2026.
- [2] Mohamed Alaggan, Sébastien Gambs, and Anne-Marie Kermarrec. Heterogeneous differential privacy. arXiv preprint arXiv:1504.06998, 2015.
- [3] Anita Allen. Unpopular Privacy: What Must We Hide? OUP USA, New York, US, 2011.
- [4] Noga Alon, Raef Bassily, and Shay Moran. Limits of private learning with access to public data. Advances in Neural Information Processing Systems, 32, 2019.
- [5] Raef Bassily, Kate Donahue, Diptangshu Sen, Annuo Zhao, and Juba Ziani. Data sharing with endogenous choices over differential privacy levels. arXiv preprint arXiv:2602.09357, 2026.
- [6] Alex Bie, Gautam Kamath, and Vikrant Singhal. Private estimation with public data. Advances in Neural Information Processing Systems, 35:18653–18666, 2022.
- [7] Adam Block, Mark Bun, Rathin Desai, Abhishek Shetty, and Zhiwei S Wu. Oracle-efficient differentially private learning with public data. Advances in Neural Information Processing Systems, 37:113191–113233, 2024.
- [8] Franziska Boenisch, Christopher Mühl, Adam Dziedzic, Roy Rinberg, and Nicolas Papernot. Have it your way: Individualized privacy assignment for DP-SGD. Advances in Neural Information Processing Systems, 36:19073–19103, 2023.
- [9] Franziska Boenisch, Christopher Mühl, Roy Rinberg, Jannis Ihrig, and Adam Dziedzic. Individualized PATE: Differentially private machine learning with individual privacy guarantees. arXiv preprint arXiv:2202.10517, 2022.
- [10] Mark Bun and Thomas Steinke. Average-case averages: Private algorithms for smooth sensitivity and mean estimation. Advances in Neural Information Processing Systems, 32, 2019.
- [11] Syomantak Chaudhuri and Thomas A. Courtade. Mean estimation under heterogeneous privacy: Some privacy can be free. 2023 IEEE International Symposium on Information Theory (ISIT), pages 1639–1644, 2023.
- [12] Syomantak Chaudhuri and Thomas A. Courtade. Managing correlations in data and privacy demand. In Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, CCS '25, pages 2384–2398. ACM, November 2025.
- [13] Syomantak Chaudhuri, Konstantin Miagkov, and Thomas A. Courtade. Mean estimation under heterogeneous privacy demands. IEEE Transactions on Information Theory, 71(2):1362–1375, February 2025.
- [14] Rachel Cummings and David Durfee. Individual sensitivity preprocessing for data privacy. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 528–547. SIAM, 2020.
- [15] Rachel Cummings, Hadi Elzayn, Emmanouil Pountourakis, Vasilis Gkatzelis, and Juba Ziani. Optimal data acquisition with privacy-aware agents. In 2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 210–224. IEEE, 2023.
- [16] Rachel Cummings, Katrina Ligett, Aaron Roth, Zhiwei Steven Wu, and Juba Ziani. Accuracy for sale: Aggregating data with a variance constraint. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, pages 317–324, 2015.
- [17] Edwige Cyffers. Setting epsilon is not the issue in differential privacy. In Proceedings of the 39th International Conference on Neural Information Processing Systems, 2025.
- [18] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating Noise to Sensitivity in Private Data Analysis, pages 265–284. Springer Berlin Heidelberg, 2006.
- [19] Alireza Fallah, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. Optimal and differentially private data acquisition: Central and local mechanisms. Operations Research, 72(3):1105–1123, 2024.
- [20] Vitaly Feldman and Tijana Zrnic. Individual privacy accounting via a Rényi filter. In Advances in Neural Information Processing Systems, volume 33, 2020.
- [21] Nina Gerber, Paul Gerber, and Melanie Volkamer. Explaining the privacy paradox: A systematic review of literature investigating privacy attitude and behavior. Computers and Security, 77:226–261, August 2018.
- [22] Arpita Ghosh and Katrina Ligett. Privacy and coordination: Computing on databases with endogenous participation. In Proceedings of the Fourteenth ACM Conference on Electronic Commerce, pages 543–560, 2013.
- [23] Arpita Ghosh and Aaron Roth. Selling privacy at auction. In ACM Conference on Electronic Commerce, pages 199–208, 2011.
- [24] Jason D Hartline and Tim Roughgarden. Simple versus optimal mechanisms. In Proceedings of the 10th ACM Conference on Electronic Commerce, pages 225–234, 2009.
- [25] Zachary Jorgensen, Ting Yu, and Graham Cormode. Conservative or liberal? Personalized differential privacy. In IEEE International Conference on Data Engineering (ICDE), pages 1023–1034, 2015.
- [26] Andrew Lowy, Zeman Li, Tianjian Huang, and Meisam Razaviyayn. Optimal differentially private model training with public data. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org, 2024.
- [27] Helen Nissenbaum. Privacy as contextual integrity. Washington Law Review, 79, May 2004.
- [28] Kobbi Nissim, Salil Vadhan, and David Xiao. Redrawing the boundaries on purchasing data from privacy-sensitive individuals. In Proceedings of the 5th Conference on Innovations in Theoretical Computer Science, pages 411–422, 2014.
- [29] Krishna Pillutla, Jalaj Upadhyay, Christopher A. Choquette-Choo, Krishnamurthy Dj Dvijotham, Arun Ganesh, Monika Henzinger, Jonathan Katz, Ryan McKenna, H. B. McMahan, Keith Rush, Thomas Steinke, and Abhradeep Thakurta. Correlated noise mechanisms for differentially private learning. arXiv preprint arXiv:2506.08201, 2025.
- [30] Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta. How to DP-fy ML: A practical guide to machine learning with differential privacy. Journal of Artificial Intelligence Research, 77:1113–1201, 2023.
- [31] Enayat Ullah, Michael Menart, Raef Bassily, Cristóbal Guzmán, and Raman Arora. Public-data assisted private stochastic optimization: Power and limitations. In Advances in Neural Information Processing Systems, volume 37, pages 20383–20427, 2024.
- [32] Jun Wang and Zhi-Hua Zhou. Differentially private learning with small public data. Proceedings of the AAAI Conference on Artificial Intelligence, 34(04):6219–6226, April 2020.

Appendix A excerpt (Preliminaries, Proof of Claim 1): since the weights sum to 1, we must have 1 = Σ_{j=1}^n w_j = Σ_{j=1}^n η min{ε^(j), τ}, hence η = 1/s_τ. Therefore w_j = min{ε^(j), τ}/s_τ, and the result…