pith. machine review for the scientific record

arxiv: 2605.13503 · v1 · submitted 2026-05-13 · 💻 cs.CR · cs.LG

Recognition: no theorem link

Limits of Personalizing Differential Privacy Budgets

Authors on Pith: no claims yet

Pith reviewed 2026-05-14 18:12 UTC · model grok-4.3

classification 💻 cs.CR cs.LG
keywords differential privacy · personalized privacy budgets · mean estimation · thresholding operator · privacy-utility trade-off · limitations of personalization · constant-factor improvements

The pith

For mean estimation, a simple thresholding operator on privacy budgets captures nearly all the utility gains of full personalization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that personalized differential privacy budgets have significant limitations in practice. For the task of mean estimation, the main driver of utility is selecting the right overall privacy level rather than customizing it for every user. This selection can be accomplished with a basic thresholding operator that effectively ignores the strictest privacy demands. Full personalization provides only limited, constant-factor improvements over this baseline in settings with mixed public-private data or two-tier privacy requirements. The authors also derive upper bounds on the possible gains for arbitrary privacy preference distributions.

Core claim

Personalized budgets come with major limitations, and for mean estimation the dominant factor is not full personalization but choosing the right effective privacy budget through a simple thresholding operator. Compared with this thresholding baseline, the gains from fully personalized mechanisms are limited to constant factors in mixed private and public datasets and in datasets with two levels of privacy requirements, with upper bounds established for arbitrary requirements.

What carries the argument

The thresholding operator that determines an effective uniform privacy budget by filtering the most demanding individual requirements.
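A minimal sketch of the operator, assuming the clipped-weight form that appears in the paper's Claim 1 (w_j = min{ε^(j), τ} / s_τ, with s_τ the sum of the clipped budgets); the threshold and budget values below are illustrative, not taken from the paper.

```python
import numpy as np

def threshold_weights(eps, tau):
    """Clip each personal budget at tau, then normalize.

    Implements w_j = min(eps_j, tau) / s_tau with
    s_tau = sum_j min(eps_j, tau), the form in the paper's Claim 1.
    """
    clipped = np.minimum(np.asarray(eps, dtype=float), tau)
    return clipped / clipped.sum()

# Illustrative budgets: two strict users, one moderate, one nearly public.
eps = [0.1, 0.5, 1.0, 10.0]
w = threshold_weights(eps, tau=1.0)
# The strictest demand (0.1) no longer dictates the whole mechanism,
# and the 10.0 budget is capped at tau, so it gets the same weight as 1.0.
```

Filtering here means capping, not discarding: strict users still contribute, but very loose budgets cannot inflate their weight beyond the threshold.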

If this is right

  • In mixed public and private datasets, full personalization improves utility by at most a constant factor over thresholding.
  • In datasets with two levels of privacy requirements, similar constant-factor bounds apply.
  • For arbitrary privacy requirements, upper bounds limit the maximal gain from personalization.
  • The utility is primarily determined by the choice of effective budget rather than per-user customization.
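Back-of-the-envelope arithmetic for the mixed public-private case makes the first and last points concrete (illustrative only: the population split, the threshold τ, and the per-user effective budget ε_eff = s_τ / n are assumptions for this sketch, not the paper's exact estimator).

```python
import numpy as np

n = 1_000
# Mixed population: half strict users, half nearly-public users.
eps = np.concatenate([np.full(n // 2, 0.1), np.full(n // 2, 10.0)])

eps_min = eps.min()                  # strictest individual demand
tau = 1.0
s_tau = np.minimum(eps, tau).sum()   # total budget mass after clipping
eps_eff = s_tau / n                  # effective per-user budget

scale_strict = 1.0 / (n * eps_min)   # Laplace scale if everyone is held
                                     # to the strictest demand
scale_thresh = 1.0 / (n * eps_eff)   # Laplace scale at the effective budget
ratio = scale_strict / scale_thresh  # 5.5x less noise from choosing the
                                     # budget level, with no per-user
                                     # customization at all
```

The gap between the two scales comes entirely from picking the effective budget; personalization can only improve on the thresholded scale by a constant factor.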

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners may achieve most benefits of differential privacy with simpler uniform-budget mechanisms.
  • This limitation might extend to other statistical queries beyond mean estimation.
  • Future work could explore whether similar thresholding suffices in non-additive noise settings or different data distributions.

Load-bearing premise

The analysis assumes standard additive-noise mechanisms for mean estimation and specific distributions of privacy requirements such as mixed public-private or two-level cases.
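The additive-noise baseline assumed throughout can be sketched as a standard Laplace mean estimator run at a single effective budget (a generic textbook mechanism, not the paper's personalized construction; the data bounds are an assumption of this sketch).

```python
import numpy as np

rng = np.random.default_rng(0)

def private_mean(x, eps, lo=0.0, hi=1.0):
    """eps-DP mean of n values assumed to lie in [lo, hi].

    The mean of n bounded values has sensitivity (hi - lo) / n, so
    Laplace noise with scale (hi - lo) / (n * eps) suffices.
    """
    x = np.clip(np.asarray(x, dtype=float), lo, hi)
    n = len(x)
    return x.mean() + rng.laplace(scale=(hi - lo) / (n * eps))
```

At a generous budget the noise is negligible; at a strict one it dominates the estimate, which is why the choice of effective budget carries the utility.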

What would settle it

A counterexample where a fully personalized mechanism achieves super-constant-factor improvement in mean estimation utility over the thresholding baseline for standard additive noise would falsify the bounds.

Figures

Figures reproduced from arXiv: 2605.13503 by Edwige Cyffers, Juba Ziani.

Figure 1. Ratio between the unique-threshold estimator and the best affine operator for a combination […]
Figure 2. Ratio between the unique-threshold estimator and the best affine operator for two finite […]
read the original abstract

A key technical difficulty in differential privacy is selecting a privacy budget that satisfies privacy requirements while maximizing utility. A natural and well-studied workaround is to use personalized privacy budgets, which may differ across agents. In this paper, we show that personalized budgets come with major limitations and that for mean estimation, the dominant factor is not full personalization, but rather choosing the right effective privacy budget. This can be achieved through a simple thresholding operator that we describe. Compared with this thresholding baseline, the gains obtained by fully personalized mechanisms are limited. In particular, we precisely quantify the constant-factor improvement in settings with mixed private and public datasets and in private datasets with two levels of privacy requirements. We also establish upper bounds and identify regimes of maximal gain for arbitrary privacy requirements.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that personalized differential privacy budgets have major limitations for mean estimation. It shows that a simple thresholding operator on the effective privacy budget matches or nearly matches the utility of fully personalized mechanisms under additive noise (Gaussian/Laplace), with only constant-factor gains from full personalization. This is quantified for mixed public-private datasets and two-level privacy requirements, with upper bounds and regimes of maximal gain identified for arbitrary requirements.

Significance. If the bounds hold, the result indicates that choosing an effective privacy budget via thresholding is the dominant factor for utility in mean estimation, rather than full personalization. This could simplify DP deployments in practice while providing concrete constant-factor comparisons that guide when personalization is worthwhile.

major comments (2)
  1. [§4 (Constant-factor improvements for mixed and two-level cases)] The central claim for mean estimation rests on additive-noise mechanisms and specific (two-level/mixed) privacy-requirement distributions. For arbitrary distributions with heavy tails or correlations between privacy requirements and data values, the noise-scale calculation and resulting utility gap may exceed the reported constant factors (see the reduction to effective epsilon via thresholding).
  2. [§5 (Upper bounds for arbitrary requirements)] The analysis does not address whether thresholding remains near-optimal once the mechanism is allowed to use data-dependent noise or public-data-assisted estimators; this is a load-bearing assumption for the generality of the upper bounds.
minor comments (1)
  1. [§3 (Thresholding baseline)] Clarify the exact definition and implementation of the thresholding operator in the main text, including how it interacts with the privacy requirement distribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. Our paper focuses on additive-noise mechanisms for mean estimation and demonstrates that a simple thresholding operator on privacy budgets achieves performance within constant factors of fully personalized mechanisms. We address the major comments point by point below.

read point-by-point responses
  1. Referee: [§4 (Constant-factor improvements for mixed and two-level cases)] The central claim for mean estimation rests on additive-noise mechanisms and specific (two-level/mixed) privacy-requirement distributions. For arbitrary distributions with heavy tails or correlations between privacy requirements and data values, the noise-scale calculation and resulting utility gap may exceed the reported constant factors (see the reduction to effective epsilon via thresholding).

    Authors: We agree that the constant-factor results in §4 are derived for the mixed public-private and two-level cases under additive noise. For fully arbitrary distributions, including heavy-tailed privacy requirements or correlations with data values, the gap could be larger than the constants we report. Our upper bounds in §5 provide a general characterization of the maximal gain from personalization, but we will add a clarifying paragraph in the revised manuscript noting that the explicit constant-factor comparisons are specific to the analyzed distributions while the thresholding reduction to an effective epsilon remains valid more broadly. revision: partial

  2. Referee: [§5 (Upper bounds for arbitrary requirements)] The analysis does not address whether thresholding remains near-optimal once the mechanism is allowed to use data-dependent noise or public-data-assisted estimators; this is a load-bearing assumption for the generality of the upper bounds.

    Authors: The upper bounds in §5 are established specifically for additive-noise mechanisms (Gaussian and Laplace), which is the standard setting for mean estimation under differential privacy. Data-dependent noise allocation or public-data-assisted estimators fall outside this model and would require a separate analysis; our contribution is to show that, within the additive-noise regime, thresholding on the effective privacy budget is near-optimal up to constants. We will explicitly state this scope limitation in the revised introduction and conclusion to avoid overgeneralization. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained from standard DP definitions

full rationale

The paper derives its central claims on the limits of personalized DP budgets for mean estimation directly from standard additive-noise mechanisms (Gaussian/Laplace) and explicit privacy-requirement distributions (mixed public-private or two-level). The thresholding operator is introduced as an explicit baseline construction, with constant-factor bounds and upper bounds obtained via direct analysis of noise scales and utility gaps; no step reduces a prediction to a fitted parameter by construction, invokes a self-citation as the sole load-bearing justification, or renames a known result. The derivation remains independent of the target results and relies on externally verifiable DP primitives.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard differential privacy definitions and the mean estimation query without introducing new entities or free parameters; the analysis uses established noise-addition mechanisms.

axioms (2)
  • standard math Standard definition of epsilon-differential privacy
    Invoked as the foundation for all privacy guarantees and utility comparisons.
  • domain assumption Mean estimation as the central query with additive noise mechanisms
    The entire analysis and thresholding operator are developed specifically for this setting.

pith-pipeline@v0.9.0 · 5418 in / 1273 out tokens · 50667 ms · 2026-05-14T18:12:18.227737+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

32 extracted references · 4 canonical work pages · 1 internal anchor

  1. [1] Krishna Acharya, Franziska Boenisch, Rakshit Naidu, and Juba Ziani. Personalized differential privacy for ridge regression under output perturbation. Naval Research Logistics (NRL), 73(4):525–537, 2026.
  2. [2] Mohamed Alaggan, Sébastien Gambs, and Anne-Marie Kermarrec. Heterogeneous differential privacy. arXiv preprint arXiv:1504.06998, 2015.
  3. [3] Anita Allen. Unpopular Privacy: What Must We Hide? OUP USA, New York, US, 2011.
  4. [4] Noga Alon, Raef Bassily, and Shay Moran. Limits of private learning with access to public data. Advances in Neural Information Processing Systems, 32, 2019.
  5. [5] Raef Bassily, Kate Donahue, Diptangshu Sen, Annuo Zhao, and Juba Ziani. Data sharing with endogenous choices over differential privacy levels. arXiv preprint arXiv:2602.09357, 2026.
  6. [6] Alex Bie, Gautam Kamath, and Vikrant Singhal. Private estimation with public data. Advances in Neural Information Processing Systems, 35:18653–18666, 2022.
  7. [7] Adam Block, Mark Bun, Rathin Desai, Abhishek Shetty, and Zhiwei S. Wu. Oracle-efficient differentially private learning with public data. Advances in Neural Information Processing Systems, 37:113191–113233, 2024.
  8. [8] Franziska Boenisch, Christopher Mühl, Adam Dziedzic, Roy Rinberg, and Nicolas Papernot. Have it your way: Individualized privacy assignment for DP-SGD. Advances in Neural Information Processing Systems, 36:19073–19103, 2023.
  9. [9] Franziska Boenisch, Christopher Mühl, Roy Rinberg, Jannis Ihrig, and Adam Dziedzic. Individualized PATE: Differentially private machine learning with individual privacy guarantees. arXiv preprint arXiv:2202.10517, 2022.
  10. [10] Mark Bun and Thomas Steinke. Average-case averages: Private algorithms for smooth sensitivity and mean estimation. Advances in Neural Information Processing Systems, 32, 2019.
  11. [11] Syomantak Chaudhuri and Thomas A. Courtade. Mean estimation under heterogeneous privacy: Some privacy can be free. 2023 IEEE International Symposium on Information Theory (ISIT), pages 1639–1644, 2023.
  12. [12] Syomantak Chaudhuri and Thomas A. Courtade. Managing correlations in data and privacy demand. In Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, CCS '25, pages 2384–2398. ACM, November 2025.
  13. [13] Syomantak Chaudhuri, Konstantin Miagkov, and Thomas A. Courtade. Mean estimation under heterogeneous privacy demands. IEEE Transactions on Information Theory, 71(2):1362–1375, February 2025.
  14. [14] Rachel Cummings and David Durfee. Individual sensitivity preprocessing for data privacy. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 528–547. SIAM, 2020.
  15. [15] Rachel Cummings, Hadi Elzayn, Emmanouil Pountourakis, Vasilis Gkatzelis, and Juba Ziani. Optimal data acquisition with privacy-aware agents. In 2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 210–224. IEEE, 2023.
  16. [16] Rachel Cummings, Katrina Ligett, Aaron Roth, Zhiwei Steven Wu, and Juba Ziani. Accuracy for sale: Aggregating data with a variance constraint. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, pages 317–324, 2015.
  17. [17] Edwige Cyffers. Setting epsilon is not the issue in differential privacy. In Proceedings of the 39th International Conference on Neural Information Processing Systems, 2025.
  18. [18] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating Noise to Sensitivity in Private Data Analysis, pages 265–284. Springer Berlin Heidelberg, 2006.
  19. [19] Alireza Fallah, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. Optimal and differentially private data acquisition: Central and local mechanisms. Operations Research, 72(3):1105–1123, 2024.
  20. [20] Vitaly Feldman and Tijana Zrnic. Individual privacy accounting via a Rényi filter. In Advances in Neural Information Processing Systems, volume 33, 2020.
  21. [21] Nina Gerber, Paul Gerber, and Melanie Volkamer. Explaining the privacy paradox: A systematic review of literature investigating privacy attitude and behavior. Computers and Security, 77:226–261, August 2018.
  22. [22] Arpita Ghosh and Katrina Ligett. Privacy and coordination: Computing on databases with endogenous participation. In Proceedings of the Fourteenth ACM Conference on Electronic Commerce, pages 543–560, 2013.
  23. [23] Arpita Ghosh and Aaron Roth. Selling privacy at auction. In ACM Conference on Electronic Commerce, pages 199–208, 2011.
  24. [24] Jason D. Hartline and Tim Roughgarden. Simple versus optimal mechanisms. In Proceedings of the 10th ACM Conference on Electronic Commerce, pages 225–234, 2009.
  25. [25] Zachary Jorgensen, Ting Yu, and Graham Cormode. Conservative or liberal? Personalized differential privacy. In IEEE International Conference on Data Engineering (ICDE), pages 1023–1034, 2015.
  26. [26] Andrew Lowy, Zeman Li, Tianjian Huang, and Meisam Razaviyayn. Optimal differentially private model training with public data. In Proceedings of the 41st International Conference on Machine Learning, ICML '24. JMLR.org, 2024.
  27. [27] Helen Nissenbaum. Privacy as contextual integrity. Washington Law Review, 79, May 2004.
  28. [28] Kobbi Nissim, Salil Vadhan, and David Xiao. Redrawing the boundaries on purchasing data from privacy-sensitive individuals. In Proceedings of the 5th Conference on Innovations in Theoretical Computer Science, pages 411–422, 2014.
  29. [29] Krishna Pillutla, Jalaj Upadhyay, Christopher A. Choquette-Choo, Krishnamurthy Dj Dvijotham, Arun Ganesh, Monika Henzinger, Jonathan Katz, Ryan McKenna, H. B. McMahan, Keith Rush, Thomas Steinke, and Abhradeep Thakurta. Correlated noise mechanisms for differentially private learning. arXiv preprint arXiv:2506.08201, 2025.
  30. [30] Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta. How to DP-fy ML: A practical guide to machine learning with differential privacy. Journal of Artificial Intelligence Research, 77:1113–1201, 2023.
  31. [31] Enayat Ullah, Michael Menart, Raef Bassily, Cristóbal Guzmán, and Raman Arora. Public-data assisted private stochastic optimization: Power and limitations. In Advances in Neural Information Processing Systems, volume 37, pages 20383–20427, 2024.
  32. [32] Jun Wang and Zhi-Hua Zhou. Differentially private learning with small public data. Proceedings of the AAAI Conference on Artificial Intelligence, 34(04):6219–6226, April 2020.