pith. machine review for the scientific record. sign in

arxiv: 2605.06505 · v1 · submitted 2026-05-07 · 💻 cs.LG · cs.AI· cs.CR

Recognition: unknown

PACZero: PAC-Private Fine-Tuning of Language Models via Sign Quantization

Authors on Pith no claims yet

Pith reviewed 2026-05-08 12:28 UTC · model grok-4.3

classification 💻 cs.LG cs.AIcs.CR
keywords PAC privacyzeroth-order optimizationsign quantizationlanguage model fine-tuningmutual informationmembership inference
0
0 comments X

The pith

PACZero achieves usable fine-tuning performance for large language models at zero mutual information privacy by using sign quantization to create unanimous gradient updates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops PACZero mechanisms for private fine-tuning of language models with zeroth-order methods. It establishes that quantizing aggregated gradients to their signs produces many steps where all possible secret subsets agree on the update, allowing the release to carry no information about the secret data. A reader would care because this zero-information regime protects against membership inference attacks at the level of random guessing, something standard differential privacy cannot achieve without destroying utility. The resulting models maintain high accuracy on sentiment and question-answering tasks with models up to 6.7 billion parameters.

Core claim

PACZero uses sign quantization on subset-aggregated zeroth-order gradients to generate frequent unanimity steps at which the released sign is independent of which subset is the secret, enabling PACZero-ZPL to achieve I(S^*; Y_{1:T})=0. This yields 88.99% accuracy on SST-2 with OPT-1.3B full fine-tuning, close to the 91.1% non-private baseline, and competitive results on SQuAD while no prior private method works in the high-privacy regime.

What carries the argument

Sign quantization applied to zeroth-order gradients aggregated over candidate subsets, which induces unanimous update directions and zero conditional mutual information on those steps.

If this is right

  • Delivers accuracy within 2.1 percentage points of non-private zeroth-order fine-tuning at I=0.
  • Provides the first usable utility for language model fine-tuning under privacy stronger than differential privacy at epsilon less than 1.
  • Applies effectively to both parameter-efficient LoRA and full-parameter fine-tuning tracks.
  • Maintains nontrivial performance on SQuAD across model sizes at zero information leakage.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The unanimity mechanism could reduce privacy costs in other machine learning tasks where data subsets produce aligned gradient signs.
  • Future work might explore adaptive subset sizes to maximize the fraction of unanimous steps.
  • This privacy approach might combine with other techniques like differential privacy for even stronger guarantees when needed.

Load-bearing premise

The sign quantization step ensures that enough gradient updates are identical across all possible secret subsets so that the overall release reveals nothing about the secret.

What would settle it

Running the method on a dataset where subset gradients frequently disagree in sign, causing the coin-flip releases to reduce accuracy far below the reported levels or allowing an adversary to infer membership with probability above the prior.

Figures

Figures reproduced from arXiv: 2605.06505 by Marten van Dijk, Murat Bilgehan Ertan, Phuong Ha Nguyen, Srinivas Devadas, Xiaochen Zhu.

Figure 1
Figure 1. Figure 1: The PACZERO per-step mechanism. Per-sample ZO scalars are aggregated over M = 128 random subsets, sign-quantized to sm ∈ {−1, +1}, and released as a single bit identifying the sign of the secret subset. On unanimity (q + t ∈ {0, 1}) the released bit is constant on supp pt and contributes zero conditional MI. On disagreement, PACZERO-MI releasessign(sj ∗+N (0, σ2 t )) with σt calibrated to a per-step MI bud… view at source ↗
read the original abstract

We introduce PACZero, a family of PAC-private zeroth-order mechanisms for fine-tuning large language models that delivers usable utility at $I(S^*; Y_{1:T})=0$. This privacy regime bounds the membership-inference attack (MIA) posterior success rate at the prior, an MIA-resistance level the DP framework matches only at $\varepsilon=0$ and infinite noise. All DP-ZO comparisons below are matched at the MIA posterior level. The key insight is that PAC Privacy charges mutual information only when the release depends on which candidate subset is the secret. Sign-quantizing subset-aggregated zeroth-order gradients creates frequent unanimity, steps at which every candidate subset agrees on the update direction; at these steps the released sign costs zero conditional mutual information. We propose two variants that span the privacy-utility trade-off: PACZero-MI (budgeted MI via exact calibration on the binary release) and PACZero-ZPL ($I=0$ via a uniform coin flip on disagreement steps). We evaluate on SST-2 and SQuAD with OPT-1.3B and OPT-6.7B in both LoRA and full-parameter tracks. On SST-2 OPT-1.3B full fine-tuning at $I=0$, PACZero-ZPL reaches ${88.99\pm0.91}$, within $2.1$pp of the non-private MeZO baseline ($91.1$ FT). No prior method produces usable utility in the high-privacy regime $\varepsilon<1$, and PACZero-ZPL obtains competitive SST-2 accuracy and nontrivial SQuAD F1 across OPT-1.3B and OPT-6.7B at $I=0$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces PACZero, a family of PAC-private zeroth-order mechanisms for fine-tuning large language models. It claims that sign-quantizing subset-aggregated ZO gradients produces frequent unanimity steps where every candidate subset agrees on the update direction, allowing the released sign to incur zero conditional mutual information; on disagreements a data-independent coin flip is used to enforce I(S^*; Y_{1:T})=0. Two variants are proposed (PACZero-MI with budgeted MI and PACZero-ZPL with strict I=0), and empirical results are reported on SST-2 and SQuAD using OPT-1.3B and OPT-6.7B models in both LoRA and full fine-tuning, with PACZero-ZPL achieving 88.99±0.91 accuracy on SST-2 full fine-tuning at I=0 (within 2.1pp of the non-private MeZO baseline).

Significance. If the zero-MI claim and the underlying unanimity mechanism are rigorously validated, the result would be significant: it offers a route to membership-inference resistance at the prior level (matching DP only at ε=0 with infinite noise) while preserving usable utility for LLM fine-tuning, a regime where prior DP-ZO methods reportedly fail. The approach also supplies concrete accuracy numbers and a parameter-free privacy guarantee tied to an external MI definition.

major comments (3)
  1. [Experiments (SST-2 and SQuAD results)] The central utility claim at I=0 for PACZero-ZPL rests on the assumption that sign-quantized, subset-aggregated ZO gradients produce sufficiently frequent unanimity steps; on disagreement steps a coin flip is released. No measurement or bound on the unanimity frequency (or the fraction of steps using random flips) is reported in the experimental results, leaving open whether the observed 88.99 accuracy arises from the claimed mechanism or from task-specific factors. This is load-bearing for the privacy-utility trade-off.
  2. [Theoretical Analysis / Privacy Definition] The abstract states that PACZero-ZPL achieves I(S^*; Y_{1:T})=0 via coin flips on disagreements, yet the manuscript provides no explicit derivation or proof sketch showing that the conditional mutual information is exactly zero under the stated release rule. The MI bound is therefore not shown to reduce to the claimed value from the sign-quantization construction.
  3. [Experimental Protocol] Reported standard deviations (e.g., ±0.91 on SST-2) lack justification for the number of runs, seed selection, or error-bar protocol; without this, it is impossible to assess whether the 2.1pp gap to the MeZO baseline is statistically meaningful.
minor comments (2)
  1. [Method] Notation for the released sign Y_t and the candidate subsets should be introduced with explicit definitions before the unanimity argument is used.
  2. [Abstract] The comparison table or figure that matches all DP-ZO baselines at the same MIA posterior level should be referenced in the abstract for clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the referee's insightful comments. We address each major point below and will revise the manuscript to incorporate clarifications and additional details as needed.

read point-by-point responses
  1. Referee: [Experiments (SST-2 and SQuAD results)] The central utility claim at I=0 for PACZero-ZPL rests on the assumption that sign-quantized, subset-aggregated ZO gradients produce sufficiently frequent unanimity steps; on disagreement steps a coin flip is released. No measurement or bound on the unanimity frequency (or the fraction of steps using random flips) is reported in the experimental results, leaving open whether the observed 88.99 accuracy arises from the claimed mechanism or from task-specific factors. This is load-bearing for the privacy-utility trade-off.

    Authors: We agree that quantifying the unanimity frequency is essential to substantiate the privacy-utility claims. In the revised manuscript, we will report the average and per-task fraction of unanimity steps observed during training for the SST-2 and SQuAD experiments. This additional data will clarify the contribution of the unanimity mechanism to the achieved accuracy. revision: yes

  2. Referee: [Theoretical Analysis / Privacy Definition] The abstract states that PACZero-ZPL achieves I(S^*; Y_{1:T})=0 via coin flips on disagreements, yet the manuscript provides no explicit derivation or proof sketch showing that the conditional mutual information is exactly zero under the stated release rule. The MI bound is therefore not shown to reduce to the claimed value from the sign-quantization construction.

    Authors: We thank the referee for highlighting this. The zero MI follows directly from the release mechanism: unanimity steps release a sign that is identical across all subsets (hence independent of S^*), while disagreement steps release an independent coin flip (also independent of S^*). We will include a concise proof sketch in the appendix of the revised version to formally derive that I(S^*; Y_{1:T}) = 0. revision: yes

  3. Referee: [Experimental Protocol] Reported standard deviations (e.g., ±0.91 on SST-2) lack justification for the number of runs, seed selection, or error-bar protocol; without this, it is impossible to assess whether the 2.1pp gap to the MeZO baseline is statistically meaningful.

    Authors: The standard deviations are based on 5 independent runs using distinct random seeds for model initialization and data ordering. We will revise the experimental details section to explicitly document the number of runs, seed selection, and how error bars are computed, enabling readers to evaluate the statistical significance of the results. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper's core mechanism defines PACZero-ZPL to release the sign of subset-aggregated ZO gradients on unanimous steps and a data-independent coin flip on disagreements, directly enforcing I(S*;Y)=0 by the definition of conditional mutual information. This construction does not reduce to a fitted parameter renamed as prediction, nor does it rely on a self-citation chain for its uniqueness or load-bearing privacy bound. The observation of 'frequent unanimity' is presented as an empirical property enabling utility rather than a self-definitional assumption, and the abstract ties the zero-MI claim to the external definition of mutual information without importing ansatzes or renaming known results via citation. The derivation remains self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review; the central claim rests on the PAC privacy definition and the assumption that unanimity steps occur frequently enough to dominate the information cost.

axioms (1)
  • domain assumption PAC privacy bounds membership-inference posterior success rate exactly at the prior when I(S*;Y)=0
    Stated directly in the abstract as the target privacy regime.

pith-pipeline@v0.9.0 · 5634 in / 1133 out tokens · 37418 ms · 2026-05-08T12:28:28.804139+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

51 extracted references · 26 canonical work pages · 2 internal anchors

  1. [1]

    Goodfellow, H

    Martín Abadi, Andy Chu, Ian J. Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Edgar R. Weippl, Stefan Katzenbeisser, Christopher Kruegel, Andrew C. Myers, and Shai Halevi, editors,Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, Oc...

  2. [2]

    Differential privacy has disparate impact on model accuracy

    Eugene Bagdasaryan, Omid Poursaeed, and Vitaly Shmatikov. Differential privacy has disparate impact on model accuracy. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Flo- rence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors,Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 201...

  3. [3]

    Bal, Dick H

    Henri E. Bal, Dick H. J. Epema, Cees de Laat, Rob van Nieuwpoort, John W. Romein, Frank J. Seinstra, Cees Snoek, and Harry A. G. Wijshoff. A medium-scale distributed system for computer science research: Infrastructure for the long term.Computer, 49(5):54–63, 2016. doi: 10.1109/MC.2016.127. URLhttps://doi.org/10.1109/MC.2016.127

  4. [4]

    JAX- Privacy: Algorithms for privacy-preserving machine learning in JAX, 2025

    Borja Balle, Leonard Berrada, Zachary Charles, Christopher A Choquette-Choo, Soham De, Vadym Doroshenko, Dj Dvijotham, Andrew Galen, Arun Ganesh, Sahra Ghalebikesabi, Jamie Hayes, Peter Kairouz, Ryan McKenna, Brendan McMahan, Aneesh Pappu, Natalia Ponomareva, Mikhail Pravilov, Keith Rush, Samuel L Smith, and Robert Stanforth. JAX- Privacy: Algorithms for ...

  5. [5]

    Unlocking the power of differentially private zeroth-order optimization for fine-tuning LLMs

    Ergute Bao, Yangfan Jiang, Fei Wei, Xiaokui Xiao, Zitao Li, Yaliang Li, and Bolin Ding. Unlocking the power of differentially private zeroth-order optimization for fine-tuning LLMs. In Lujo Bauer and Giancarlo Pellegrino, editors,34th USENIX Security Symposium, USENIX Security 2025, Seattle, WA, USA, August 13-15, 2025, pages 1569–1588. USENIX Association...

  6. [6]

    SIMD-PAC-DB: Pretty performant pac privacy

    Ilaria Battiston, Dandan Yuan, Xiaochen Zhu, and Peter Boncz. SIMD-PAC-DB: Pretty performant pac privacy. 2026. URLhttps://arxiv.org/abs/2603.15023. 10

  7. [7]

    SIGNSGD: compressed optimisation for non-convex problems

    Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, and Animashree Anandkumar. SIGNSGD: compressed optimisation for non-convex problems. In Jennifer G. Dy and Andreas Krause, editors,Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, Proceedings of Machine Learning R...

  8. [9]

    URLhttps://arxiv.org/abs/2105.07985

  9. [10]

    Differentially private optimization on large model at small cost

    Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, and George Karypis. Differentially private optimization on large model at small cost. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors,International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, Proceedings of Machine ...

  10. [11]

    Automatic clipping: Dif- ferentially private deep learning made easier and stronger

    Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, and George Karypis. Automatic clipping: Dif- ferentially private deep learning made easier and stronger. In Alice Oh, Tristan Nau- mann, Amir Globerson, Kate Saenko, Moritz Hardt, and Sergey Levine, editors,Ad- vances in Neural Information Processing Systems 36: Annual Conference on Neural In- formation Processing Syste...

  11. [12]

    Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, and Colin Raffel

    Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-V oss, Kather- ine Lee, Adam Roberts, Tom B. Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, and Colin Raffel. Extracting training data from large language models. In Michael D. Bailey and Rachel Greenstadt, editors,30th USENIX Security Symposium, USENIX Security 2021, Augus...

  12. [13]

    Model Stealing Attacks Against Inductive Graph Neural Networks

    Nicholas Carlini, Steve Chien, Milad Nasr, Shuang Song, Andreas Terzis, and Florian Tramèr. Membership inference attacks from first principles. In43rd IEEE Symposium on Security and Privacy, SP 2022, San Francisco, CA, USA, May 22-26, 2022, pages 1897–1914. IEEE, 2022. doi: 10.1109/SP46214.2022.9833649. URL https://doi.org/10.1109/SP46214.2022. 9833649

  13. [14]

    arXiv preprint arXiv:2204.13650 , year=

    Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, and Borja Balle. Unlocking high-accuracy differentially private image classification through scale, 2022. URL https: //arxiv.org/abs/2204.13650

  14. [16]

    Individual privacy accounting via a rényi filter

    Vitaly Feldman and Tijana Zrnic. Individual privacy accounting via a rényi filter. In Marc’Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wort- man Vaughan, editors,Advances in Neural Information Processing Systems 34: Annual Confer- ence on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, vir-...

  15. [17]

    Exploring the limits of differentially private deep learning with group-wise clipping

    Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, and Jiang Bian. Exploring the limits of differentially private deep learning with group-wise clipping. InThe Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. URL https://ope...

  16. [18]

    Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen

    Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. InThe Tenth 11 International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29,

  17. [19]

    URLhttps://openreview.net/forum?id=nZeVKeeFYf9

    OpenReview.net, 2022. URLhttps://openreview.net/forum?id=nZeVKeeFYf9

  18. [20]

    The composition theorem for differential privacy

    Peter Kairouz, Sewoong Oh, and Pramod Viswanath. The composition theorem for differential privacy. In Francis R. Bach and David M. Blei, editors,Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, volume 37 of JMLR Workshop and Conference Proceedings, pages 1376–1385. JMLR.org, 2015. URL http://p...

  19. [21]

    2014 , issue_date =

    Daniel Kifer and Ashwin Machanavajjhala. Pufferfish: A framework for mathematical privacy definitions.ACM Trans. Database Syst., 39(1):3:1–3:36, 2014. doi: 10.1145/2514689. URL https://doi.org/10.1145/2514689

  20. [22]

    Large language models can be strong differentially private learners

    Xuechen Li, Florian Tramèr, Percy Liang, and Tatsunori Hashimoto. Large language models can be strong differentially private learners. InThe Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022. URL https://openreview.net/forum?id=bVuP3ltATMz

  21. [24]

    signsgd via zeroth-order oracle

    Sijia Liu, Pin-Yu Chen, Xiangyi Chen, and Mingyi Hong. signsgd via zeroth-order oracle. In7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. URLhttps://openreview.net/forum?id= BJe-DsC5Fm

  22. [25]

    ACM Trans

    Xiao-Yang Liu, Rongyi Zhu, Daochen Zha, Jiechao Gao, Shan Zhong, Matt White, and Meikang Qiu. Differentially private low-rank adaptation of large language model using federated learning.ACM Trans. Manag. Inf. Syst., 16(2):1–24, 2025. doi: 10.1145/3682068. URL https://doi.org/10.1145/3682068

  23. [26]

    Lee, Danqi Chen, and Sanjeev Arora

    Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D. Lee, Danqi Chen, and Sanjeev Arora. Fine-tuning language models with just forward passes. In Alice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, and Sergey Levine, edi- tors,Advances in Neural Information Processing Systems 36: Annual Conference on Neu- ral Information Pr...

  24. [27]

    Quantifying privacy risks of masked language models using membership inference attacks

    Fatemehsadat Mireshghallah, Kartik Goyal, Archit Uniyal, Taylor Berg-Kirkpatrick, and Reza Shokri. Quantifying privacy risks of masked language models using membership inference attacks. In Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang, editors,Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi,...

  25. [28]

    Rényi differential privacy

    Ilya Mironov. Rényi differential privacy. In30th IEEE Computer Security Foundations Symposium, CSF 2017, Santa Barbara, CA, USA, August 21-25, 2017, pages 263–275. IEEE Computer Society, 2017. doi: 10.1109/CSF.2017.11. URL https://doi.org/10.1109/CSF. 2017.11

  26. [29]

    Feder Cooper, Daphne Ippolito, Christopher A

    Milad Nasr, Javier Rando, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Florian Tramèr, and Katherine Lee. Scalable extraction of training data from aligned, production language models. InThe Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28,

  27. [30]

    URLhttps://openreview.net/forum?id=vjel3nWP2a

    OpenReview.net, 2025. URLhttps://openreview.net/forum?id=vjel3nWP2a

  28. [31]

    Random gradient-free minimization of convex functions,

    Yurii E. Nesterov and Vladimir G. Spokoiny. Random gradient-free minimization of convex functions.Found. Comput. Math., 17(2):527–566, 2017. doi: 10.1007/S10208-015-9296-2. URLhttps://doi.org/10.1007/s10208-015-9296-2. 12

  29. [32]

    Smooth sensitivity and sampling in private data analysis

    Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Smooth sensitivity and sampling in private data analysis. InProceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’07, page 75–84, New York, NY , USA, 2007. Association for Computing Machinery. ISBN 9781595936318. doi: 10.1145/1250790.1250803. URL https: //doi.org/10.1145/12507...

  30. [33]

    Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta

    Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta. How to dp-fy ML: A practical guide to machine learning with differential privacy.J. Artif. Intell. Res., 77:1113– 1201, 2023. doi: 10.1613/JAIR.1.14649. URL https://doi.org/10.1613/jair.1.14649

  31. [34]

    SQuAD : 100,000+ questions for machine comprehension of text

    Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100, 000+ questions for machine comprehension of text. In Jian Su, Xavier Carreras, and Kevin Duh, editors,Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016, pages 2383–2392. The Association for...

  32. [35]

    Towards understanding the impact of model size on differential private classification.CoRR, abs/2111.13895:1–14, 2021

    Yinchen Shen, Zhiguo Wang, Ruoyu Sun, and Xiaojing Shen. Towards understanding the impact of model size on differential private classification.CoRR, abs/2111.13895:1–14, 2021. URLhttps://arxiv.org/abs/2111.13895

  33. [36]

    Membership Inference Attacks Against Machine Learning Models

    Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017, pages 3–18. IEEE Computer Society, 2017. doi: 10.1109/SP.2017.41. URLhttps://doi.org/10.1109/SP.2017.41

  34. [37]

    Manning, Andrew Y

    Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Y . Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sen- timent treebank. InProceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013, 18-21 October 2013, Grand Hyatt Seattle, Seattle, Wash- i...

  35. [38]

    URL https: //doi.org/10.18653/v1/d13-1170

    doi: 10.18653/V1/D13-1170. URLhttps://doi.org/10.18653/v1/d13-1170

  36. [39]

    Prompt inversion attack against collaborative inference of large language models,

    Mayuri Sridhar, Hanshen Xiao, and Srinivas Devadas. Pac-private algorithms. In Marina Blan- ton, William Enck, and Cristina Nita-Rotaru, editors,IEEE Symposium on Security and Privacy, SP 2025, San Francisco, CA, USA, May 12-15, 2025, pages 3839–3857. IEEE, 2025. doi: 10.1109/SP61157.2025.00034. URLhttps://doi.org/10.1109/SP61157.2025.00034

  37. [40]

    Private fine-tuning of large language models with zeroth-order optimization.Trans

    Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, and Prateek Mittal. Private fine-tuning of large language models with zeroth-order optimization.Trans. Mach. Learn. Res., 2025, 2025. URLhttps://openreview.net/forum?id=3Y3o0yFZfu

  38. [41]

    Bayesian differential privacy for machine learning

    Aleksei Triastcyn and Boi Faltings. Bayesian differential privacy for machine learning. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, Proceedings of Machine Learning Research, pages 9583–9592. PMLR,

  39. [42]

    URLhttp://proceedings.mlr.press/v119/triastcyn20a.html

  40. [43]

    Oseledets

    Nurislam Tursynbek, Aleksandr Petiushko, and Ivan V . Oseledets. Robustness threats of differential privacy.CoRR, abs/2012.07828:1–16, 2020. URL https://arxiv.org/abs/ 2012.07828

  41. [44]

    Per-instance differential privacy.J

    Yu-Xiang Wang. Per-instance differential privacy.J. Priv. Confidentiality, 9(1), 2019. doi: 10.29012/JPC.662. URLhttps://doi.org/10.29012/jpc.662

  42. [45]

    PAC Privacy: Automatic privacy measurement and control of data processing

    Hanshen Xiao and Srinivas Devadas. PAC Privacy: Automatic privacy measurement and control of data processing. In Helena Handschuh and Anna Lysyanskaya, editors,Advances in Cryptology - CRYPTO 2023 - 43rd Annual International Cryptology Conference, CRYPTO 2023, Santa Barbara, CA, USA, August 20-24, 2023, Proceedings, Part II, Lecture Notes in Computer Scie...

  43. [46]

    Edward Suh, and Srinivas Devadas

    Hanshen Xiao, G. Edward Suh, and Srinivas Devadas. Formal privacy proof of data encoding: The possibility and impossibility of learnable encryption. InProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, CCS ’24, page 1834–1848, New York, NY , USA, 2024. Association for Computing Machinery. ISBN 9798400706363. doi: 10....

  44. [47]

    Privacy risk in machine learning: Analyzing the connection to overfitting

    Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha. Privacy risk in machine learning: Analyzing the connection to overfitting. In31st IEEE Computer Security Foundations Symposium, CSF 2018, Oxford, United Kingdom, July 9-12, 2018, pages 268–282. IEEE Computer Society, 2018. doi: 10.1109/CSF.2018.00027. URL https://doi.org/10.1109/ CSF.2018.00027

  45. [48]

    CoRR , volume =

    Ashkan Yousefpour, Igor Shilov, Alexandre Sablayrolles, Davide Testuggine, Karthik Prasad, Mani Malek, John Nguyen, Sayan Ghosh, Akash Bharadwaj, Jessica Zhao, Graham Cormode, and Ilya Mironov. Opacus: User-friendly differential privacy library in pytorch.CoRR, abs/2109.12298, 2021. URLhttps://arxiv.org/abs/2109.12298

  46. [49]

    Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, and Huishuai Zhang

    Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, and Huishuai Zhang. Differentially private fine-tuning of language models. InThe Tenth Interna- tional Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. Ope...

  47. [50]

    Individual privacy accounting for differentially private stochastic gradient descent.Trans

    Da Yu, Gautam Kamath, Janardhan Kulkarni, Tie-Yan Liu, Jian Yin, and Huishuai Zhang. Individual privacy accounting for differentially private stochastic gradient descent.Trans. Mach. Learn. Res., 2023, 2023. URLhttps://openreview.net/forum?id=l4Jcxs0fpC

  48. [51]

    DPZero: Private fine-tuning of language models without backpropagation

    Liang Zhang, Bingcong Li, Kiran Koshy Thekumparampil, Sewoong Oh, and Niao He. DPZero: Private fine-tuning of language models without backpropagation. In Ruslan Salakhutdinov, Zico Kolter, Katherine A. Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, and Felix Berkenkamp, editors,Forty-first International Conference on Machine Learning, ICML 2024, ...

  49. [52]

    OPT: Open Pre-trained Transformer Language Models

    Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona T. Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, and Luke Zettlemoyer. OPT: open pre-trained transformer language models.CoRR, abs/2205.01068,

  50. [54]

    Differentially private SGD without clipping bias: An error-feedback approach

    Xinwei Zhang, Zhiqi Bu, Steven Wu, and Mingyi Hong. Differentially private SGD without clipping bias: An error-feedback approach. InThe Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024. URL https://openreview.net/forum?id=uFbWHyTlPn

  51. [55]

    Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

    Xiaochen Zhu, Mayuri Sridhar, and Srinivas Devadas. Pac-private responses with adversarial composition.CoRR, abs/2601.14033, 2026. doi: 10.48550/ARXIV .2601.14033. URLhttps: //doi.org/10.48550/arXiv.2601.14033. A Appendix Organization Appendix is organized as follows. • Appendix B contains deferred full proofs of Lemma 2 (Appendix B.1) and Theorem 3 (Appe...