pith. machine review for the scientific record.

arxiv: 2605.10200 · v1 · submitted 2026-05-11 · 💻 cs.DS

Recognition: no theorem link

Convex Optimization with Local Label Differential Privacy: Tight Bounds in All Privacy Regimes

Authors on Pith · no claims yet

Pith reviewed 2026-05-12 03:22 UTC · model grok-4.3

classification 💻 cs.DS
keywords local differential privacy · stochastic convex optimization · label privacy · excess risk bounds · non-interactive algorithms · information-theoretic lower bounds · high-privacy regime · medium-privacy regime

The pith

A new non-interactive algorithm achieves square-root dependence on label space size for convex optimization under local label differential privacy, with matching lower bounds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies stochastic convex optimization where each user randomizes their label locally to satisfy label differential privacy while the features remain public. It introduces an efficient algorithm whose excess risk scales with the square root of the label space size K rather than linearly in K, both in the high-privacy regime where epsilon is at most 1 and in the medium-privacy regime where epsilon is between 1 and ln K. This improvement is paired with an information-theoretic lower bound that matches the upper bound for all sufficiently large sample sizes n. A reader would care because it shows the privacy overhead is not as prohibitive as the earlier linear bounds suggested, opening the door to practical private learning on tasks with many classes.
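To ground the model: the sketch below shows a generic epsilon-LDP label randomizer, the classical K-ary randomized response. It illustrates the local randomization step the setting assumes; it is not the paper's algorithm, which relies on a more refined randomizer.

```python
import math
import random

def k_ary_randomized_response(label: int, K: int, eps: float) -> int:
    """Generic eps-LDP local label randomizer (K-ary randomized response).

    The true label is reported with probability e^eps / (e^eps + K - 1);
    each of the other K - 1 labels is reported with probability
    1 / (e^eps + K - 1), so any two inputs yield output distributions
    whose likelihood ratio is at most e^eps.
    """
    p_truth = math.exp(eps) / (math.exp(eps) + K - 1)
    if random.random() < p_truth:
        return label
    # Sample uniformly among the K - 1 labels other than the true one.
    other = random.randrange(K - 1)
    return other if other < label else other + 1
```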

Core claim

We present a novel and efficient non-interactive L-LDP algorithm that achieves an excess risk of O(sqrt(K/(epsilon n))) in the high-privacy regime (epsilon <= 1) and O(sqrt(K/(e^epsilon n))) in the medium-privacy regime (1 <= epsilon <= ln K). This quadratically improves the dependency on the label space size from O(K) to O(sqrt(K)). We also prove a matching information-theoretic lower bound across all privacy regimes for any sufficiently large n, resolving whether the prior linear cost was fundamental to the L-LDP model.
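To make the claimed quadratic improvement concrete, here is a quick numeric comparison of the prior linear-in-K rate and the new sqrt(K) rate, with constants and lower-order terms suppressed (illustrative arithmetic only, not the paper's code):

```python
import math

def prior_rate(K: int, eps: float, n: int) -> float:
    """Ghazi et al. (2021) bound: linear in K (constants dropped)."""
    denom = eps if eps <= 1 else math.exp(eps)
    return K / (denom * math.sqrt(n))

def new_rate(K: int, eps: float, n: int) -> float:
    """This paper's bound: square-root in K (constants dropped)."""
    denom = eps if eps <= 1 else math.exp(eps)
    return math.sqrt(K / (denom * n))

# At eps = 1 and n = 1e6, the gap widens as the label space grows.
for K in (10, 100, 1000):
    print(f"K={K:4d}  prior={prior_rate(K, 1.0, 10**6):.4f}  "
          f"new={new_rate(K, 1.0, 10**6):.4f}")
```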

What carries the argument

The novel non-interactive L-LDP algorithm that produces randomized labels locally while enabling accurate convex optimization, together with the information-theoretic lower bound construction that establishes tightness.

If this is right

  • The linear dependence on K established in prior work is not fundamental to local label differential privacy.
  • Efficient non-interactive algorithms exist that reduce the label-space privacy cost from linear to square-root scaling.
  • The upper and lower bounds match across high-privacy and medium-privacy regimes once n is sufficiently large.
  • Convex optimization under this privacy model can be performed within roughly a sqrt(K) factor of the non-private 1/sqrt(n) rate at constant epsilon.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same quadratic improvement may be achievable in related settings such as empirical risk minimization or when features are also private.
  • Datasets with hundreds of labels, such as fine-grained classification tasks, could become feasible under local label privacy without prohibitive accuracy loss.
  • It remains open whether interactive protocols or different randomization schemes could remove the remaining sqrt(K) factor entirely.

Load-bearing premise

The matching lower bound is stated to hold only for sufficiently large n, and the analysis relies on standard convexity of the loss and on the local label differential privacy model with public features.

What would settle it

A lower-bound proof showing that no non-interactive L-LDP algorithm can beat linear dependence on K for large n, or an experiment in which the proposed algorithm's measured excess risk scales linearly with K on convex problem instances as n grows, would falsify the upper bound and, with it, the tightness claim.
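One way to operationalize the second test: measure excess risk at several label-space sizes K with n and epsilon held fixed, then fit the scaling exponent in log-log space. An exponent near 0.5 is consistent with the paper's sqrt(K) rate; an exponent near 1 would indicate the prior linear dependence. The data below are synthetic stand-ins, not measurements from the paper.

```python
import numpy as np

def k_scaling_exponent(K_values, risks):
    """Fit risk ~ C * K^alpha by least squares in log-log space."""
    alpha, _log_C = np.polyfit(np.log(K_values), np.log(risks), 1)
    return alpha

# Synthetic sanity check: risks constructed to follow sqrt(K) scaling.
K = np.array([8, 16, 32, 64, 128, 256])
synthetic_risks = 0.01 * np.sqrt(K)
print(k_scaling_exponent(K, synthetic_risks))  # ~0.5
```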

read the original abstract

We study the problem of Stochastic Convex Optimization (SCO) under the constraint of local Label Differential Privacy (L-LDP). In this setting, the features are considered public, but the corresponding labels are sensitive and must be randomized by each user locally before being sent to an untrusted analyzer. Prior work for SCO under L-LDP (Ghazi et al., 2021) established an excess population risk bound with a \emph{linear} dependence on the size of the label space, $K$: $O\left({\frac{K}{\epsilon\sqrt{n}}}\right)$ in the high-privacy regime ($\epsilon \leq 1$) and $O\left({\frac{K}{e^{\epsilon} \sqrt{n}}}\right)$ in the medium-privacy regime ($1 \leq \epsilon \leq \ln K$). This left open whether this linear cost is fundamental to the L-LDP model. In this note, we resolve this question. First, we present a novel and efficient non-interactive L-LDP algorithm that achieves an excess risk of $O\left({\sqrt{\frac{K}{\epsilon n}}}\right)$ in the high-privacy regime ($\epsilon \leq 1$) and $O\left({\sqrt{\frac{K}{e^{\epsilon} n}}}\right)$ in the medium-privacy regime ($1 \leq \epsilon \leq \ln K$). This quadratically improves the dependency on the label space size from $O(K)$ to $O(\sqrt{K})$. Second, we prove a matching information-theoretic lower bound across all privacy regimes for any sufficiently large $n$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 0 minor

Summary. The manuscript studies stochastic convex optimization under local label differential privacy (L-LDP), with public features and locally randomized labels. Prior work obtained excess-risk bounds linear in the label-space size K. The authors give a new non-interactive algorithm attaining O(sqrt(K/(ε n))) excess risk in the high-privacy regime (ε ≤ 1) and O(sqrt(K/(e^ε n))) in the medium-privacy regime (1 ≤ ε ≤ ln K), a quadratic improvement in K. They also prove a matching information-theoretic lower bound that holds for all sufficiently large n.

Significance. If the stated bounds and proofs hold, the result is significant: it supplies the first tight characterization of excess risk for L-LDP SCO across the high- and medium-privacy regimes, replacing an O(K) dependence with O(sqrt(K)) while retaining an efficient non-interactive algorithm. The combination of an explicit algorithmic construction and an independent matching lower bound constitutes a strong contribution to the literature on private convex optimization.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their thorough review and positive recommendation to accept the manuscript. We are glad that the referee finds the tight characterization of excess risk under L-LDP SCO, including the quadratic improvement in the dependence on the label space size K, to be a significant contribution.

Circularity Check

0 steps flagged

No significant circularity; new algorithm and independent lower bound

full rationale

The paper's central claims rest on a novel non-interactive algorithm construction achieving the stated O(sqrt(K)) improvement and a separately proven information-theoretic lower bound that holds for sufficiently large n. These steps are self-contained: the algorithm is explicitly constructed in the manuscript rather than fitted or renamed from prior outputs, and the lower bound is derived directly from the L-LDP model assumptions without reducing to a self-citation chain or definitional equivalence. The citation to Ghazi et al. (2021) merely sets the baseline linear bound being improved upon and does not load-bear the new quadratic result or the matching lower bound. No self-definitional, fitted-input, or ansatz-smuggling patterns appear in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, ad-hoc axioms, or invented entities are identifiable; the work builds on standard assumptions of stochastic convex optimization and local differential privacy.

pith-pipeline@v0.9.0 · 5607 in / 1239 out tokens · 80088 ms · 2026-05-12T03:22:16.735995+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

  1. URL https://doi.org/10.1109/FOCS.2014.56.
  2. Sébastien Bubeck. Convex Optimization: Algorithms and Complexity. Found. Trends Mach. Learn., 8(3-4):231–357, 2015.
  3. John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Local privacy and statistical minimax rates. In FOCS, 2013. doi: 10.1109/FOCS.2013.53.
  4. John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Minimax optimal procedures for locally private estimation. J. ASA, 113(521):182–201, 2018. doi: 10.1080/01621459.2017.1389735.
  5. Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014. doi: 10.1561/0400000042.
  6. Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, pp. 265–284, 2006. doi: 10.1007/11681878_14.
  7. Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In CCS, pp. 1054–1067, 2014. doi: 10.1145/2660267.2660348.
  8. Vitaly Feldman and Kunal Talwar. Lossless compression of efficient private local randomizers. In ICML, pp. 3208–3219, 2021.
  9. Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, and Chiyuan Zhang. LabelDP-Pro: learning with label differential privacy via projections. In ICLR, 2024. URL https://openreview.net/forum?id=JnYaF3vv3G.
  10. Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Raghu Meka, and Chiyuan Zhang. On convex optimization with semi-sensitive features. 2024.
  11. URL https://doi.org/10.1109/FOCS.2019.00015.
  12. Shiva Prasad Kasiviswanathan, Homin K. Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam D. Smith. What can we learn privately? SIAM J. Comput., 40(3):793–826, 2011. doi: 10.1137/090756090.
  13. Shai Shalev-Shwartz, Ohad Shamir, Nathan Srebro, and Karthik Sridharan. Stochastic convex optimization. In COLT, 2009. URL http://www.cs.mcgill.ca/~colt2009/papers/018.pdf.
  14. Xinyu Tang, Milad Nasr, Saeed Mahloujifar, Virat Shejwalkar, Liwei Song, Amir Houmansadr, and Prateek Mittal. Machine learning with differentially private labels: Mechanisms and frameworks. PETS, 2022(4):332–350, 2022. doi: 10.56553/popets-2022-0112.
  15. Shaowei Wang, Liusheng Huang, Pengzhan Wang, Yiwen Nie, Hongli Xu, Wei Yang, Xiang-Yang Li, and Chunming Qiao. Mutual information optimally local private discrete distribution estimation. arXiv:1607.08025, 2016.
  16. Min Ye and Alexander Barg. Optimal schemes for discrete distribution estimation under locally differential privacy. IEEE Trans. Inf. Theory, 2018. doi: 10.1109/TIT.2018.2809790.
    only proves the following for Q =Rd; however, it can be easily extend to anyd-dimensional subspaceQ by writing the input vectorv using the basis ofQand applying the original algorithm on the latter. Theorem 4((Duchi et al., 2013)).LetQ be a (publicly known) d-dimensional subspace of Rm and L>