pith. machine review for the scientific record.

arxiv: 2605.10200 · v1 · submitted 2026-05-11 · 💻 cs.DS

Recognition: no theorem link

Convex Optimization with Local Label Differential Privacy: Tight Bounds in All Privacy Regimes

Authors on Pith · no claims yet

Pith reviewed 2026-05-12 03:22 UTC · model grok-4.3

classification 💻 cs.DS
keywords local differential privacy · stochastic convex optimization · label privacy · excess risk bounds · non-interactive algorithms · information-theoretic lower bounds · high-privacy regime · medium-privacy regime

The pith

A new non-interactive algorithm achieves square-root dependence on label space size for convex optimization under local label differential privacy, with matching lower bounds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies stochastic convex optimization where each user randomizes their label locally to satisfy label differential privacy while the features remain public. It introduces an efficient algorithm whose excess risk scales with the square root of the label space size K rather than linearly in K, both in the high-privacy regime where epsilon is at most 1 and in the medium-privacy regime where epsilon is between 1 and ln K. This improvement is paired with an information-theoretic lower bound that matches the upper bound for all sufficiently large sample sizes n. A reader would care because it shows the privacy overhead is not as prohibitive as the earlier linear bounds suggested, opening the door to practical private learning on tasks with many classes.
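To ground the model: the sketch below shows a generic epsilon-LDP label randomizer, the classical K-ary randomized response. It illustrates the local randomization step the setting assumes; it is not the paper's algorithm, which relies on a more refined randomizer.

```python
import math
import random

def k_ary_randomized_response(label: int, K: int, eps: float) -> int:
    """Generic eps-LDP local label randomizer (K-ary randomized response).

    The true label is reported with probability e^eps / (e^eps + K - 1);
    each of the other K - 1 labels is reported with probability
    1 / (e^eps + K - 1), so any two inputs yield output distributions
    whose likelihood ratio is at most e^eps.
    """
    p_truth = math.exp(eps) / (math.exp(eps) + K - 1)
    if random.random() < p_truth:
        return label
    # Sample uniformly among the K - 1 labels other than the true one.
    other = random.randrange(K - 1)
    return other if other < label else other + 1
```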

Core claim

We present a novel and efficient non-interactive L-LDP algorithm that achieves an excess risk of O(sqrt(K/(epsilon n))) in the high-privacy regime (epsilon <= 1) and O(sqrt(K/(e^epsilon n))) in the medium-privacy regime (1 <= epsilon <= ln K). This quadratically improves the dependency on the label space size from O(K) to O(sqrt(K)). We also prove a matching information-theoretic lower bound across all privacy regimes for any sufficiently large n, resolving whether the prior linear cost was fundamental to the L-LDP model.
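To make the claimed quadratic improvement concrete, here is a quick numeric comparison of the prior linear-in-K rate and the new sqrt(K) rate, with constants and lower-order terms suppressed (illustrative arithmetic only, not the paper's code):

```python
import math

def prior_rate(K: int, eps: float, n: int) -> float:
    """Ghazi et al. (2021) bound: linear in K (constants dropped)."""
    denom = eps if eps <= 1 else math.exp(eps)
    return K / (denom * math.sqrt(n))

def new_rate(K: int, eps: float, n: int) -> float:
    """This paper's bound: square-root in K (constants dropped)."""
    denom = eps if eps <= 1 else math.exp(eps)
    return math.sqrt(K / (denom * n))

# At eps = 1 and n = 1e6, the gap widens as the label space grows.
for K in (10, 100, 1000):
    print(f"K={K:4d}  prior={prior_rate(K, 1.0, 10**6):.4f}  "
          f"new={new_rate(K, 1.0, 10**6):.4f}")
```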

What carries the argument

The novel non-interactive L-LDP algorithm that produces randomized labels locally while enabling accurate convex optimization, together with the information-theoretic lower bound construction that establishes tightness.

If this is right

  • The linear dependence on K established in prior work is not fundamental to local label differential privacy.
  • Efficient non-interactive algorithms exist that reduce the label-space privacy cost from linear to square-root scaling.
  • The upper and lower bounds match across high-privacy and medium-privacy regimes once n is sufficiently large.
  • Convex optimization under this privacy model can be performed within roughly a sqrt(K) factor of the non-private 1/sqrt(n) rate at constant epsilon.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same quadratic improvement may be achievable in related settings such as empirical risk minimization or when features are also private.
  • Datasets with hundreds of labels, such as fine-grained classification tasks, could become feasible under local label privacy without prohibitive accuracy loss.
  • It remains open whether interactive protocols or different randomization schemes could remove the remaining sqrt(K) factor entirely.

Load-bearing premise

The matching lower bound is stated to hold only for sufficiently large n, and the analysis relies on standard convexity of the loss and on the local label differential privacy model with public features.

What would settle it

A lower-bound proof showing that no non-interactive L-LDP algorithm can beat linear dependence on K for large n, or an experiment in which the proposed algorithm's measured excess risk scales linearly with K on convex problem instances as n grows, would falsify the upper bound and, with it, the tightness claim.
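One way to operationalize the second test: measure excess risk at several label-space sizes K with n and epsilon held fixed, then fit the scaling exponent in log-log space. An exponent near 0.5 is consistent with the paper's sqrt(K) rate; an exponent near 1 would indicate the prior linear dependence. The data below are synthetic stand-ins, not measurements from the paper.

```python
import numpy as np

def k_scaling_exponent(K_values, risks):
    """Fit risk ~ C * K^alpha by least squares in log-log space."""
    alpha, _log_C = np.polyfit(np.log(K_values), np.log(risks), 1)
    return alpha

# Synthetic sanity check: risks constructed to follow sqrt(K) scaling.
K = np.array([8, 16, 32, 64, 128, 256])
synthetic_risks = 0.01 * np.sqrt(K)
print(k_scaling_exponent(K, synthetic_risks))  # ~0.5
```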

read the original abstract

We study the problem of Stochastic Convex Optimization (SCO) under the constraint of local Label Differential Privacy (L-LDP). In this setting, the features are considered public, but the corresponding labels are sensitive and must be randomized by each user locally before being sent to an untrusted analyzer. Prior work for SCO under L-LDP (Ghazi et al., 2021) established an excess population risk bound with a \emph{linear} dependence on the size of the label space, $K$: $O\left({\frac{K}{\epsilon\sqrt{n}}}\right)$ in the high-privacy regime ($\epsilon \leq 1$) and $O\left({\frac{K}{e^{\epsilon} \sqrt{n}}}\right)$ in the medium-privacy regime ($1 \leq \epsilon \leq \ln K$). This left open whether this linear cost is fundamental to the L-LDP model. In this note, we resolve this question. First, we present a novel and efficient non-interactive L-LDP algorithm that achieves an excess risk of $O\left({\sqrt{\frac{K}{\epsilon n}}}\right)$ in the high-privacy regime ($\epsilon \leq 1$) and $O\left({\sqrt{\frac{K}{e^{\epsilon} n}}}\right)$ in the medium-privacy regime ($1 \leq \epsilon \leq \ln K$). This quadratically improves the dependency on the label space size from $O(K)$ to $O(\sqrt{K})$. Second, we prove a matching information-theoretic lower bound across all privacy regimes for any sufficiently large $n$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 0 minor

Summary. The manuscript studies stochastic convex optimization under local label differential privacy (L-LDP), with public features and locally randomized labels. Prior work obtained excess-risk bounds linear in the label-space size K. The authors give a new non-interactive algorithm attaining O(sqrt(K/(ε n))) excess risk in the high-privacy regime (ε ≤ 1) and O(sqrt(K/(e^ε n))) in the medium-privacy regime (1 ≤ ε ≤ ln K), a quadratic improvement in K. They also prove a matching information-theoretic lower bound that holds for all sufficiently large n.

Significance. If the stated bounds and proofs hold, the result is significant: it supplies the first tight characterization of excess risk for L-LDP SCO across the high- and medium-privacy regimes, replacing an O(K) dependence with O(sqrt(K)) while retaining an efficient non-interactive algorithm. The combination of an explicit algorithmic construction and an independent matching lower bound constitutes a strong contribution to the literature on private convex optimization.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their thorough review and positive recommendation to accept the manuscript. We are glad that the referee finds the tight characterization of excess risk under L-LDP SCO, including the quadratic improvement in the dependence on the label space size K, to be a significant contribution.

Circularity Check

0 steps flagged

No significant circularity; new algorithm and independent lower bound

full rationale

The paper's central claims rest on a novel non-interactive algorithm construction achieving the stated O(sqrt(K)) improvement and a separately proven information-theoretic lower bound that holds for sufficiently large n. These steps are self-contained: the algorithm is explicitly constructed in the manuscript rather than fitted or renamed from prior outputs, and the lower bound is derived directly from the L-LDP model assumptions without reducing to a self-citation chain or definitional equivalence. The citation to Ghazi et al. (2021) merely sets the baseline linear bound being improved upon and does not load-bear the new quadratic result or the matching lower bound. No self-definitional, fitted-input, or ansatz-smuggling patterns appear in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, ad-hoc axioms, or invented entities are identifiable; the work builds on standard assumptions of stochastic convex optimization and local differential privacy.

pith-pipeline@v0.9.0 · 5607 in / 1239 out tokens · 80088 ms · 2026-05-12T03:22:16.735995+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

  1. URL https://doi.org/10.1109/FOCS.2014.56.
  2. Sébastien Bubeck. Convex Optimization: Algorithms and Complexity. Found. Trends Mach. Learn., 8(3-4):231–357, 2015.
  3. John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Local privacy and statistical minimax rates. In FOCS, 2013. doi: 10.1109/FOCS.2013.53.
  4. John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Minimax optimal procedures for locally private estimation. J. ASA, 113(521):182–201, 2018. doi: 10.1080/01621459.2017.1389735.
  5. Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014. doi: 10.1561/0400000042.
  6. Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, pp. 265–284, 2006. doi: 10.1007/11681878_14.
  7. Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In CCS, pp. 1054–1067, 2014. doi: 10.1145/2660267.2660348.
  8. Vitaly Feldman and Kunal Talwar. Lossless compression of efficient private local randomizers. In ICML, pp. 3208–3219, 2021.
  9. Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, and Chiyuan Zhang. LabelDP-Pro: learning with label differential privacy via projections. In ICLR, 2024. URL https://openreview.net/forum?id=JnYaF3vv3G.
  10. Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Raghu Meka, and Chiyuan Zhang. On convex optimization with semi-sensitive features. 2024.
  11. URL https://doi.org/10.1109/FOCS.2019.00015.
  12. Shiva Prasad Kasiviswanathan, Homin K. Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam D. Smith. What can we learn privately? SIAM J. Comput., 40(3):793–826, 2011. doi: 10.1137/090756090.
  13. Shai Shalev-Shwartz, Ohad Shamir, Nathan Srebro, and Karthik Sridharan. Stochastic convex optimization. In COLT, 2009. URL http://www.cs.mcgill.ca/~colt2009/papers/018.pdf.
  14. Xinyu Tang, Milad Nasr, Saeed Mahloujifar, Virat Shejwalkar, Liwei Song, Amir Houmansadr, and Prateek Mittal. Machine learning with differentially private labels: Mechanisms and frameworks. PETS, 2022(4):332–350, 2022. doi: 10.56553/popets-2022-0112.
  15. Shaowei Wang, Liusheng Huang, Pengzhan Wang, Yiwen Nie, Hongli Xu, Wei Yang, Xiang-Yang Li, and Chunming Qiao. Mutual information optimally local private discrete distribution estimation. arXiv:1607.08025, 2016.
  16. Min Ye and Alexander Barg. Optimal schemes for discrete distribution estimation under locally differential privacy. IEEE Trans. Inf. Theory, 2018. doi: 10.1109/TIT.2018.2809790.
    only proves the following for Q =Rd; however, it can be easily extend to anyd-dimensional subspaceQ by writing the input vectorv using the basis ofQand applying the original algorithm on the latter. Theorem 4((Duchi et al., 2013)).LetQ be a (publicly known) d-dimensional subspace of Rm and L>