pith. machine review for the scientific record. sign in

arxiv: 2605.10015 · v1 · submitted 2026-05-11 · 📊 stat.ML · cs.CR· cs.LG

Recognition: 2 theorem links

· Lean Theorem

Differentially Private Sampling from Distributions via Wasserstein Projection

Satoshi Hasegawa, Seng Pei Liew, Shokichi Takakura

Pith reviewed 2026-05-12 03:34 UTC · model grok-4.3

classification 📊 stat.ML cs.CRcs.LG
keywords differential privacysamplingWasserstein distanceprojectionminimax optimalityoptimal transportprivacy mechanism
0
0 comments X

The pith

Wasserstein projection yields a minimax optimal mechanism for differentially private sampling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Prior approaches to differentially private sampling rely on measures like KL divergence that ignore geometry and fail when distributions have mismatched supports. This paper instead uses the Wasserstein distance to define utility, which accounts for the underlying space's structure. It introduces the Wasserstein Projection Mechanism as the optimal way to choose a private distribution by projecting the target one onto the privacy-feasible set. The work also supplies algorithms to approximate this projection efficiently along with guarantees on how well they converge.

Core claim

The central claim is that the Wasserstein Projection Mechanism, obtained by projecting the original distribution onto the set of distributions satisfying the differential privacy constraint in the Wasserstein metric, is minimax optimal for the sampling problem, and that this projection can be computed approximately via efficient algorithms with convergence guarantees.

What carries the argument

The Wasserstein Projection Mechanism, which finds the distribution in the privacy-constrained set that is closest to the target distribution under the Wasserstein distance.

Load-bearing premise

That the Wasserstein distance serves as an appropriate and superior measure of utility for private samples compared to density-ratio measures, particularly when geometry or support differences are relevant.

What would settle it

Finding a dataset or distribution where samples from the Wasserstein Projection Mechanism are demonstrably less useful for downstream tasks than those from a KL-based mechanism, or where the approximation algorithm diverges.

Figures

Figures reproduced from arXiv: 2605.10015 by Satoshi Hasegawa, Seng Pei Liew, Shokichi Takakura.

Figure 1
Figure 1. Figure 1: Results for synthetic experiments. Left: Transformed [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
Figure 4
Figure 4. Figure 4: Estimated distribution of check-ins for ε = 4. The true distribution is shown in the leftmost panel, and the distributions estimated by WPM and KPM are shown in the middle and right panels, respectively. Euclidean distance as a metric. Then, we evaluate the performance by the Wasserstein distance (p = 1) between the true distribution of check-ins and the estimated distribution from the released locations. … view at source ↗
read the original abstract

In this paper, we study the problem of sampling from a distribution under the constraint of differential privacy (DP). Prior works measure the utility of DP sampling with density ratio-based measures such as KL divergence. However, such formulations suffer from two key limitations: 1) they fail to capture the geometric structure of the support, and 2) they are not applicable when the supports of the distributions differ. To deal with these issues, we develop a novel framework for DP sampling with Wasserstein distance as the utility measure. In this formulation, we propose Wasserstein Projection Mechanism (WPM), a minimax optimal mechanism based on Wasserstein projection. Furthermore, we develop efficient algorithms for computing the proposed mechanisms approximately and provide convergence guarantees.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper develops a framework for differentially private sampling from distributions that uses Wasserstein distance as the utility measure instead of density-ratio measures such as KL divergence. It introduces the Wasserstein Projection Mechanism (WPM) as a minimax-optimal mechanism obtained by projecting onto the set of DP-feasible distributions, develops efficient approximation algorithms for computing the mechanism, and supplies convergence guarantees for those algorithms. The approach is motivated by the inability of prior KL-based methods to capture geometric structure or to handle mismatched supports.

Significance. If the minimax optimality and convergence results hold, the work supplies a geometrically aware DP sampling primitive that directly addresses two well-known limitations of existing mechanisms. The explicit construction of tractable algorithms together with convergence guarantees is a concrete strength that improves the prospects for practical deployment in settings where Wasserstein geometry is natural (e.g., generative modeling or distribution learning under privacy constraints).

minor comments (3)
  1. [§3.2] §3.2, Definition 3.1: the precise statement of the Wasserstein projection optimization problem would benefit from an explicit display of the feasible set and the objective; the current prose description leaves the exact program implicit.
  2. [Theorem 4.3] Theorem 4.3: the convergence guarantee is stated in terms of iteration count, but the dependence on ambient dimension and on the Lipschitz constant of the cost function is not made explicit; this information is needed to assess scalability.
  3. [§6] §6 (Experiments): the reported results are confined to low-dimensional synthetic distributions; a brief discussion of how the observed error scales with dimension or with the privacy parameter ε would strengthen the empirical section.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of our manuscript and for recommending minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper introduces a Wasserstein-based utility for DP sampling to address limitations of KL divergence, proposes the WPM as a minimax-optimal mechanism via projection onto the DP-feasible set, and derives approximate algorithms with convergence guarantees. No load-bearing step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the framework is built from standard DP and optimal transport primitives without internal tautology. The central optimality and algorithmic claims remain independent of the inputs they are applied to.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

Based solely on the abstract, the central claim rests on standard differential privacy axioms and the definition of Wasserstein distance; no free parameters or invented entities beyond the named mechanism are visible.

axioms (2)
  • domain assumption Standard definition of differential privacy (epsilon-DP)
    The entire formulation assumes the usual DP constraint on the sampling mechanism.
  • standard math Wasserstein distance is a valid metric on probability measures
    Invoked as the utility measure without further justification in the abstract.
invented entities (1)
  • Wasserstein Projection Mechanism (WPM) no independent evidence
    purpose: Minimax-optimal DP sampling mechanism using Wasserstein projection
    Newly proposed in the paper as the core contribution.

pith-pipeline@v0.9.0 · 5422 in / 1289 out tokens · 47624 ms · 2026-05-12T03:34:41.741722+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 2 internal anchors

  1. [1]

    Constructing elastic distinguishability metrics for location privacy.arXiv preprint arXiv:1503.00756,

    Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Marco Stronati. Constructing elastic distinguishability metrics for location privacy.arXiv preprint arXiv:1503.00756,

  2. [2]

    Rappor: Randomized aggregatable privacy- preserving ordinal response

    Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. Rappor: Randomized aggregatable privacy- preserving ordinal response. InProceedings of the 2014 ACM SIGSAC conference on computer and communications security, pages 1054–1067,

  3. [3]

    Locally optimal private sampling: Beyond the global minimax.arXiv preprint arXiv:2510.09485,

    Hrad Ghoukasian, Bonwoo Lee, and Shahab Asoodeh. Locally optimal private sampling: Beyond the global minimax.arXiv preprint arXiv:2510.09485,

  4. [4]

    Differentially private wasserstein barycenters.arXiv preprint arXiv:2510.03021,

    Anming Gu, Sasidhar Kunapuli, Mark Bun, Edward Chien, and Kristjan Greenewald. Differentially private wasserstein barycenters.arXiv preprint arXiv:2510.03021,

  5. [5]

    Efficient Estimation of Word Representations in Vector Space

    Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representa- tions in vector space.arXiv preprint arXiv:1301.3781,

  6. [6]

    Learning with differentially private (sliced) wasserstein gradients.arXiv preprint arXiv:2502.01701,

    David Rodríguez-Vítores, Clément Lalanne, and Jean-Michel Loubes. Learning with differentially private (sliced) wasserstein gradients.arXiv preprint arXiv:2502.01701,

  7. [7]

    InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy

    Vishnu Vinod, Krishna Pillutla, and Abhradeep Guha Thakurta. Invisibleink: High-utility and low-cost text generation with differential privacy.arXiv preprint arXiv:2507.02974,

  8. [8]

    Then f( ¯mT )−f(m ∗)≤ βDp α r 2(1 + logk) T =O β α Dp r logk T !

    Choose a constant stepsize ηt ≡η:= p 2(1 + logk) βDp √ T . Then f( ¯mT )−f(m ∗)≤ βDp α r 2(1 + logk) T =O β α Dp r logk T ! . Proof. By construction, g(t) ∈∂f(m (t)) for every t. By the regret bound for the unnormalized exponentiated gradient algorithm [Shalev-Shwartz, 2025, Theorem 2.23], for anym∈ M, TX t=1 ⟨g(t), m(t) −m⟩ ≤ DgKL(m∥m(1)) η + η 2 TX t=1 ...

  9. [9]

    Moreover, the mapp7→α ∗(p)is decreasing

    d−2 2 , C d = Γ((d+ 1)/2)√πΓ(d/2) , gp(u) = (2−2u) p/2, S d(t) = Z 1 t fd(u) du, Ud,p(t) = Z 1 t gp(u)fd(u) du, H d,p = Z 1 −1 gp(u)fd(u) du, andt ∗ ∈(−1,1)is the unique solution of eε/2 −e −ε/2 Ud,p(t∗) +e −ε/2Hd,p =g p(t∗) e−ε/2 + eε/2 −e −ε/2 Sd(t∗) . Moreover, the mapp7→α ∗(p)is decreasing. The optimal mechanism Mm∗,ε can be interpreted as a generaliz...

  10. [10]

    Define Mp(t) := Z 1 −1 r(u)p µt(du) 1/p , F p(t) :=M p(t)−r(t)

    Its defining equation is equivalent to Z 1 −1 gp(u)µ t∗(p)(du) =g p(t∗(p)), equivalently, Z 1 −1 r(u)p µt∗(p)(du) 1/p =r t∗(p) . Define Mp(t) := Z 1 −1 r(u)p µt(du) 1/p , F p(t) :=M p(t)−r(t). Thent ∗(p)is the unique zero ofF p. Fix t∈(−1,1) . Since c± >0 , fd(u)>0 for u∈(−1,1) , and Z(t)>0 , µt has positive density on (−1,1). Asris positive and nonconsta...

  11. [11]

    Taking the minimum overπ∈Π(ν, µ)yields, for everyν, OT(ν, µ)−λlogN≤OT λ(ν, µ)≤OT(ν, µ).(3) Let ν∗ ∈arg min ν∈Q Wp(ν, µ), ν λ :=M λ m,ε[µ]

    Therefore, X i,j Cijπij −λlogN≤ X i,j Cijπij +λ X i,j πij logπ ij ≤ X i,j Cijπij. Taking the minimum overπ∈Π(ν, µ)yields, for everyν, OT(ν, µ)−λlogN≤OT λ(ν, µ)≤OT(ν, µ).(3) Let ν∗ ∈arg min ν∈Q Wp(ν, µ), ν λ :=M λ m,ε[µ]. Sincex7→x 1/p is strictly increasing onR +, we also have ν∗ ∈arg min ν∈Q OT(ν, µ). By definition ofM λ m,ε[µ], νλ ∈arg min ν∈Q OTλ(ν, µ)...

  12. [12]

    We observe that WPM captures the true distribution more accurately than KPM, which demonstrates the advantage of WPM in preserving the geometric structure of the data. N Experimental Details DatasetsMovielens-100K [Harper and Konstan, 2015] contains 100,000 ratings from 1000 users on 1700 movies, which is distributed under a usage license that permits res...