pith. sign in

arxiv: 2606.19859 · v1 · pith:R4XYRVULnew · submitted 2026-06-18 · 💻 cs.IT · cs.LG· math.IT· math.PR· math.ST· stat.TH

Doeblin Curves

Pith reviewed 2026-06-26 15:46 UTC · model grok-4.3

classification 💻 cs.IT cs.LGmath.ITmath.PRmath.STstat.TH
keywords Doeblin curvesinformation contractionMarkov kernelsgeneralization boundsnoisy computationdifferential privacymulti-way contraction
0
0 comments X

The pith

Doeblin curves quantify contraction for Markov kernels even when the standard Doeblin coefficient is zero.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Doeblin curves as nonlinear functions that measure how a Markov kernel contracts collections of input distributions at given levels of divergence and power. This construction produces explicit multi-way contraction bounds without requiring the usual Doeblin coefficient to stay bounded away from zero. The resulting bounds are then applied to obtain generalization guarantees in noisy optimization, error bounds for noisy circuit computation, and differential privacy statements for iterative algorithms, all in broader group or domain settings than previously available.

Core claim

A Doeblin curve is a nonlinear function that quantifies the contraction behavior of a Markov kernel on collections of input distributions at specific levels of divergence and power. Through a new variational characterization of Doeblin coefficients, the paper derives properties, upper and lower bounds, and power-constrained versions of these curves, showing that they deliver non-vacuous contraction guarantees for channels whose ordinary Doeblin coefficient equals zero.

What carries the argument

Doeblin curve: a nonlinear function that quantifies contraction of a Markov kernel on collections of input distributions at fixed divergence and power levels.

If this is right

  • Generalization bounds for noisy iterative optimization extend to wider domains and group settings.
  • Error bounds for reliable computation with noisy circuits apply beyond single-instance cases.
  • Differential privacy guarantees for online iterative algorithms capture finer contraction at specific operating points.
  • Power-constrained versions of the curves allow contraction statements under input restrictions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The variational characterization may permit numerical approximation of contraction rates for channels too large for direct Doeblin-coefficient computation.
  • Doeblin curves could be composed across product channels to obtain joint contraction statements without separate coefficient calculations.
  • The same nonlinear viewpoint might adapt to other distance measures such as Renyi divergence or Wasserstein distance where linear coefficients also vanish.

Load-bearing premise

Concepts of nonlinear information contraction are enough to define Doeblin curves that still produce usable multi-way contraction bounds when the ordinary Doeblin coefficient is zero.

What would settle it

An explicit channel whose Doeblin coefficient is zero and whose Doeblin curve remains flat (no contraction) at the divergence and power values relevant to a target application would show the claimed non-vacuous bounds do not hold.

Figures

Figures reproduced from arXiv: 2606.19859 by Anuran Makur, Dongmin Lee, Japneet Singh, William Lu.

Figure 1
Figure 1. Figure 1: Randomly sampled joint range F(K; R 5×5 sto ) for a 5 × 5 Markov matrix K, shaded in blue. By submultiplicativity of complementary Doeblin coeffi￾cients [22, Section 4], [19, Theorem 1], we have ρ(WK) ≤ ρ(W)ρ(K). Hence, the Doeblin curve and joint range satisfy the outer bounds FK(t; G) ≤ ρ(K)t, F(K; G) ⊆ {(t, y) ∈ [0, 1]2 : y ≤ ρ(K)t}, (10) and since ρ(K) ≤ 1, we have FK(t; G) ≤ t, F(K; G) ⊆ {(t, y) ∈ [0,… view at source ↗
Figure 2
Figure 2. Figure 2: Average extremal power-constrained Doeblin curves for the Gaussian, [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗
read the original abstract

Recent research on Doeblin coefficients has shed light on their usefulness as a multi-way generalization of the Dobrushin contraction coefficient for TV distance, in a separate vein from their classic role in the theory of Markov chain ergodicity. However, strong conditions, such as being bounded away from 0, are typically necessary for Doeblin coefficients to establish the existence of information contraction. Building on recently formulated concepts of nonlinear information contraction, we aim to propose a finer-grained Doeblin-based characterization of multi-way contraction behavior which yields non-vacuous contraction guarantees even for channels whose Doeblin coefficient is 0. To this end, we introduce the notion of a Doeblin curve -- a nonlinear function which quantifies the contraction behavior of a Markov kernel on collections of input distributions at specific levels of divergence and power. Through the course of our analysis, we develop a new variational characterization of Doeblin coefficients, present several properties of Doeblin curves, define several versions of power-constrained Doeblin curves, and derive upper and lower bounds using our aforementioned variational characterization. We then utilize these results in diverse areas, including generalization bounds for noisy iterative optimization, error bounds for reliable computation with noisy circuits, and differential privacy guarantees for online iterative algorithms. In particular, we extend results in these areas to broader domains or group settings, leveraging Doeblin curves to reveal finer-grained contraction phenomena than Doeblin coefficients.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Doeblin curves, defined as nonlinear functions that quantify multi-way contraction of a Markov kernel on collections of input distributions at fixed divergence and power levels. Building on nonlinear information contraction ideas, it provides a new variational characterization of Doeblin coefficients, derives properties and bounds for (power-constrained) Doeblin curves, and applies the framework to obtain generalization bounds for noisy optimization, error bounds for noisy circuits, and differential privacy guarantees for iterative algorithms, claiming non-vacuous contraction even when the ordinary Doeblin coefficient is zero.

Significance. If the central claims hold, the work would meaningfully extend contraction-coefficient techniques to kernels with vanishing global Doeblin coefficient by replacing a single scalar with a curve indexed by divergence level; the listed applications would then cover broader regimes than standard Doeblin-coefficient arguments permit.

major comments (2)
  1. [§3] §3 (variational characterization): the manuscript must exhibit an explicit channel example (or general construction) in which the Doeblin curve evaluated at a positive divergence level yields a contraction factor strictly less than 1 while the ordinary Doeblin coefficient is exactly zero; without this, the non-vacuous guarantee asserted in the abstract remains unverified.
  2. [§4–5] §4–5 (properties and bounds): the upper and lower bounds derived from the variational form must be shown to be strictly tighter than the trivial bound of 1 at the operating points used in the applications; if the bounds collapse to the classical Doeblin coefficient when the latter is zero, the claimed extension fails.
minor comments (2)
  1. Notation for the power-constrained variants should be introduced once and used consistently; several symbols appear to be redefined across sections.
  2. [§6–8] The applications in §6–8 would benefit from a short table comparing the new bounds against the classical Doeblin-coefficient bounds on the same examples.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and valuable suggestions. We address the two major comments point by point below and will incorporate the requested clarifications and examples into a revised manuscript.

read point-by-point responses
  1. Referee: [§3] §3 (variational characterization): the manuscript must exhibit an explicit channel example (or general construction) in which the Doeblin curve evaluated at a positive divergence level yields a contraction factor strictly less than 1 while the ordinary Doeblin coefficient is exactly zero; without this, the non-vacuous guarantee asserted in the abstract remains unverified.

    Authors: We agree that an explicit example is required to verify the non-vacuous claim. The variational characterization in §3 already implies the existence of such kernels (by taking collections of distributions at positive divergence where the multi-way contraction is strictly less than the global Doeblin coefficient of zero), but we did not include a concrete construction. In the revision we will add a specific example, e.g., a carefully parameterized binary erasure channel or a constructed finite-state kernel whose Doeblin coefficient vanishes while the curve at a chosen positive divergence level yields a factor <1, computed directly from the variational form. revision: yes

  2. Referee: [§4–5] §4–5 (properties and bounds): the upper and lower bounds derived from the variational form must be shown to be strictly tighter than the trivial bound of 1 at the operating points used in the applications; if the bounds collapse to the classical Doeblin coefficient when the latter is zero, the claimed extension fails.

    Authors: We accept that the applications in §6–8 must be accompanied by explicit verification that the derived bounds are strictly less than 1 at the relevant operating points. The power-constrained Doeblin curves are designed precisely so that they remain <1 at positive divergence levels even when the ordinary coefficient is zero; however, the current text does not numerically or analytically confirm this at the exact parameter values used in the generalization, circuit, and DP bounds. In the revision we will insert short calculations or plots in §4–5 (and cross-references in the applications) demonstrating that the variational upper and lower bounds are strictly tighter than 1 at those points. revision: yes

Circularity Check

0 steps flagged

No circularity: Doeblin curves defined from independent nonlinear contraction concepts

full rationale

The paper builds explicitly on 'recently formulated concepts of nonlinear information contraction' to introduce Doeblin curves as a nonlinear function quantifying contraction on collections of input distributions. It develops a new variational characterization, properties, power-constrained versions, and bounds, then applies them to generalization, computation, and privacy. No quoted equations or steps reduce the target contraction guarantee to a fitted parameter, self-definition, or self-citation chain by construction. The variational form is presented as a development enabling finer-grained analysis when the standard Doeblin coefficient is zero, without evidence that it collapses to the ordinary infimum. The derivation chain remains self-contained with independent content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the definition of Doeblin curves and the assumption that nonlinear information contraction concepts extend to level-specific multi-way bounds; no free parameters or external data are mentioned.

axioms (1)
  • domain assumption Nonlinear information contraction concepts recently formulated in the literature
    Abstract states the work builds directly on these concepts to define the curves.
invented entities (1)
  • Doeblin curve no independent evidence
    purpose: Nonlinear function quantifying contraction behavior of a Markov kernel on collections of distributions at specific divergence and power levels
    Newly introduced object whose properties and bounds are developed in the paper.

pith-pipeline@v0.9.1-grok · 5784 in / 1227 out tokens · 14784 ms · 2026-06-26T15:46:49.259096+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

87 extracted references · 1 linked inside Pith

  1. [1]

    On Doeblin curves and their properties,

    W. Lu, A. Makur, and J. Singh, “On Doeblin curves and their properties,” inProceedings of the IEEE International Symposium on Information Theory (ISIT), Athens, Greece, July 7–12 2024, pp. 2544–2549

  2. [2]

    Information-type measures of difference of probability distri- butions and indirect observations,

    I. Csiszár, “Information-type measures of difference of probability distri- butions and indirect observations,”Studia Scientiarum Mathematicarum Hungarica, vol. 2, pp. 299–318, 1967

  3. [3]

    A class of measures of informativity of observation channels,

    I. Csiszár, “A class of measures of informativity of observation channels,” Periodica Mathematica Hungarica, vol. 2, no. 1–4, pp. 191–213, March 1972

  4. [4]

    A connection between correlation and contingency,

    H. O. Hirschfeld, “A connection between correlation and contingency,” Mathematical Proceedings of the Cambridge Philosophical Society, vol. 31, no. 4, pp. 520—-524, October 1935

  5. [5]

    Central limit theorem for nonstationary Markov chains. i,

    R. L. Dobrushin, “Central limit theorem for nonstationary Markov chains. i,”Theory of Probability & Its Applications, vol. 1, no. 1, pp. 65–80, 1956

  6. [6]

    On measures of dependence,

    A. Rényi, “On measures of dependence,”Acta Mathematica Academiae Scientiarum Hungarica, vol. 10, no. 3–4, pp. 441–451, September 1959

  7. [7]

    Spreading of sets in product spaces and hypercontraction of the Markov operator,

    R. Ahlswede and P. Gács, “Spreading of sets in product spaces and hypercontraction of the Markov operator,”The Annals of Probability, vol. 4, no. 6, pp. 925–939, December 1976

  8. [8]

    J. E. Cohen, J. H. B. Kemperman, and G. Zb ˘aganu,Comparisons of Stochastic Matrices with Applications in Information Theory, Statistics, Economics and Population Sciences. Boston, MA, USA: Birkhäuser, 1998

  9. [9]

    Signal propagation and noisy circuits,

    W. S. Evans and L. J. Schulman, “Signal propagation and noisy circuits,” IEEE Transactions on Information Theory, vol. 45, no. 7, pp. 2367–2373, November 1999

  10. [10]

    On hypercon- tractivity and a data processing inequality,

    V . Anantharam, A. A. Gohari, S. Kamath, and C. Nair, “On hypercon- tractivity and a data processing inequality,” inProceedings of the IEEE International Symposium on Information Theory (ISIT), Honolulu, HI, USA, June 29–July 4 2014, pp. 3022–3026

  11. [11]

    Bounds between contraction coefficients,

    A. Makur and L. Zheng, “Bounds between contraction coefficients,” in Proceedings of the 53rd Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, September 29–October 2 2015, pp. 1422–1429

  12. [12]

    Strong data processing inequalities and Φ-Sobolev inequalities for discrete channels,

    M. Raginsky, “Strong data processing inequalities and Φ-Sobolev inequalities for discrete channels,”IEEE Transactions on Information Theory, vol. 62, no. 6, pp. 3355–3389, June 2016

  13. [13]

    Strong data-processing inequalities for channels and Bayesian networks,

    Y . Polyanskiy and Y . Wu, “Strong data-processing inequalities for channels and Bayesian networks,” inConvexity and Concentration, E. Carlen, M. Madiman, and E. M. Werner, Eds. New York, NY , USA: Springer, 2017, pp. 211–249

  14. [14]

    Strong data processing inequal- ities for input constrained additive noise channels,

    F. P. Calmon, Y . Polyanskiy, and Y . Wu, “Strong data processing inequal- ities for input constrained additive noise channels,”IEEE Transactions on Information Theory, vol. 64, no. 3, pp. 1879–1892, March 2018

  15. [15]

    Information contraction and decomposition,

    A. Makur, “Information contraction and decomposition,” Sc.D. Thesis in Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA, May 2019

  16. [16]

    Comparison of contraction coefficients for f-divergences,

    A. Makur and L. Zheng, “Comparison of contraction coefficients for f-divergences,”Problems of Information Transmission, vol. 56, no. 2, pp. 103–156, April 2020

  17. [17]

    Universal features for high-dimensional learning and inference,

    S.-L. Huang, A. Makur, G. W. Wornell, and L. Zheng, “Universal features for high-dimensional learning and inference,”Foundations and Trends in Communications and Information Theory, vol. 21, no. 1–2, pp. 1–299, February 2024

  18. [18]

    Contraction coefficients of product symmetric channels,

    D. Lee and A. Makur, “Contraction coefficients of product symmetric channels,” inProceedings of the 60th Annual Allerton Conference on Communication, Control, and Computing, Urbana, IL, USA, September 24–27 2024, pp. 1–8

  19. [19]

    Doeblin coefficients and related measures,

    A. Makur and J. Singh, “Doeblin coefficients and related measures,” IEEE Transactions on Information Theory, vol. 70, no. 7, pp. 4667–4692, July 2024

  20. [20]

    Sur les propriétés asymptotiques de mouvement régis par certains types de chaînes simples,

    W. Doeblin, “Sur les propriétés asymptotiques de mouvement régis par certains types de chaînes simples,”Bulletin Mathématique de la Société Roumaine des Sciences, vol. 39, no. 1, pp. 57–115, 1937, in French

  21. [21]

    Exposé de la théorie des chaînes simples constantes de Markov à un nombre fini d’états,

    W. Doeblin, “Exposé de la théorie des chaînes simples constantes de Markov à un nombre fini d’états,”Revue Mathématique de l’Union Interbalkanique, vol. 2, pp. 77–105, 1938, in French

  22. [22]

    Approximation of sojourn-times via maximal couplings: motif frequency distributions,

    M. E. Lladser and S. R. Chestnut, “Approximation of sojourn-times via maximal couplings: motif frequency distributions,”Journal of Mathematical Biology, vol. 69, no. 1, pp. 147–182, July 2014

  23. [23]

    General state space Markov chains and MCMC algorithms,

    G. O. Roberts and J. S. Rosenthal, “General state space Markov chains and MCMC algorithms,”Probability Surveys, vol. 1, pp. 20–71, 2004

  24. [24]

    Mathematical aspects of mixing times in Markov chains,

    R. Montenegro and P. Tetali, “Mathematical aspects of mixing times in Markov chains,”Foundations and Trends in Theoretical Computer Science, vol. 1, no. 3, pp. 237–354, 2006

  25. [25]

    Dissipation of information in channels with input constraints,

    Y . Polyanskiy and Y . Wu, “Dissipation of information in channels with input constraints,”IEEE Transactions on Information Theory, vol. 62, no. 1, pp. 35–55, January 2016

  26. [26]

    A stable limit theorem for Markov chains,

    S. R. Kimbleton, “A stable limit theorem for Markov chains,”The Annals of Mathematical Statistics, vol. 40, no. 4, pp. 1467–1473, August 1969

  27. [27]

    Iterated random maps and some classes of Markov processes,

    R. Bhattacharya and E. C. Waymire, “Iterated random maps and some classes of Markov processes,” inStochastic Processes: Theory and Methods, ser. Handbook of Statistics, D. N. Shanbhag and C. R. Rao, Eds. Amsterdam, Netherlands: Elsevier, 2001, vol. 19, pp. 145–170

  28. [28]

    Occupancy distributions in Markov chains via Doeblin’s ergodicity coefficient,

    S. Chestnut and M. E. Lladser, “Occupancy distributions in Markov chains via Doeblin’s ergodicity coefficient,” inProceedings of the 21st International Meeting on Probabilistic, Combinatorial, and Asymptotic Methods in the Analysis of Algorithms, Vienna, Austria, June 28–July 2 2010, pp. 79–92. 41

  29. [29]

    Coding for positive rate in the source model key agreement problem,

    A. Gohari, O. Günlü, and G. Kramer, “Coding for positive rate in the source model key agreement problem,”IEEE Transactions on Information Theory, vol. 66, no. 10, pp. 6303–6323, October 2020

  30. [30]

    Coding theorems for noisy permutation channels,

    A. Makur, “Coding theorems for noisy permutation channels,”IEEE Transactions on Information Theory, vol. 66, no. 11, pp. 6723–6748, November 2020

  31. [31]

    Change detection of Markov kernels with unknown pre and post change kernel,

    H. Chen, J. Tang, and A. Gupta, “Change detection of Markov kernels with unknown pre and post change kernel,” inProceedings of the 61st IEEE Conference on Decision and Control (CDC), Cancún, Mexico, December 6–9 2022, pp. 4814–4820

  32. [32]

    Finite-time analysis of round-robin Kullback-Leibler upper confidence bounds for optimal adaptive allocation with multiple plays and Markovian rewards,

    V . Moulos, “Finite-time analysis of round-robin Kullback-Leibler upper confidence bounds for optimal adaptive allocation with multiple plays and Markovian rewards,” inProceedings of the Advances in Neural Information Processing Systems 33 (NeurIPS), December 6–12 2020, pp. 7863–7874

  33. [33]

    Minorization conditions and convergence rates for Markov chain Monte Carlo,

    J. S. Rosenthal, “Minorization conditions and convergence rates for Markov chain Monte Carlo,”Journal of the American Statistical Association, vol. 90, no. 430, pp. 558–566, June 1995

  34. [34]

    Éléments d’une théorie générale des chaînes simples constantes de Markoff,

    W. Doeblin, “Éléments d’une théorie générale des chaînes simples constantes de Markoff,”Annales scientifiques de l’École Normale Supérieure, Serie 3, vol. 57, pp. 61–111, 1940

  35. [35]

    Prefixes and the entropy rate for long- range sources,

    I. Kontoyiannis and Y . M. Suhov, “Prefixes and the entropy rate for long- range sources,” inProceedings of the IEEE International Symposium on Information Theory (ISIT), Trondheim, Norway, June 27–July 1 1994, p. 194

  36. [36]

    Comparison of channels: Criteria for domination by a symmetric channel,

    A. Makur and Y . Polyanskiy, “Comparison of channels: Criteria for domination by a symmetric channel,”IEEE Transactions on Information Theory, vol. 64, no. 8, pp. 5704–5725, August 2018

  37. [37]

    On an extension of the notion of f-divergence,

    A. A. Gushchin, “On an extension of the notion of f-divergence,”Theory of Probability & Its Applications, vol. 52, no. 3, pp. 439–455, 2008

  38. [38]

    Information processing equalities and the information–risk bridge,

    R. C. Williamson and Z. Cranko, “Information processing equalities and the information–risk bridge,”Journal of Machine Learning Research, vol. 25, no. 103, 2024

  39. [39]

    An operational approach to information leakage,

    I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,”IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, March 2020

  40. [40]

    Bounds on maximal leakage over Bayesian networks,

    A. Makur and J. Singh, “Bounds on maximal leakage over Bayesian networks,” inProceedings of the IEEE International Symposium on Information Theory (ISIT), Ann Arbor, MI, USA, June 22–27 2025, pp. 1–6

  41. [41]

    Generalization bounds for noisy iterative algorithms using properties of additive noise channels,

    H. Wang, R. Gao, and F. P. Calmon, “Generalization bounds for noisy iterative algorithms using properties of additive noise channels,”Journal of Machine Learning Research, vol. 24, no. 1, pp. 1–43, January 2023

  42. [42]

    Local differential privacy is equivalent to contraction of an f-divergence,

    S. Asoodeh, M. Aliakbarpour, and F. P. Calmon, “Local differential privacy is equivalent to contraction of an f-divergence,” inProceedings of the IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, July 12–20 2021, pp. 545–550

  43. [43]

    LOCO: Adaptive exploration in reinforcement learning via local estimation of contraction coefficients,

    M. Diaz, L. Paull, and P. S. Castro, “LOCO: Adaptive exploration in reinforcement learning via local estimation of contraction coefficients,” inSelf-Supervision for Reinforcement Learning Workshop - ICLR, May 7 2021, pp. 1–18

  44. [44]

    Hiriart-Urruty and C

    J.-B. Hiriart-Urruty and C. Lemaréchal,Fundamentals of Convex Analysis, ser. Grundlehren Text Editions. Berlin, Heidelberg, Germany: Springer, 2001

  45. [45]

    Çinlar,Probability and Stochastics

    E. Çinlar,Probability and Stochastics. New York, NY , USA: Springer, 2011

  46. [46]

    Polyanskiy and Y

    Y . Polyanskiy and Y . Wu,Information Theory: From Coding to Learning. Cambridge, UK: Cambridge University Press, 2025

  47. [47]

    Thorisson,Coupling, Stationarity, and Regeneration

    H. Thorisson,Coupling, Stationarity, and Regeneration. New York, NY , USA: Springer New York, 2000

  48. [48]

    On properties of Doeblin coefficients,

    A. Makur and J. Singh, “On properties of Doeblin coefficients,” in Proceedings of the IEEE International Symposium on Information Theory (ISIT), Taipei, Taiwan, June 25–30 2023, pp. 72–77

  49. [49]

    Kallenberg,Foundations of Modern Probability, 3rd ed

    O. Kallenberg,Foundations of Modern Probability, 3rd ed. Cham, Switzerland: Springer, 2021

  50. [50]

    Towards the general definition of the amount of information,

    I. M. Gelfand, A. N. Kolmogorov, and A. M. Yaglom, “Towards the general definition of the amount of information,”Doklady Akademii Nauk SSSR, vol. 111, pp. 745–748, 1956, in Russian

  51. [51]

    On a Gel’fand-Yaglom-Peres theorem for f- divergences,

    G. L. Gilardoni, “On a Gel’fand-Yaglom-Peres theorem for f- divergences,” November 2009, arXiv:0911.1934 [cs.IT]. [Online]. Available: https://arxiv.org/abs/0911.1934

  52. [52]

    Ueber die kleinste kugel, die eine räumliche figur einschliesst

    H. Jung, “Ueber die kleinste kugel, die eine räumliche figur einschliesst.” Journal für die reine und angewandte Mathematik, vol. 123, pp. 241–257, 1901, in German

  53. [53]

    On the exact value of Jung constants of Orlicz sequence spaces,

    Y . Q. Yan, “On the exact value of Jung constants of Orlicz sequence spaces,”Collectanea Mathematica, vol. 55, no. 2, pp. 163–170, 2004

  54. [54]

    Convex regions and projections in Minkowski spaces,

    F. Bohnenblust, “Convex regions and projections in Minkowski spaces,” Annals of Mathematics, vol. 39, no. 2, pp. 301–308, April 1938

  55. [55]

    On a q-central limit theorem consistent with nonextensive statistical mechanics,

    S. Umarov, C. Tsallis, and S. Steinberg, “On a q-central limit theorem consistent with nonextensive statistical mechanics,”Milan Journal of Mathematics, vol. 76, pp. 307–328, 2008

  56. [56]

    Non-convex learning via stochastic gradient Langevin dynamics: a nonasymptotic analysis,

    M. Raginsky, A. Rakhlin, and M. Telgarsky, “Non-convex learning via stochastic gradient Langevin dynamics: a nonasymptotic analysis,” in Proceedings of the 2017 Conference on Learning Theory, Amsterdam, Netherlands, July 7–10 2017, pp. 1674–1703

  57. [57]

    Probabilistic logics and the synthesis of reliable organisms from unreliable components,

    J. von Neumann, “Probabilistic logics and the synthesis of reliable organisms from unreliable components,” inAutomata Studies, ser. Annals of Mathematics Studies, C. E. Shannon and J. McCarthy, Eds. Princeton, NJ, USA: Princeton University Press, 1956, vol. 34, pp. 43–98

  58. [58]

    On the maximum tolerable noise for reliable computation by formulas,

    B. Hajek and T. Weller, “On the maximum tolerable noise for reliable computation by formulas,”IEEE Transactions on Infomation Theory, vol. 37, no. 2, pp. 388–391, March 1991

  59. [59]

    On the maximum tolerable noise of k-input gates for reliable computation by formulas,

    W. S. Evans and L. J. Schulman, “On the maximum tolerable noise of k-input gates for reliable computation by formulas,”IEEE Transactions on Information Theory, vol. 49, no. 11, pp. 3094–3098, November 2003

  60. [60]

    Reliable computation by large-alphabet formulas in the presence of noise,

    A. K. Tan, M. H. Ho, and I. L. Chuang, “Reliable computation by large-alphabet formulas in the presence of noise,”IEEE Transactions on Information Theory, vol. 70, no. 12, December 2024

  61. [61]

    D. A. Levin, Y . Peres, and E. L. Wilmer,Markov Chains and Mixing Times, 1st ed. Providence, RI, USA: American Mathematical Society, 2009

  62. [62]

    Limiting privacy breaches in privacy preserving data mining,

    A. Evfimievski, J. Gehrke, and R. Srikant, “Limiting privacy breaches in privacy preserving data mining,” inProceedings of the 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Sys- tems, San Diego, CA, USA, June 9–11 2003, pp. 211–222

  63. [63]

    What can we learn privately?

    S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith, “What can we learn privately?”SIAM Journal on Computing, vol. 40, no. 3, pp. 793–826, 2011

  64. [64]

    Three variants of differential privacy: Lossless conversion and applications,

    S. Asoodeh, J. Liao, F. P. Calmon, O. Kosut, and L. Sankar, “Three variants of differential privacy: Lossless conversion and applications,” IEEE Journal on Selected Areas in Information Theory, vol. 2, no. 1, pp. 208–222, March 2021

  65. [65]

    RAPPOR: Randomized aggregatable privacy-preserving ordinal response,

    U. Erlingsson, V . Pihur, and A. Korolova, “RAPPOR: Randomized aggregatable privacy-preserving ordinal response,” inProceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AZ, USA, November 3–7 2014, pp. 1054–1067

  66. [66]

    LDP-Fed: Federated learning with local differential privacy,

    S. Truex, L. Liu, K.-H. Chow, M. E. Gursoy, and W. Wei, “LDP-Fed: Federated learning with local differential privacy,” inProceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, Heraklion, Greece, April 27 2020, pp. 61–66

  67. [67]

    The algorithmic foundations of differential privacy,

    C. Dwork and A. Roth, “The algorithmic foundations of differential privacy,”Foundations and Trends® in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211–407, 2014

  68. [68]

    The composition theorem for differential privacy,

    P. Kairouz, S. Oh, and P. Viswanath, “The composition theorem for differential privacy,” inProceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France, July 6–11 2015, pp. 1376– 1385

  69. [69]

    Privacy amplification of iterative algorithms via contraction coefficients,

    S. Asoodeh, M. Diaz, and F. P. Calmon, “Privacy amplification of iterative algorithms via contraction coefficients,” inProceedings of the IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, June 21–26 2020, pp. 896–901

  70. [70]

    Uncertainty, information, and sequential experiments,

    M. H. DeGroot, “Uncertainty, information, and sequential experiments,” The Annals of Mathematical Statistics, vol. 33, no. 2, June 1962

  71. [71]

    Approximation in Sobolev spaces of nonlinear expressions involving the gradient,

    P. Hajłasz and J. Malý, “Approximation in Sobolev spaces of nonlinear expressions involving the gradient,”Arkiv för Matematik, vol. 40, no. 2, pp. 245–274, October 2002

  72. [72]

    S. P. Boyd and L. Vandenberghe,Convex Optimization. New York, NY , USA: Cambridge University Press, 2009

  73. [73]

    Talagrand meets Talagrand: Upper and lower bounds on expected soft maxima of Gaussian processes with finite index sets,

    Y . Chu and M. Raginsky, “Talagrand meets Talagrand: Upper and lower bounds on expected soft maxima of Gaussian processes with finite index sets,” inProceedings of the 37th International Conference on Algorithmic Learning Theory, Toronto, ON, Canada, February 23–26 2026, pp. 1–17

  74. [74]

    M. J. Wainwright,High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge, UK: Cambridge University Press, 2019

  75. [75]

    The contraction principle as a particular case of Kleene’s fixed point theorem,

    A. Baranga, “The contraction principle as a particular case of Kleene’s fixed point theorem,”Discrete Mathematics, vol. 98, no. 1, pp. 75–79, December 1991

  76. [76]

    Individual privacy accounting via a Rényi filter,

    V . Feldman and T. Zrnic, “Individual privacy accounting via a Rényi filter,” inProceedings of the Advances in Neural Information Processing Systems 34 (NeurIPS), December 6–14 2021, pp. 28 080–28 091

  77. [77]

    Learning with user-level privacy,

    D. Levy, Z. Sun, K. Amin, S. Kale, A. Kulesza, M. Mohri, and A. T. Suresh, “Learning with user-level privacy,” inProceedings of the Advances in Neural Information Processing Systems 34 (NeurIPS), December 6–14 2021, pp. 12 466–12 479. 42

  78. [78]

    Rudin,Principles of Mathematical Analysis, 3rd ed

    W. Rudin,Principles of Mathematical Analysis, 3rd ed. New York, NY , USA: McGraw-Hill, 1976

  79. [79]

    On the spectral norm of Gaussian random matrices,

    R. Van Handel, “On the spectral norm of Gaussian random matrices,” Transactions of the American Mathematical Society, vol. 369, no. 11, pp. 8161–8178, November 2017

  80. [80]

    Vershynin,High-Dimensional Probability, 1st ed

    R. Vershynin,High-Dimensional Probability, 1st ed. Cambridge, UK: Cambridge University Press, 2018

Showing first 80 references.