pith. sign in

arxiv: 2606.03033 · v1 · pith:WOF25IK5new · submitted 2026-06-02 · 💻 cs.IT · math.IT· math.ST· stat.CO· stat.ML· stat.TH

Local and Global Contraction Principles for MCMC Mixing

Pith reviewed 2026-06-28 08:39 UTC · model grok-4.3

classification 💻 cs.IT math.ITmath.STstat.COstat.MLstat.TH
keywords MCMC mixing timescontraction coefficientsE_gamma divergenceprojected Langevin Monte Carloindependent Metropolis-HastingsGaussian smoothingrandom batch samplingwarm-start convergence
0
0 comments X

The pith

Gaussian smoothing produces an explicit global contraction coefficient for projected Langevin Monte Carlo under the E_γ-divergence on compact convex domains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a contraction-based framework for proving mixing-time bounds for Markov chain Monte Carlo algorithms using global and local contraction coefficients of Markov kernels under the E_γ-divergence with γ ≥ 1. For projected Langevin Monte Carlo, Gaussian smoothing yields an explicit global contraction coefficient that directly establishes exponential convergence to the discretized stationary distribution for general smooth, possibly non-convex potentials. The resulting rate is explicit, works with arbitrary random-batch sampling, and supplies guarantees for KL, χ², and Rényi divergences. For independent Metropolis-Hastings with unbounded importance weights, the framework switches to a local contraction coefficient on the core set where weights are bounded and shows that this coefficient controls the rejection profile, producing warm-start bounds governed by the local coefficient and the tail probability.

Core claim

The central claim is that global contraction coefficients under the E_γ-divergence can be obtained explicitly via Gaussian smoothing for projected Langevin Monte Carlo on compact convex domains, directly proving exponential convergence to the stationary distribution for smooth possibly non-convex potentials, with the rate accommodating random-batch sampling and extending to several divergences; separately, local contraction coefficients on the bounded-weight core control rejection in independent Metropolis-Hastings and yield warm-start convergence bounds even when no moment of order greater than one exists.

What carries the argument

Global and local contraction coefficients of Markov kernels under the E_γ-divergence, with Gaussian smoothing supplying the global coefficient for Langevin dynamics and the bounded-weight core supplying the local coefficient for Metropolis-Hastings.

If this is right

  • Explicit exponential convergence rates hold for projected Langevin Monte Carlo with arbitrary smooth possibly non-convex potentials.
  • The same rates supply convergence guarantees simultaneously for KL, χ², and Rényi divergences.
  • The rates remain valid under any random-batch sampling scheme.
  • For independent Metropolis-Hastings, warm-start bounds are controlled by the local contraction coefficient on the core and the tail profile H_R.
  • The bounds recover existing sharp moment-based rates whenever a finite moment of order p > 1 exists and continue to apply in heavy-tailed regimes where no such moment exists.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The local-versus-global distinction may extend to other sampling kernels whose importance weights are unbounded.
  • The explicit form of the contraction coefficient could be used to optimize step sizes or batch sizes in practice.
  • The framework might be combined with other divergence analyses to obtain hybrid mixing guarantees.
  • Numerical verification of the predicted rates on moderate-dimensional non-convex targets would test whether the constants are sharp.

Load-bearing premise

The target potential must be smooth and the domain must be compact and convex so that Gaussian smoothing produces a well-defined global contraction coefficient under the E_γ-divergence.

What would settle it

A counter-example computation on a smooth non-convex potential over a compact convex domain where the projected Langevin chain fails to exhibit the predicted exponential decay rate in E_γ-divergence to its discretized stationary distribution.

read the original abstract

We develop a contraction-based framework for proving mixing-time bounds for Markov chain Monte Carlo algorithms. The framework is built around global and local contraction coefficients of Markov kernels under the $\mathsf E_\gamma$-divergence with $\gamma\ge1$. For projected Langevin Monte Carlo on a compact convex domain, we show that Gaussian smoothing yields an explicit global contraction coefficient for the $\mathsf E_\gamma$-divergence. This gives a direct proof of exponential convergence to the discretized stationary distribution for general smooth, possibly non-convex potentials. The rate is explicit, accommodates arbitrary random-batch sampling schemes, and yields convergence guarantees for several divergences, including KL, $\chi^2$, and R\'enyi divergences. For independent Metropolis--Hastings with target $\pi$, proposal $q$, and unbounded importance weight $w=d\pi/dq$, global contraction coefficients are typically trivial. We therefore introduce a local contraction coefficient on the core $C_R=\{w\le R\}$ and prove that it controls the rejection profile on the core. This yields warm-start convergence bounds governed by the local contraction coefficient and the tail profile $H_R=\pi(w>R)$, recovering sharp existing moment-based convergence rates when $\mathbb E_q[w^p]<\infty$ for some $p>1$, while remaining effective in heavy-tailed regimes where no finite moment of order $p>1$ exists.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript develops a contraction-based framework for MCMC mixing-time analysis centered on global and local contraction coefficients under the E_γ-divergence (γ ≥ 1). For projected Langevin Monte Carlo on compact convex domains, Gaussian smoothing is shown to produce an explicit global contraction coefficient, yielding exponential convergence to the discretized stationary distribution for smooth, possibly non-convex potentials; the rates are explicit, accommodate arbitrary random-batch sampling, and extend to KL, χ², and Rényi divergences. For independent Metropolis-Hastings, a local contraction coefficient is introduced on the core C_R = {w ≤ R} to control the rejection profile, producing warm-start bounds governed by this coefficient and the tail profile H_R that recover sharp moment-based rates when E_q[w^p] < ∞ for p > 1 and remain effective in heavy-tailed regimes.

Significance. If the derivations hold, the framework supplies explicit, verifiable contraction coefficients and mixing bounds in regimes (non-convex potentials, heavy tails, random batches) where many existing analyses are either implicit or require stronger assumptions. The recovery of known moment-based rates as a special case and the direct extension across multiple divergences are concrete strengths. The introduction of the local contraction coefficient on C_R is a useful technical device that is independent of prior self-citations.

minor comments (3)
  1. [§2] §2 (notation): the precise definition of the E_γ-divergence and its relation to the standard Rényi family could be stated in one displayed equation for immediate reference.
  2. [§4.2] §4.2 (local coefficient): an explicit one-line example computing the local contraction coefficient for a simple two-point distribution on C_R would clarify the definition before the general proof.
  3. [Figure 1] Figure 1 (caption): the caption should explicitly note which divergence is plotted and whether the plotted quantity is the contraction coefficient itself or its logarithm.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our manuscript and the recommendation of minor revision. The referee's description accurately reflects the framework based on global and local contraction coefficients under the E_γ-divergence. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper defines global and local contraction coefficients for the E_γ-divergence as new quantities and then derives explicit bounds for projected Langevin Monte Carlo and independent Metropolis-Hastings under the stated assumptions (compact convex domain, smooth potentials). These definitions are applied directly to the Markov kernels without any reduction of the claimed rates or convergence statements back to fitted parameters, self-citations, or prior ansatzes by construction. The derivation chain remains self-contained against external benchmarks and does not invoke load-bearing self-citations or uniqueness theorems from the authors' prior work.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The framework rests on standard properties of Markov kernels and divergences; no free parameters or invented entities with independent evidence are visible in the abstract.

axioms (2)
  • standard math E_γ-divergence satisfies contraction properties under Markov kernels for γ ≥ 1
    Invoked to define global and local contraction coefficients.
  • domain assumption Gaussian smoothing on compact convex domains produces well-defined kernels for smooth potentials
    Required for the explicit global contraction result in projected Langevin Monte Carlo.
invented entities (1)
  • local contraction coefficient on core C_R no independent evidence
    purpose: Controls rejection profile for independent Metropolis-Hastings when global coefficient is trivial
    Newly introduced definition to handle unbounded importance weights.

pith-pipeline@v0.9.1-grok · 5787 in / 1496 out tokens · 26862 ms · 2026-06-28T08:39:07.566374+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

95 extracted references · 6 canonical work pages · 1 internal anchor

  1. [1]

    Advances in Neural Information Processing Systems , title=

    Vempala, Santosh and Wibisono, Andre , journal=. Advances in Neural Information Processing Systems , title=

  2. [2]

    Fixed Point Theory in Metric Spaces: Recent Advances and Applications , pages=

    Banach contraction principle and applications , author=. Fixed Point Theory in Metric Spaces: Recent Advances and Applications , pages=. 2018 , publisher=

  3. [3]

    Sur les op

    Banach, Stefan , journal=. Sur les op. 1922 , publisher=

  4. [4]

    Liu, Jingbo and Cuff, Paul and Verdú, Sergio , journal=. E_. 2017 , volume=

  5. [5]

    Resolving the Mixing Time of the

    Altschuler, Jason and Talwar, Kunal , booktitle =. Resolving the Mixing Time of the. 2023 , editor =

  6. [6]

    Stein and Rami Shakarchi , publisher =

    Elias M. Stein and Rami Shakarchi , publisher =. Real Analysis: Measure Theory, Integration, and Hilbert Spaces , urldate =

  7. [7]

    Contraction of

    Asoodeh, Shahab and Diaz, Mario and Calmon, Flavio P , journal=. Contraction of

  8. [8]

    International Conference on Machine Learning , pages=

    Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising , author=. International Conference on Machine Learning , pages=. 2018 , organization=

  9. [9]

    2017 , publisher=

    Levin, David A and Peres, Yuval , volume=. 2017 , publisher=

  10. [10]

    Coffey, William and Kalmykov, Yu P , volume=. The. 2012 , publisher=

  11. [11]

    Projected stochastic gradient

    Lamperski, Andrew , booktitle=. Projected stochastic gradient. 2021 , organization=

  12. [12]

    Analysis and geometry of

    Bakry, Dominique and Gentil, Ivan and Ledoux, Michel and others , volume=. Analysis and geometry of. 2014 , publisher=

  13. [13]

    Analysis of

    Chewi, Sinho and Erdogdu, Murat A and Li, Mufan and Shen, Ruoqi and Zhang, Shunshi , booktitle=. Analysis of. 2022 , organization=

  14. [14]

    On the convergence of

    Erdogdu, Murat A and Hosseinzadeh, Rasa , booktitle=. On the convergence of. 2021 , organization=

  15. [15]

    Convergence of

    Erdogdu, Murat A and Hosseinzadeh, Rasa and Zhang, Shunshi , booktitle=. Convergence of. 2022 , organization=

  16. [16]

    and He, Ye and Balasubramanian, Krishna and Erdogdu, Murat A

    Mousavi-Hosseini, Alireza and Farghly, Tyler K. and He, Ye and Balasubramanian, Krishna and Erdogdu, Murat A. , booktitle =. Towards a Complete Analysis of. 2023 , volume =

  17. [17]

    Towards a theory of non-log-concave sampling: first-order stationarity guarantees for

    Balasubramanian, Krishna and Chewi, Sinho and Erdogdu, Murat A and Salim, Adil and Zhang, Shunshi , booktitle=. Towards a theory of non-log-concave sampling: first-order stationarity guarantees for. 2022 , organization=

  18. [18]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Theoretical guarantees for approximate sampling from smooth and log-concave densities , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2017 , publisher=

  19. [19]

    Convergence of

    Cheng, Xiang and Bartlett, Peter , booktitle =. Convergence of. 2018 , editor =

  20. [20]

    Sharp convergence rates for

    Cheng, Xiang and Chatterji, Niladri S and Abbasi-Yadkori, Yasin and Bartlett, Peter L and Jordan, Michael I , journal=. Sharp convergence rates for

  21. [21]

    High-dimensional Bayesian inference via the Unadjusted

    Durmus, Alain and Moulines, Eric , journal=. High-dimensional Bayesian inference via the Unadjusted

  22. [22]

    User-friendly guarantees for the

    Dalalyan, Arnak S and Karagulyan, Avetik , journal=. User-friendly guarantees for the. 2019 , publisher=

  23. [23]

    Analysis of

    Durmus, Alain and Majewski, Szymon and Miasojedow, B. Analysis of. Journal of Machine Learning Research , volume=

  24. [24]

    Unadjusted

    Nguyen, Dao and Dang, Xin and Chen, Yixin , journal=. Unadjusted. 2023 , publisher=

  25. [25]

    Proceedings of the National Academy of Sciences , volume=

    Sampling can be faster than optimization , author=. Proceedings of the National Academy of Sciences , volume=. 2019 , publisher=

  26. [26]

    Sampling from a log-concave distribution with projected

    Bubeck, S. Sampling from a log-concave distribution with projected. Discrete & Computational Geometry , volume=. 2018 , publisher=

  27. [27]

    Bayesian learning via stochastic gradient

    Welling, Max and Teh, Yee W , booktitle=. Bayesian learning via stochastic gradient. 2011 , organization=

  28. [28]

    Proceedings of the 2016 ACM SIGSAC conference on computer and communications security , pages=

    Deep learning with differential privacy , author=. Proceedings of the 2016 ACM SIGSAC conference on computer and communications security , pages=

  29. [29]

    arXiv preprint arXiv:2305.09903 , year=

    Privacy Loss of Noisy Stochastic Gradient Descent Might Converge Even for Non-Convex Losses , author=. arXiv preprint arXiv:2305.09903 , year=

  30. [30]

    Faster differentially private samplers via

    Ganesh, Arun and Talwar, Kunal , journal=. Faster differentially private samplers via

  31. [31]

    Lehec, Joseph , journal=. The. 2023 , publisher=

  32. [32]

    Global Convergence of

    Xu, Pan and Chen, Jinghui and Zou, Difan and Gu, Quanquan , booktitle =. Global Convergence of

  33. [33]

    Non-convex learning via stochastic gradient

    Raginsky, Maxim and Rakhlin, Alexander and Telgarsky, Matus , booktitle=. Non-convex learning via stochastic gradient. 2017 , organization=

  34. [34]

    The Annals of Applied Probability , volume=

    NONASYMPTOTIC BOUNDS FOR SAMPLING ALGORITHMS WITHOUT LOG-CONCAVITY , author=. The Annals of Applied Probability , volume=. 2020 , publisher=

  35. [35]

    Fast conditional mixing of

    Cheng, Xiang and Wang, Bohan and Zhang, Jingzhao and Zhu, Yusong , journal=. Fast conditional mixing of

  36. [36]

    Improved bounds for discretization of

    Mou, Wenlong and Flammarion, Nicolas and Wainwright, Martin J and Bartlett, Peter L , journal=. Improved bounds for discretization of. 2022 , publisher=

  37. [37]

    Constrained

    Zheng, Yuping and Lamperski, Andrew , booktitle =. Constrained

  38. [38]

    1991 , publisher=

    Probability with martingales , author=. 1991 , publisher=

  39. [39]

    Concentration of the

    Altschuler, Jason M and Talwar, Kunal , journal=. Concentration of the

  40. [40]

    Optimal scaling of discrete approximations to

    Roberts, Gareth O and Rosenthal, Jeffrey S , journal=. Optimal scaling of discrete approximations to

  41. [41]

    Nonconvex sampling with the

    Mangoubi, Oren and Vishnoi, Nisheeth K , booktitle =. Nonconvex sampling with the. 2019 , editor =

  42. [42]

    Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering

    Lee, Holden and Risteski, Andrej and Ge, Rong , booktitle =. Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering

  43. [43]

    Stochastic gradient and

    Cheng, Xiang and Yin, Dong and Bartlett, Peter and Jordan, Michael , booktitle=. Stochastic gradient and. 2020 , organization=

  44. [44]

    On stochastic gradient

    Chau, Ngoc Huy and Moulines,. On stochastic gradient. SIAM Journal on Mathematics of Data Science , volume=. 2021 , publisher=

  45. [45]

    Efficient constrained sampling via the mirror-

    Ahn, Kwangjun and Chewi, Sinho , journal=. Efficient constrained sampling via the mirror-

  46. [46]

    Underdamped

    Cheng, Xiang and Chatterji, Niladri S and Bartlett, Peter L and Jordan, Michael I , booktitle=. Underdamped. 2018 , organization=

  47. [47]

    On sampling from a log-concave density using kinetic

    Dalalyan, Arnak S and Riou-Durand, Lionel , journal=. On sampling from a log-concave density using kinetic

  48. [48]

    Bounding the error of discretized

    Dalalyan, Arnak S and Karagulyan, Avetik and Riou-Durand, Lionel , journal=. Bounding the error of discretized

  49. [49]

    Journal of the ACM , volume=

    Faster high-accuracy log-concave sampling via algorithmic warm starts , author=. Journal of the ACM , volume=. 2024 , publisher=

  50. [50]

    Constrained Sampling with Primal-Dual

    Chamon, Luiz FO and Karimi, Mohammad Reza and Korba, Anna , journal=. Constrained Sampling with Primal-Dual

  51. [51]

    Exponential Convergence of

    Roberts, Gareth O and Tweedie, Richard L , journal=. Exponential Convergence of. 1996 , publisher=

  52. [52]

    f -Divergence Inequalities , year=

    Sason, Igal and Verdú, Sergio , journal=. f -Divergence Inequalities , year=

  53. [53]

    Fast Convergence of -Divergence Along the Unadjusted

    Siddharth Mitra and Andre Wibisono , booktitle=. Fast Convergence of -Divergence Along the Unadjusted

  54. [54]

    On Independent Samples Along the

    Liang, Jiaming and Mitra, Siddharth and Wibisono, Andre , journal=. On Independent Samples Along the

  55. [55]

    Book draft available at

    Log-concave sampling , author=. Book draft available at

  56. [56]

    Optimizing the diffusion coefficient of overdamped

    Leli. Optimizing the diffusion of overdamped. arXiv preprint arXiv:2404.12087 , year=

  57. [57]

    Shifted composition I:

    Altschuler, Jason M and Chewi, Sinho , journal=. Shifted composition I:. 2024 , publisher=

  58. [58]

    Shifted composition II: shift

    Altschuler, Jason M and Chewi, Sinho , journal=. Shifted composition II: shift

  59. [59]

    Shifted Composition III: Local Error Framework for

    Altschuler, Jason M and Chewi, Sinho , journal=. Shifted Composition III: Local Error Framework for

  60. [60]

    2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) , pages=

    Privacy amplification by iteration , author=. 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) , pages=. 2018 , organization=

  61. [61]

    Advances in Neural Information Processing Systems , volume=

    Privacy of noisy stochastic gradient descent: More iterations without more privacy loss , author=. Advances in Neural Information Processing Systems , volume=

  62. [62]

    and Özgür, Ayfer , booktitle=

    Asoodeh, Shahab and Chen, Wei-Ning and Calmon, Flavio P. and Özgür, Ayfer , booktitle=. Differentially Private Federated Learning: An Information-Theoretic Perspective , year=

  63. [63]

    , booktitle=

    Asoodeh, Shahab and Diaz, Mario and Calmon, Flavio P. , booktitle=. Privacy Amplification of Iterative Algorithms via Contraction Coefficients , year=

  64. [64]

    Supplementary Material for Exponential Convergence of Projected

    Daeijavad, Alireza and Asoodeh, Shahab , year =. Supplementary Material for Exponential Convergence of Projected

  65. [65]

    and Zheng, L

    Makur, A. and Zheng, L. , title =. 2020 , issue_date =. doi:10.1134/S0032946020020015 , journal =

  66. [66]

    IEEE Transactions on Information Theory , volume=

    Strong data processing inequalities and -Sobolev inequalities for discrete channels , author=. IEEE Transactions on Information Theory , volume=. 2016 , publisher=

  67. [67]

    Zamanlooy and S

    B. Zamanlooy and S. Asoodeh and M. Diaz and Flavio Calmon , journal=. 2024 , pages=

  68. [68]

    1998 , publisher=

    Comparisons of Stochastic Matrices, with Applications in Information Theory, Statistics, Economics, and Population Sciences , author=. 1998 , publisher=

  69. [69]

    2014 , isbn =

    Nesterov, Yurii , title =. 2014 , isbn =

  70. [70]

    Projected

    Pang, Chenxu and Wang, Xiaojie and Wu, Yue , journal=. Projected. 2025 , publisher=

  71. [71]

    Stochastic Processes and their Applications , volume=

    The tamed unadjusted Langevin algorithm , author=. Stochastic Processes and their Applications , volume=. 2019 , publisher=

  72. [72]

    Algorithmic Learning Theory , pages=

    Fast Convergence of -Divergence Along the Unadjusted Langevin Algorithm and Proximal Sampler , author=. Algorithmic Learning Theory , pages=. 2025 , organization=

  73. [73]

    The Journal of Chemical Physics , author =

    Metropolis, Nicholas and Rosenbluth, Arianna W. and Rosenbluth, Marshall N. and Teller, Augusta H. and Teller, Edward , title =. The Journal of Chemical Physics , volume =. 1953 , month =. doi:10.1063/1.1699114 , url =

  74. [74]

    W. K. Hastings , journal =

  75. [75]

    K. L. Mengersen and R. L. Tweedie , title =. The Annals of Statistics , number =. 1996 , doi =

  76. [76]

    Biometrika , volume=

    Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms , author=. Biometrika , volume=. 1996 , publisher=

  77. [77]

    G. O. Roberts and A. Gelman and W. R. Gilks , title =. The Annals of Applied Probability , number =. 1997 , doi =

  78. [78]

    Conference on Learning Theory , pages=

    Optimal dimension dependence of the Metropolis-adjusted Langevin algorithm , author=. Conference on Learning Theory , pages=. 2021 , organization=

  79. [79]

    Journal of Machine Learning Research , volume=

    Log-concave sampling: Metropolis-Hastings algorithms are fast , author=. Journal of Machine Learning Research , volume=

  80. [80]

    Bernoulli , volume=

    Exact convergence analysis of the independent Metropolis-Hastings algorithms , author=. Bernoulli , volume=. 2022 , publisher=

Showing first 80 references.