Local and Global Contraction Principles for MCMC Mixing
Pith reviewed 2026-06-28 08:39 UTC · model grok-4.3
The pith
Gaussian smoothing produces an explicit global contraction coefficient for projected Langevin Monte Carlo under the E_γ-divergence on compact convex domains.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that global contraction coefficients under the E_γ-divergence can be obtained explicitly via Gaussian smoothing for projected Langevin Monte Carlo on compact convex domains, directly proving exponential convergence to the stationary distribution for smooth possibly non-convex potentials, with the rate accommodating random-batch sampling and extending to several divergences; separately, local contraction coefficients on the bounded-weight core control rejection in independent Metropolis-Hastings and yield warm-start convergence bounds even when no moment of order greater than one exists.
What carries the argument
Global and local contraction coefficients of Markov kernels under the E_γ-divergence, with Gaussian smoothing supplying the global coefficient for Langevin dynamics and the bounded-weight core supplying the local coefficient for Metropolis-Hastings.
If this is right
- Explicit exponential convergence rates hold for projected Langevin Monte Carlo with arbitrary smooth possibly non-convex potentials.
- The same rates supply convergence guarantees simultaneously for KL, χ², and Rényi divergences.
- The rates remain valid under any random-batch sampling scheme.
- For independent Metropolis-Hastings, warm-start bounds are controlled by the local contraction coefficient on the core and the tail profile H_R.
- The bounds recover existing sharp moment-based rates whenever a finite moment of order p > 1 exists and continue to apply in heavy-tailed regimes where no such moment exists.
Where Pith is reading between the lines
- The local-versus-global distinction may extend to other sampling kernels whose importance weights are unbounded.
- The explicit form of the contraction coefficient could be used to optimize step sizes or batch sizes in practice.
- The framework might be combined with other divergence analyses to obtain hybrid mixing guarantees.
- Numerical verification of the predicted rates on moderate-dimensional non-convex targets would test whether the constants are sharp.
Load-bearing premise
The target potential must be smooth and the domain must be compact and convex so that Gaussian smoothing produces a well-defined global contraction coefficient under the E_γ-divergence.
What would settle it
A counter-example computation on a smooth non-convex potential over a compact convex domain where the projected Langevin chain fails to exhibit the predicted exponential decay rate in E_γ-divergence to its discretized stationary distribution.
read the original abstract
We develop a contraction-based framework for proving mixing-time bounds for Markov chain Monte Carlo algorithms. The framework is built around global and local contraction coefficients of Markov kernels under the $\mathsf E_\gamma$-divergence with $\gamma\ge1$. For projected Langevin Monte Carlo on a compact convex domain, we show that Gaussian smoothing yields an explicit global contraction coefficient for the $\mathsf E_\gamma$-divergence. This gives a direct proof of exponential convergence to the discretized stationary distribution for general smooth, possibly non-convex potentials. The rate is explicit, accommodates arbitrary random-batch sampling schemes, and yields convergence guarantees for several divergences, including KL, $\chi^2$, and R\'enyi divergences. For independent Metropolis--Hastings with target $\pi$, proposal $q$, and unbounded importance weight $w=d\pi/dq$, global contraction coefficients are typically trivial. We therefore introduce a local contraction coefficient on the core $C_R=\{w\le R\}$ and prove that it controls the rejection profile on the core. This yields warm-start convergence bounds governed by the local contraction coefficient and the tail profile $H_R=\pi(w>R)$, recovering sharp existing moment-based convergence rates when $\mathbb E_q[w^p]<\infty$ for some $p>1$, while remaining effective in heavy-tailed regimes where no finite moment of order $p>1$ exists.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a contraction-based framework for MCMC mixing-time analysis centered on global and local contraction coefficients under the E_γ-divergence (γ ≥ 1). For projected Langevin Monte Carlo on compact convex domains, Gaussian smoothing is shown to produce an explicit global contraction coefficient, yielding exponential convergence to the discretized stationary distribution for smooth, possibly non-convex potentials; the rates are explicit, accommodate arbitrary random-batch sampling, and extend to KL, χ², and Rényi divergences. For independent Metropolis-Hastings, a local contraction coefficient is introduced on the core C_R = {w ≤ R} to control the rejection profile, producing warm-start bounds governed by this coefficient and the tail profile H_R that recover sharp moment-based rates when E_q[w^p] < ∞ for p > 1 and remain effective in heavy-tailed regimes.
Significance. If the derivations hold, the framework supplies explicit, verifiable contraction coefficients and mixing bounds in regimes (non-convex potentials, heavy tails, random batches) where many existing analyses are either implicit or require stronger assumptions. The recovery of known moment-based rates as a special case and the direct extension across multiple divergences are concrete strengths. The introduction of the local contraction coefficient on C_R is a useful technical device that is independent of prior self-citations.
minor comments (3)
- [§2] §2 (notation): the precise definition of the E_γ-divergence and its relation to the standard Rényi family could be stated in one displayed equation for immediate reference.
- [§4.2] §4.2 (local coefficient): an explicit one-line example computing the local contraction coefficient for a simple two-point distribution on C_R would clarify the definition before the general proof.
- [Figure 1] Figure 1 (caption): the caption should explicitly note which divergence is plotted and whether the plotted quantity is the contraction coefficient itself or its logarithm.
Simulated Author's Rebuttal
We thank the referee for the positive summary of our manuscript and the recommendation of minor revision. The referee's description accurately reflects the framework based on global and local contraction coefficients under the E_γ-divergence. No specific major comments were raised in the report.
Circularity Check
No significant circularity
full rationale
The paper defines global and local contraction coefficients for the E_γ-divergence as new quantities and then derives explicit bounds for projected Langevin Monte Carlo and independent Metropolis-Hastings under the stated assumptions (compact convex domain, smooth potentials). These definitions are applied directly to the Markov kernels without any reduction of the claimed rates or convergence statements back to fitted parameters, self-citations, or prior ansatzes by construction. The derivation chain remains self-contained against external benchmarks and does not invoke load-bearing self-citations or uniqueness theorems from the authors' prior work.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math E_γ-divergence satisfies contraction properties under Markov kernels for γ ≥ 1
- domain assumption Gaussian smoothing on compact convex domains produces well-defined kernels for smooth potentials
invented entities (1)
-
local contraction coefficient on core C_R
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Advances in Neural Information Processing Systems , title=
Vempala, Santosh and Wibisono, Andre , journal=. Advances in Neural Information Processing Systems , title=
-
[2]
Fixed Point Theory in Metric Spaces: Recent Advances and Applications , pages=
Banach contraction principle and applications , author=. Fixed Point Theory in Metric Spaces: Recent Advances and Applications , pages=. 2018 , publisher=
2018
-
[3]
Sur les op
Banach, Stefan , journal=. Sur les op. 1922 , publisher=
1922
-
[4]
Liu, Jingbo and Cuff, Paul and Verdú, Sergio , journal=. E_. 2017 , volume=
2017
-
[5]
Resolving the Mixing Time of the
Altschuler, Jason and Talwar, Kunal , booktitle =. Resolving the Mixing Time of the. 2023 , editor =
2023
-
[6]
Stein and Rami Shakarchi , publisher =
Elias M. Stein and Rami Shakarchi , publisher =. Real Analysis: Measure Theory, Integration, and Hilbert Spaces , urldate =
-
[7]
Contraction of
Asoodeh, Shahab and Diaz, Mario and Calmon, Flavio P , journal=. Contraction of
-
[8]
International Conference on Machine Learning , pages=
Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising , author=. International Conference on Machine Learning , pages=. 2018 , organization=
2018
-
[9]
2017 , publisher=
Levin, David A and Peres, Yuval , volume=. 2017 , publisher=
2017
-
[10]
Coffey, William and Kalmykov, Yu P , volume=. The. 2012 , publisher=
2012
-
[11]
Projected stochastic gradient
Lamperski, Andrew , booktitle=. Projected stochastic gradient. 2021 , organization=
2021
-
[12]
Analysis and geometry of
Bakry, Dominique and Gentil, Ivan and Ledoux, Michel and others , volume=. Analysis and geometry of. 2014 , publisher=
2014
-
[13]
Analysis of
Chewi, Sinho and Erdogdu, Murat A and Li, Mufan and Shen, Ruoqi and Zhang, Shunshi , booktitle=. Analysis of. 2022 , organization=
2022
-
[14]
On the convergence of
Erdogdu, Murat A and Hosseinzadeh, Rasa , booktitle=. On the convergence of. 2021 , organization=
2021
-
[15]
Convergence of
Erdogdu, Murat A and Hosseinzadeh, Rasa and Zhang, Shunshi , booktitle=. Convergence of. 2022 , organization=
2022
-
[16]
and He, Ye and Balasubramanian, Krishna and Erdogdu, Murat A
Mousavi-Hosseini, Alireza and Farghly, Tyler K. and He, Ye and Balasubramanian, Krishna and Erdogdu, Murat A. , booktitle =. Towards a Complete Analysis of. 2023 , volume =
2023
-
[17]
Towards a theory of non-log-concave sampling: first-order stationarity guarantees for
Balasubramanian, Krishna and Chewi, Sinho and Erdogdu, Murat A and Salim, Adil and Zhang, Shunshi , booktitle=. Towards a theory of non-log-concave sampling: first-order stationarity guarantees for. 2022 , organization=
2022
-
[18]
Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=
Theoretical guarantees for approximate sampling from smooth and log-concave densities , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2017 , publisher=
2017
-
[19]
Convergence of
Cheng, Xiang and Bartlett, Peter , booktitle =. Convergence of. 2018 , editor =
2018
-
[20]
Sharp convergence rates for
Cheng, Xiang and Chatterji, Niladri S and Abbasi-Yadkori, Yasin and Bartlett, Peter L and Jordan, Michael I , journal=. Sharp convergence rates for
-
[21]
High-dimensional Bayesian inference via the Unadjusted
Durmus, Alain and Moulines, Eric , journal=. High-dimensional Bayesian inference via the Unadjusted
-
[22]
User-friendly guarantees for the
Dalalyan, Arnak S and Karagulyan, Avetik , journal=. User-friendly guarantees for the. 2019 , publisher=
2019
-
[23]
Analysis of
Durmus, Alain and Majewski, Szymon and Miasojedow, B. Analysis of. Journal of Machine Learning Research , volume=
-
[24]
Unadjusted
Nguyen, Dao and Dang, Xin and Chen, Yixin , journal=. Unadjusted. 2023 , publisher=
2023
-
[25]
Proceedings of the National Academy of Sciences , volume=
Sampling can be faster than optimization , author=. Proceedings of the National Academy of Sciences , volume=. 2019 , publisher=
2019
-
[26]
Sampling from a log-concave distribution with projected
Bubeck, S. Sampling from a log-concave distribution with projected. Discrete & Computational Geometry , volume=. 2018 , publisher=
2018
-
[27]
Bayesian learning via stochastic gradient
Welling, Max and Teh, Yee W , booktitle=. Bayesian learning via stochastic gradient. 2011 , organization=
2011
-
[28]
Proceedings of the 2016 ACM SIGSAC conference on computer and communications security , pages=
Deep learning with differential privacy , author=. Proceedings of the 2016 ACM SIGSAC conference on computer and communications security , pages=
2016
-
[29]
arXiv preprint arXiv:2305.09903 , year=
Privacy Loss of Noisy Stochastic Gradient Descent Might Converge Even for Non-Convex Losses , author=. arXiv preprint arXiv:2305.09903 , year=
-
[30]
Faster differentially private samplers via
Ganesh, Arun and Talwar, Kunal , journal=. Faster differentially private samplers via
-
[31]
Lehec, Joseph , journal=. The. 2023 , publisher=
2023
-
[32]
Global Convergence of
Xu, Pan and Chen, Jinghui and Zou, Difan and Gu, Quanquan , booktitle =. Global Convergence of
-
[33]
Non-convex learning via stochastic gradient
Raginsky, Maxim and Rakhlin, Alexander and Telgarsky, Matus , booktitle=. Non-convex learning via stochastic gradient. 2017 , organization=
2017
-
[34]
The Annals of Applied Probability , volume=
NONASYMPTOTIC BOUNDS FOR SAMPLING ALGORITHMS WITHOUT LOG-CONCAVITY , author=. The Annals of Applied Probability , volume=. 2020 , publisher=
2020
-
[35]
Fast conditional mixing of
Cheng, Xiang and Wang, Bohan and Zhang, Jingzhao and Zhu, Yusong , journal=. Fast conditional mixing of
-
[36]
Improved bounds for discretization of
Mou, Wenlong and Flammarion, Nicolas and Wainwright, Martin J and Bartlett, Peter L , journal=. Improved bounds for discretization of. 2022 , publisher=
2022
-
[37]
Constrained
Zheng, Yuping and Lamperski, Andrew , booktitle =. Constrained
-
[38]
1991 , publisher=
Probability with martingales , author=. 1991 , publisher=
1991
-
[39]
Concentration of the
Altschuler, Jason M and Talwar, Kunal , journal=. Concentration of the
-
[40]
Optimal scaling of discrete approximations to
Roberts, Gareth O and Rosenthal, Jeffrey S , journal=. Optimal scaling of discrete approximations to
-
[41]
Nonconvex sampling with the
Mangoubi, Oren and Vishnoi, Nisheeth K , booktitle =. Nonconvex sampling with the. 2019 , editor =
2019
-
[42]
Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering
Lee, Holden and Risteski, Andrej and Ge, Rong , booktitle =. Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering
-
[43]
Stochastic gradient and
Cheng, Xiang and Yin, Dong and Bartlett, Peter and Jordan, Michael , booktitle=. Stochastic gradient and. 2020 , organization=
2020
-
[44]
On stochastic gradient
Chau, Ngoc Huy and Moulines,. On stochastic gradient. SIAM Journal on Mathematics of Data Science , volume=. 2021 , publisher=
2021
-
[45]
Efficient constrained sampling via the mirror-
Ahn, Kwangjun and Chewi, Sinho , journal=. Efficient constrained sampling via the mirror-
-
[46]
Underdamped
Cheng, Xiang and Chatterji, Niladri S and Bartlett, Peter L and Jordan, Michael I , booktitle=. Underdamped. 2018 , organization=
2018
-
[47]
On sampling from a log-concave density using kinetic
Dalalyan, Arnak S and Riou-Durand, Lionel , journal=. On sampling from a log-concave density using kinetic
-
[48]
Bounding the error of discretized
Dalalyan, Arnak S and Karagulyan, Avetik and Riou-Durand, Lionel , journal=. Bounding the error of discretized
-
[49]
Journal of the ACM , volume=
Faster high-accuracy log-concave sampling via algorithmic warm starts , author=. Journal of the ACM , volume=. 2024 , publisher=
2024
-
[50]
Constrained Sampling with Primal-Dual
Chamon, Luiz FO and Karimi, Mohammad Reza and Korba, Anna , journal=. Constrained Sampling with Primal-Dual
-
[51]
Exponential Convergence of
Roberts, Gareth O and Tweedie, Richard L , journal=. Exponential Convergence of. 1996 , publisher=
1996
-
[52]
f -Divergence Inequalities , year=
Sason, Igal and Verdú, Sergio , journal=. f -Divergence Inequalities , year=
-
[53]
Fast Convergence of -Divergence Along the Unadjusted
Siddharth Mitra and Andre Wibisono , booktitle=. Fast Convergence of -Divergence Along the Unadjusted
-
[54]
On Independent Samples Along the
Liang, Jiaming and Mitra, Siddharth and Wibisono, Andre , journal=. On Independent Samples Along the
-
[55]
Book draft available at
Log-concave sampling , author=. Book draft available at
-
[56]
Optimizing the diffusion coefficient of overdamped
Leli. Optimizing the diffusion of overdamped. arXiv preprint arXiv:2404.12087 , year=
-
[57]
Shifted composition I:
Altschuler, Jason M and Chewi, Sinho , journal=. Shifted composition I:. 2024 , publisher=
2024
-
[58]
Shifted composition II: shift
Altschuler, Jason M and Chewi, Sinho , journal=. Shifted composition II: shift
-
[59]
Shifted Composition III: Local Error Framework for
Altschuler, Jason M and Chewi, Sinho , journal=. Shifted Composition III: Local Error Framework for
-
[60]
2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) , pages=
Privacy amplification by iteration , author=. 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) , pages=. 2018 , organization=
2018
-
[61]
Advances in Neural Information Processing Systems , volume=
Privacy of noisy stochastic gradient descent: More iterations without more privacy loss , author=. Advances in Neural Information Processing Systems , volume=
-
[62]
and Özgür, Ayfer , booktitle=
Asoodeh, Shahab and Chen, Wei-Ning and Calmon, Flavio P. and Özgür, Ayfer , booktitle=. Differentially Private Federated Learning: An Information-Theoretic Perspective , year=
-
[63]
, booktitle=
Asoodeh, Shahab and Diaz, Mario and Calmon, Flavio P. , booktitle=. Privacy Amplification of Iterative Algorithms via Contraction Coefficients , year=
-
[64]
Supplementary Material for Exponential Convergence of Projected
Daeijavad, Alireza and Asoodeh, Shahab , year =. Supplementary Material for Exponential Convergence of Projected
-
[65]
Makur, A. and Zheng, L. , title =. 2020 , issue_date =. doi:10.1134/S0032946020020015 , journal =
-
[66]
IEEE Transactions on Information Theory , volume=
Strong data processing inequalities and -Sobolev inequalities for discrete channels , author=. IEEE Transactions on Information Theory , volume=. 2016 , publisher=
2016
-
[67]
Zamanlooy and S
B. Zamanlooy and S. Asoodeh and M. Diaz and Flavio Calmon , journal=. 2024 , pages=
2024
-
[68]
1998 , publisher=
Comparisons of Stochastic Matrices, with Applications in Information Theory, Statistics, Economics, and Population Sciences , author=. 1998 , publisher=
1998
-
[69]
2014 , isbn =
Nesterov, Yurii , title =. 2014 , isbn =
2014
-
[70]
Projected
Pang, Chenxu and Wang, Xiaojie and Wu, Yue , journal=. Projected. 2025 , publisher=
2025
-
[71]
Stochastic Processes and their Applications , volume=
The tamed unadjusted Langevin algorithm , author=. Stochastic Processes and their Applications , volume=. 2019 , publisher=
2019
-
[72]
Algorithmic Learning Theory , pages=
Fast Convergence of -Divergence Along the Unadjusted Langevin Algorithm and Proximal Sampler , author=. Algorithmic Learning Theory , pages=. 2025 , organization=
2025
-
[73]
The Journal of Chemical Physics , author =
Metropolis, Nicholas and Rosenbluth, Arianna W. and Rosenbluth, Marshall N. and Teller, Augusta H. and Teller, Edward , title =. The Journal of Chemical Physics , volume =. 1953 , month =. doi:10.1063/1.1699114 , url =
-
[74]
W. K. Hastings , journal =
-
[75]
K. L. Mengersen and R. L. Tweedie , title =. The Annals of Statistics , number =. 1996 , doi =
1996
-
[76]
Biometrika , volume=
Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms , author=. Biometrika , volume=. 1996 , publisher=
1996
-
[77]
G. O. Roberts and A. Gelman and W. R. Gilks , title =. The Annals of Applied Probability , number =. 1997 , doi =
1997
-
[78]
Conference on Learning Theory , pages=
Optimal dimension dependence of the Metropolis-adjusted Langevin algorithm , author=. Conference on Learning Theory , pages=. 2021 , organization=
2021
-
[79]
Journal of Machine Learning Research , volume=
Log-concave sampling: Metropolis-Hastings algorithms are fast , author=. Journal of Machine Learning Research , volume=
-
[80]
Bernoulli , volume=
Exact convergence analysis of the independent Metropolis-Hastings algorithms , author=. Bernoulli , volume=. 2022 , publisher=
2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.