Rapid convergence of tempering chains to multimodal Gibbs measures

Seungjae Son

Pith reviewed 2026-05-10 19:40 UTC · model grok-4.3

classification: math.PR, math.ST, stat.TH

keywords: spectral gap, tempering chains, Gibbs measures, multimodal distributions, Metropolis random walks, Lyapunov functions, Markov chain Monte Carlo, convergence rates

The pith

Parallel and simulated tempering chains achieve polynomial lower bounds on spectral gaps for multimodal Gibbs measures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that tempering chains built from Metropolis random walks at harmonically spaced temperatures have spectral gaps bounded below by a polynomial in the low target temperature, so the gap shrinks at most polynomially as that temperature decreases. The abstract gives orders 11 and 12 for parallel and simulated tempering; read in the order listed, that is 11 for parallel tempering and 12 for simulated tempering, although the abstract does not state the pairing explicitly. These results hold for a broad class of potentials, beyond mixture models, and do not require any explicit description of the energy landscape or barrier structure. A reader would care because a polynomial lower bound on the spectral gap implies that the chains mix in time polynomial in the inverse temperature, even when the target distribution has well-separated modes.

Core claim

We study the spectral gaps of parallel and simulated tempering chains targeting multimodal Gibbs measures. In particular, we consider chains constructed from Metropolis random walks that preserve the Gibbs distributions at a sequence of harmonically spaced temperatures. We prove that their spectral gaps admit polynomial lower bounds of order 11 and 12 in terms of the low target temperature. The analysis applies to a broad class of potentials, beyond mixture models, without requiring explicit structural information on the energy landscape. The main idea is to decompose the state space and construct a Lyapunov function based on a suitably perturbed potential, which allows us to establish lower bounds on the local spectral gaps.
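To make the construction concrete, here is a minimal sketch of a parallel tempering chain on a one-dimensional double-well potential. It is an illustration under stated assumptions, not the paper's construction: the potential `potential`, the Gaussian proposal, the step size, the adjacent-pair swap rule, and the reading of "harmonically spaced" as inverse temperatures in arithmetic progression are all choices made here.

```python
import math
import random


def potential(x):
    """Toy double-well potential U(x) = (x^2 - 1)^2 with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def harmonic_ladder(t_high, t_low, n_levels):
    """Temperatures whose inverses are evenly spaced (assumed meaning of 'harmonic')."""
    b_lo, b_hi = 1.0 / t_high, 1.0 / t_low
    step = (b_hi - b_lo) / (n_levels - 1)
    return [1.0 / (b_lo + k * step) for k in range(n_levels)]


def parallel_tempering(temps, n_steps, step_size=0.5, seed=0):
    """Run one replica per temperature; return the trajectory of the coldest replica."""
    rng = random.Random(seed)
    xs = [1.0] * len(temps)  # every replica starts in the right-hand well
    cold_samples = []
    for _ in range(n_steps):
        # Local move: one Metropolis random-walk step per replica.
        for k, t in enumerate(temps):
            y = xs[k] + rng.gauss(0.0, step_size)
            log_acc = -(potential(y) - potential(xs[k])) / t
            if math.log(1.0 - rng.random()) < log_acc:
                xs[k] = y
        # Swap move: propose exchanging the states of a random adjacent pair.
        k = rng.randrange(len(temps) - 1)
        log_acc = (1.0 / temps[k] - 1.0 / temps[k + 1]) * (
            potential(xs[k]) - potential(xs[k + 1])
        )
        if math.log(1.0 - rng.random()) < log_acc:
            xs[k], xs[k + 1] = xs[k + 1], xs[k]
        cold_samples.append(xs[-1])  # temps are ordered hot to cold
    return cold_samples


temps = harmonic_ladder(t_high=1.0, t_low=0.1, n_levels=4)
samples = parallel_tempering(temps, n_steps=2000)
```

The swap acceptance ratio leaves the product of the per-level Gibbs measures invariant, which is what lets the cold replica inherit barrier crossings made at high temperature.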

What carries the argument

State space decomposition combined with a Lyapunov function constructed from a suitably perturbed potential, used to bound the local spectral gaps of the tempering chains.
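For orientation, a Lyapunov (drift) condition for a Markov kernel $P$ typically takes the generic form below. The symbols $V$, $\lambda$, $b$ and $K$ are notation introduced here for illustration; the paper's precise condition and its perturbed potential are not reproduced on this page.

```latex
% Generic discrete-time drift condition (illustrative notation, not the paper's statement)
\[
  PV(x) \;=\; \int V(y)\, P(x, \mathrm{d}y)
  \;\le\; (1-\lambda)\, V(x) + b\, \mathbf{1}_K(x),
  \qquad V \ge 1,\quad \lambda \in (0,1),\quad b < \infty .
\]
```

Read this way, $V$ contracts in expectation outside the set $K$, so the chain is pushed back toward $K$. Combined with a Poincaré-type inequality restricted to $K$, conditions of this kind are a standard route to a spectral gap bound on a whole region of the state space.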

Load-bearing premise

The state space can be decomposed and a suitably perturbed potential can be constructed to yield lower bounds on the local spectral gaps for the broad class of potentials considered.

What would settle it

A concrete potential in the considered class for which the spectral gap of either tempering chain decays faster than any polynomial in the inverse low temperature would disprove the claimed bounds.

Original abstract

We study the spectral gaps of parallel and simulated tempering chains targeting multimodal Gibbs measures. In particular, we consider chains constructed from Metropolis random walks that preserve the Gibbs distributions at a sequence of harmonically spaced temperatures. We prove that their spectral gaps admit polynomial lower bounds of order $11$ and $12$ in terms of the low target temperature. The analysis applies to a broad class of potentials, beyond mixture models, without requiring explicit structural information on the energy landscape. The main idea is to decompose the state space and construct a Lyapunov function based on a suitably perturbed potential, which allows us to establish lower bounds on the local spectral gaps.
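Simulated tempering, the second chain in the abstract, moves a single particle together with a temperature index. The sketch below is a toy version under assumptions that are not taken from the paper: a one-dimensional double-well `potential`, a harmonic ladder passed in as `temps`, nearest-neighbour level proposals, and level weights obtained from a numerical estimate of the partition functions (`log_partition`), which is only feasible because the example is one-dimensional.

```python
import math
import random


def potential(x):
    """Toy double-well potential U(x) = (x^2 - 1)^2 (an assumption of this sketch)."""
    return (x * x - 1.0) ** 2


def log_partition(temp, lo=-4.0, hi=4.0, n=4000):
    """Midpoint Riemann-sum estimate of log Z(temp), Z = integral of exp(-U/temp)."""
    h = (hi - lo) / n
    total = sum(math.exp(-potential(lo + (i + 0.5) * h) / temp) for i in range(n))
    return math.log(total * h)


def simulated_tempering(temps, n_steps, step_size=0.5, seed=0):
    """Return a list of (position, level index) pairs for the joint chain."""
    rng = random.Random(seed)
    log_z = [log_partition(t) for t in temps]
    x, k = 1.0, 0  # start in the right-hand well at the first temperature
    trace = []
    for _ in range(n_steps):
        # Position move: Metropolis random walk at the current temperature.
        y = x + rng.gauss(0.0, step_size)
        if math.log(1.0 - rng.random()) < -(potential(y) - potential(x)) / temps[k]:
            x = y
        # Level move: propose a neighbouring level; proposals off the ladder are rejected.
        j = k + rng.choice((-1, 1))
        if 0 <= j < len(temps):
            log_acc = (-potential(x) / temps[j] - log_z[j]) - (
                -potential(x) / temps[k] - log_z[k]
            )
            if math.log(1.0 - rng.random()) < log_acc:
                k = j
        trace.append((x, k))
    return trace


# Inverse temperatures 1, 4, 7, 10 (evenly spaced), ordered hot to cold.
temps = [1.0, 1.0 / 4.0, 1.0 / 7.0, 1.0 / 10.0]
trace = simulated_tempering(temps, n_steps=2000)
cold_positions = [x for x, k in trace if k == len(temps) - 1]
```

Dividing by the estimated partition functions makes each level roughly equally likely, so the particle keeps returning to the cold level; samples recorded while the index sits at the cold level approximate the low-temperature Gibbs measure.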

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The manuscript proves that the spectral gaps of parallel and simulated tempering chains, built from Metropolis random walks targeting multimodal Gibbs measures at harmonically spaced temperatures, admit polynomial lower bounds of orders 11 and 12 in the low target temperature. The result is claimed to hold for a broad class of potentials without requiring explicit structural information on the energy landscape. The central technique is a state-space decomposition combined with a Lyapunov function constructed from a suitably perturbed potential, which yields lower bounds on the local spectral gaps.

Significance. If the central claims hold, the work supplies rigorous polynomial-in-temperature mixing guarantees for tempering algorithms on multimodal targets. Such bounds are valuable because they are independent of barrier heights and apply beyond the mixture-model setting that dominates much of the existing literature. The use of standard Markov-chain tools (drift conditions via Lyapunov functions) is a methodological strength, and the generality of the potential class strengthens the result's applicability to statistical physics and Bayesian sampling.
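To see what polynomial scaling buys, one can compare it with the Arrhenius-type rate exp(-ΔU/η), the usual heuristic for how quickly a single chain at low temperature η crosses an energy barrier of height ΔU. The sketch below is back-of-the-envelope arithmetic only: all constants are dropped, the barrier height of 1 is arbitrary, and the function names are hypothetical.

```python
import math


def arrhenius_rate(eta, barrier=1.0):
    """Heuristic barrier-crossing rate exp(-barrier / eta), constants ignored."""
    return math.exp(-barrier / eta)


def polynomial_rate(eta, order=11):
    """Polynomial rate eta**order, constants ignored."""
    return eta ** order


for eta in (0.2, 0.1, 0.05, 0.02, 0.01):
    print(
        f"eta={eta:<5} exp(-1/eta)={arrhenius_rate(eta):.3e}  "
        f"eta^11={polynomial_rate(eta):.3e}"
    )
```

With these toy constants the polynomial rate is the smaller of the two at moderate temperatures and overtakes the exponential one somewhere between η = 0.05 and η = 0.02. A polynomial bound of this order is therefore a statement about the low-temperature regime, where exponential rates collapse.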

minor comments (1)

[Abstract / Introduction] The abstract states that the bounds are of order 11 and 12 but does not immediately indicate which chain (parallel vs. simulated tempering) receives which exponent; a single clarifying sentence in the introduction would improve readability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript and for recommending acceptance. We are pleased that the referee highlights the value of polynomial spectral gap bounds that hold without explicit energy landscape structure.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper establishes polynomial lower bounds on spectral gaps via an analytic proof that decomposes the state space and constructs a Lyapunov function from a perturbed potential. This relies on standard Markov chain drift conditions and does not reduce to fitted parameters, self-referential definitions, or load-bearing self-citations. The derivation is self-contained against external mathematical benchmarks and does not invoke any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The result rests on standard Markov chain spectral theory plus a domain-specific construction of a state-space decomposition and perturbed potential; no free parameters or new entities are introduced.

axioms (2)

standard math: Standard spectral gap theory for reversible Markov chains. Used to relate the spectral gap to the convergence rate.
domain assumption: Existence of a suitable state-space decomposition and perturbed potential yielding local gap bounds. Central technical step stated in the abstract.
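The first axiom, the link between spectral gap and convergence rate, can be written out as follows. The notation is introduced here for illustration: $\gamma$ denotes the absolute spectral gap of a reversible kernel $P$ with stationary law $\pi$, and $C$, $p$ and $\eta$ are placeholders rather than constants taken from the paper.

```latex
% Illustrative notation; C, p and \eta are placeholders, not values from the paper.
\[
  \| P^n f - \pi(f) \|_{L^2(\pi)}
  \;\le\; (1-\gamma)^n \, \| f - \pi(f) \|_{L^2(\pi)},
\]
\[
  \gamma \ge C\,\eta^{p}
  \quad\Longrightarrow\quad
  n \;\ge\; \frac{\log(1/\varepsilon)}{C\,\eta^{p}}
  \ \text{ steps suffice for an } L^2 \text{ contraction factor } \varepsilon .
\]
```

The second line follows from $(1-\gamma)^n \le e^{-\gamma n}$, which is why a polynomial lower bound on the gap translates into a number of steps that is polynomial in $1/\eta$.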
