Mixing times of Langevin dynamics for spiked matrix models
Pith reviewed 2026-05-22 11:05 UTC · model grok-4.3
The pith
Langevin dynamics for spiked Wigner matrices mix in logarithmic time from any initialization symmetric around the top eigenvector, even in the low-temperature regime where worst-case mixing becomes exponential in N.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the spherical spike model, Langevin dynamics initialized from a distribution symmetric with respect to the leading eigenvector mix in O(log N) time for all inverse temperatures β. This includes the low-temperature regime β > 1/θ, where the worst-case mixing time over arbitrary initializations is exponential in N, with exponential rate given by the free-energy difference between the spiked and null models.
What carries the argument
The spherical spike model on Wigner matrices together with a free-energy comparison that controls metastability between the spiked and null equilibria.
If this is right
- For any initialization symmetric with respect to the top eigenvector, the dynamics reach equilibrium in O(log N) time even deep in the low-temperature phase.
- The exponential bottleneck for generic initializations arises solely from the free-energy barrier separating the null and spiked phases.
- The critical inverse temperature β_c(θ) = 1/θ marks the point where the two free energies cross.
- Fast mixing from symmetric starts allows polynomial-time sampling from the posterior even when the posterior is multimodal.
Where Pith is reading between the lines
- The same symmetry argument may extend to other spherical priors or to dynamics on the hypercube if the initialization is balanced with respect to the planted direction.
- The free-energy rate formula suggests that similar metastability pictures hold for non-spherical spikes once the corresponding large-deviation rate functions are computed.
Load-bearing premise
The signal-to-noise ratio θ is large but of order one: it is taken large, yet stays bounded as N grows.
What would settle it
Numerical simulation of the Langevin process for moderate N showing that the observed escape time from the null measure to the spiked measure grows exponentially in N, at the rate given by the predicted free-energy difference, up to sub-exponential factors.
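A minimal sketch of such a simulation is given here, under assumptions that are not taken from the paper: the state lives on the sphere of radius sqrt(N), the Gibbs measure is taken proportional to exp((β/2) x^T A x), and time is discretized by an Euler-Maruyama step followed by renormalization onto the sphere. The function names spiked_wigner and langevin_overlap are hypothetical. The sketch only records the overlap with the top eigenvector over time; estimating escape times would require many runs at several values of N.

```python
import numpy as np


def spiked_wigner(n, theta, rng):
    """Sample A = theta * u u^T + W and return (A, top eigenvector of A)."""
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2 * n)        # GOE-like noise, bulk spectrum near [-2, 2]
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)                # planted direction, uniform on the unit sphere
    a = theta * np.outer(u, u) + w
    top = np.linalg.eigh(a)[1][:, -1]     # eigh returns eigenvalues in ascending order
    return a, top


def langevin_overlap(a, v, beta, x0, step, n_steps, rng):
    """Discretized spherical Langevin dynamics; returns the overlap <x, v> / sqrt(N) per step.

    Assumed normalization (not from the paper): |x|^2 = N and
    stationary density proportional to exp((beta / 2) * x^T A x).
    """
    n = len(v)
    radius = np.sqrt(n)
    x = x0 * (radius / np.linalg.norm(x0))
    overlaps = np.empty(n_steps)
    for k in range(n_steps):
        noise = np.sqrt(2.0 * step) * rng.standard_normal(n)
        x = x + step * beta * (a @ x) + noise    # Euler-Maruyama step in ambient space
        x *= radius / np.linalg.norm(x)          # retract back onto the sphere
        overlaps[k] = (x @ v) / radius
    return overlaps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta, alpha, n = 3.0, 2.0, 200              # alpha > 1: low-temperature regime
    a, v = spiked_wigner(n, theta, rng)
    x0 = rng.standard_normal(n)                  # uniform direction: a symmetric start
    m = langevin_overlap(a, v, alpha / theta, x0, 0.01, 2000, rng)
    print(abs(m[-1]))
```

Replacing x0 with a point chosen adversarially (rather than uniformly) is one way to probe the worst-case initialization regime in the same harness; which starting points are adversarial is left open here.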
Original abstract
We investigate the Langevin dynamics for Wigner matrices with a spherical spike, in the regime where the signal-to-noise ratio $\theta$ is large, but order one. For large, order-$1$, signal-to-noise, the (worst-case) mixing time undergoes a sharp transition around the critical inverse temperature $\beta_c(\theta) = \frac{1}{\theta}$. Namely, if $\beta = \alpha/\theta$, and $\alpha<1$ then at large $\theta$ the mixing time is $O(\log N)$, and if $\alpha>1$ it is exponential in $N$. We show that initialized from the uniform-at-random spherical prior, however, the mixing time in the low-temperature $\alpha>1$ regime circumvents the exponential bottleneck and the mixing time is $O(\log N)$. In fact, this fast mixing holds for any initialization that is symmetric with respect to the top eigenvector of the spiked matrix. Using this, we are able to show a low-temperature metastability picture, pinning down the exact exponential rate of the (worst-case initialization) mixing time for low temperatures, showing it is given by the difference of the free energies of the spiked and null models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript studies mixing times of spherical Langevin dynamics for Wigner matrices with a spherical rank-one spike in the regime of large but O(1) signal-to-noise ratio θ. It identifies a sharp transition at the critical inverse temperature β_c(θ) = 1/θ: for β = α/θ with α < 1 the worst-case mixing time is O(log N), while for α > 1 it is exponential in N. The paper further shows that initializations symmetric with respect to the top eigenvector (including the uniform spherical prior) achieve O(log N) mixing even for α > 1, and uses this to establish a low-temperature metastability picture in which the exact exponential rate of worst-case mixing equals the difference of free energies between the spiked and null models.
Significance. If the claims hold, the work supplies a precise, initialization-dependent metastability analysis for Langevin dynamics on a canonical high-dimensional inference model. The explicit identification of the free-energy difference as the mixing barrier, together with the fast-mixing result for symmetric initializations, would be a substantive contribution to the literature on sampling and phase transitions in spiked matrix models.
major comments (1)
- [Section deriving fast mixing from symmetric initializations] The O(log N) mixing claim for initializations symmetric with respect to the top eigenvector u is load-bearing for the metastability picture when α > 1. The dynamics is driven by the gradient of x^T A x with A = θ u u^T + W. Reflection R_u through u leaves the spike term invariant but sends W to R_u W R_u, which differs from W with high probability. Since θ remains O(1), the relative perturbation is O(1/θ) and does not vanish with N. The manuscript must clarify, in the section deriving the fast-mixing result for symmetric initializations, whether and how the analysis controls the symmetry-breaking effect of this perturbation on logarithmic timescales.
minor comments (1)
- [Introduction] The abstract states that the exponential rate equals the free-energy difference but does not indicate whether this difference is computed explicitly or left in variational form; a brief statement in the introduction would improve readability.
Simulated Author's Rebuttal
We thank the referee for their thorough review and for identifying this key point regarding the control of symmetry-breaking perturbations in the fast-mixing analysis. We address the comment below and will incorporate a clarification into the revised manuscript.
Point-by-point responses
- Referee: [Section deriving fast mixing from symmetric initializations] The O(log N) mixing claim for initializations symmetric with respect to the top eigenvector u is load-bearing for the metastability picture when α > 1. The dynamics is driven by the gradient of x^T A x with A = θ u u^T + W. Reflection R_u through u leaves the spike term invariant but sends W to R_u W R_u, which differs from W with high probability. Since θ remains O(1), the relative perturbation is O(1/θ) and does not vanish with N. The manuscript must clarify, in the section deriving the fast-mixing result for symmetric initializations, whether and how the analysis controls the symmetry-breaking effect of this perturbation on logarithmic timescales.
Authors: We appreciate the referee's careful identification of the symmetry-breaking effect arising from the non-commutativity of W with the reflection R_u. In the manuscript, the O(log N) mixing result for symmetric initializations is established by analyzing the evolution of the law under the full dynamics while tracking the discrepancy from exact invariance under R_u. Although the instantaneous difference in the drift vector fields is O(1) (corresponding to a relative perturbation of order 1/θ), the proof controls the accumulated effect on logarithmic timescales through a combination of (i) strong contraction toward the equator in the directions orthogonal to u, with rate independent of the O(1) perturbation, and (ii) explicit error bounds that show the total variation distance to the symmetrized process remains o(1) uniformly up to time C log N. These estimates appear in the coupling argument and the Gronwall-type inequalities used to close the mixing-time bound. We will revise the relevant section to include an explicit paragraph summarizing this control of the perturbation, together with the key estimates that ensure the symmetry-breaking contribution does not affect the O(log N) conclusion.
Revision: yes
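The commutation question raised in this exchange can be checked numerically. The sketch below is a sanity check only and says nothing about how the paper's proof proceeds. It assumes a GOE-like W with entries of variance 1/N and takes the reflection to be R_v = I - 2 v v^T, which flips the component along v. The helper names reflection and symmetry_defects are hypothetical. It measures three operator-norm defects: the spike term under reflection through the planted direction u, the noise W under that same reflection, and the full matrix A under reflection through its own top eigenvector.

```python
import numpy as np


def reflection(v):
    """Householder reflection I - 2 v v^T, flipping the component along v."""
    v = v / np.linalg.norm(v)
    return np.eye(len(v)) - 2.0 * np.outer(v, v)


def symmetry_defects(n=300, theta=3.0, seed=0):
    """Operator-norm size of R M R - M for three choices of (R, M)."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2 * n)            # assumed noise normalization
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)                    # planted direction
    spike = theta * np.outer(u, u)
    a = spike + w
    top = np.linalg.eigh(a)[1][:, -1]         # top eigenvector of the spiked matrix
    r_u, r_top = reflection(u), reflection(top)
    return {
        "spike_under_R_u": np.linalg.norm(r_u @ spike @ r_u - spike, 2),
        "W_under_R_u": np.linalg.norm(r_u @ w @ r_u - w, 2),
        "A_under_R_top": np.linalg.norm(r_top @ a @ r_top - a, 2),
    }


if __name__ == "__main__":
    for name, value in symmetry_defects().items():
        print(f"{name}: {value:.3e}")
```

The first and third defects vanish up to rounding, since any symmetric matrix commutes with the reflection through one of its own eigenvectors. The second is of order one, as the referee states. The distinction between the planted direction u and the top eigenvector of A therefore matters when reading the objection.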
Circularity Check
No circularity: mixing-time claims rest on direct analysis of the spherical Langevin generator and free-energy comparison
full rationale
The derivation proceeds by analyzing the spherical Langevin dynamics driven by the spiked Wigner potential, establishing a sharp transition at β_c(θ)=1/θ via comparison of the associated free energies, and then proving O(log N) mixing from any initialization symmetric with respect to the top eigenvector by direct control of the generator and coupling arguments. These steps are self-contained within the paper's probabilistic estimates and do not reduce to a fitted parameter renamed as a prediction, a self-definitional loop, or a load-bearing self-citation whose validity is assumed rather than independently verified. The symmetry-based fast-mixing result is obtained from the explicit form of the dynamics rather than by construction from the input data or prior fitted quantities.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: standard properties of Wigner matrices with spherical spikes hold in the large but order-one SNR regime.
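Two such standard properties can be illustrated numerically. This is a sketch under assumptions, not a statement of what the paper uses: W is GOE-like with entries of variance 1/N, so its bulk spectrum fills roughly [-2, 2], and θ > 1. In that setting the classical limits are a top eigenvalue near θ + 1/θ and a squared overlap between the top eigenvector and the planted direction near 1 - 1/θ^2. The function name top_eigenpair_check is hypothetical.

```python
import numpy as np


def top_eigenpair_check(n=500, theta=3.0, seed=0):
    """Compare the top eigenpair of a spiked GOE-like matrix with its classical limits."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2 * n)            # assumed normalization: bulk edge near 2
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)                    # planted direction on the unit sphere
    vals, vecs = np.linalg.eigh(theta * np.outer(u, u) + w)
    return {
        "top_eigenvalue": vals[-1],
        "predicted_eigenvalue": theta + 1.0 / theta,
        "second_eigenvalue": vals[-2],        # should sit near the bulk edge
        "squared_overlap": float(vecs[:, -1] @ u) ** 2,
        "predicted_squared_overlap": 1.0 - 1.0 / theta**2,
    }


if __name__ == "__main__":
    for name, value in top_eigenpair_check().items():
        print(f"{name}: {value:.4f}")
```

At finite N the measured values fluctuate around the limits, so agreement should be expected only up to corrections that shrink as N grows.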