SGD's stationary distribution is Boltzmann-Gibbs with temperature equal to step-size, concentrating exponentially on minimum-energy critical points.
2 Pith papers cite this work.
citing papers explorer
- What is the long-run distribution of stochastic gradient descent? A large deviations analysis
  SGD's stationary distribution is Boltzmann-Gibbs with temperature equal to step-size, concentrating exponentially on minimum-energy critical points.
- When Langevin Monte Carlo Meets Randomization: New Sampling Algorithms with Non-asymptotic Error Bounds beyond Log-Concavity and Gradient Lipschitzness
  New RSLMC sampling algorithms achieve uniform-in-time W2 error bounds of order O(sqrt(d) h) under gradient Lipschitz and log-Sobolev assumptions, with modified versions for superlinear gradient growth and supporting numerical examples.