Develops constant-stepsize and auto-conditioned projected gradient methods plus stochastic variants that achieve new iteration complexity bounds for finding approximate stationary points in nonconvex smooth optimization.
arXiv preprint arXiv:2401.08024
3 Pith papers cite this work. Polarity classification is still indexing.
fields
math.OC 3
verdicts
UNVERDICTED 3
representative citing papers
OptMuon combines orthogonalized momentum with trajectory-dependent AdaGrad-Norm adaptation to obtain expected-stationarity rates of order T^{-1/2} + sigma^{1/2}T^{-1/4} or T^{-1/2} + sigma^{1/3}T^{-1/3} that reduce to near-optimal deterministic first-order rates in the zero-noise regime.
A proximal stochastic gradient method with variance reduction and adaptive steps is shown to converge strongly at rate O(sqrt(1/k)) for convex composite problems when the smooth term is Lipschitz continuous.
citing papers explorer
- Projected gradient methods for nonconvex and stochastic smooth optimization: new complexities and auto-conditioned stepsizes
  Develops constant-stepsize and auto-conditioned projected gradient methods plus stochastic variants that achieve new iteration complexity bounds for finding approximate stationary points in nonconvex smooth optimization.
- OptMuon: Closed-Loop Orthogonalized Momentum Methods for Stochastic Optimization with Zero-Noise Optimality
  OptMuon combines orthogonalized momentum with trajectory-dependent AdaGrad-Norm adaptation to obtain expected-stationarity rates of order T^{-1/2} + sigma^{1/2}T^{-1/4} or T^{-1/2} + sigma^{1/3}T^{-1/3} that reduce to near-optimal deterministic first-order rates in the zero-noise regime.
- A Proximal Stochastic Gradient Method with Adaptive Step Size and Variance Reduction for Convex Composite Optimization
  A proximal stochastic gradient method with variance reduction and adaptive steps is shown to converge strongly at rate O(sqrt(1/k)) for convex composite problems when the smooth term is Lipschitz continuous.