Meta-storm: Generalized fully-adaptive variance reduced SGD for unbounded functions
2 Pith papers cite this work. Polarity classification is still indexing.
fields: math.OC (2)
verdicts: UNVERDICTED (2)
representative citing papers
A proximal stochastic gradient method with variance reduction and adaptive steps is shown to converge strongly at rate O(sqrt(1/k)) for convex composite problems when the smooth term is Lipschitz continuous.
citing papers explorer
- OptMuon: Closed-Loop Orthogonalized Momentum Methods for Stochastic Optimization with Zero-Noise Optimality
OptMuon combines orthogonalized momentum with trajectory-dependent AdaGrad-Norm adaptation to obtain expected-stationarity rates of order T^{-1/2} + sigma^{1/2}T^{-1/4} or T^{-1/2} + sigma^{1/3}T^{-1/3} that reduce to near-optimal deterministic first-order rates in the zero-noise regime.
- A Proximal Stochastic Gradient Method with Adaptive Step Size and Variance Reduction for Convex Composite Optimization
A proximal stochastic gradient method with variance reduction and adaptive steps is shown to converge strongly at rate O(sqrt(1/k)) for convex composite problems when the smooth term is Lipschitz continuous.