Accelerated gradient descent via long steps. arXiv preprint arXiv:2309.09961, 2023
3 Pith papers cite this work. Polarity classification is still indexing.
fields
math.OC 3
verdicts
UNVERDICTED 3
representative citing papers
citing papers explorer
- Lower Bounds for Anytime Acceleration of Gradient Descent
  Establishes that no positive stepsize schedule achieves better than o(n^{-1.334}) anytime convergence for function values or o(n^{-1}) for squared gradient norms in smooth convex optimization.
- A Proximal Stochastic Gradient Method with Adaptive Step Size and Variance Reduction for Convex Composite Optimization
  A proximal stochastic gradient method with variance reduction and adaptive steps is shown to converge strongly at rate O(sqrt(1/k)) for convex composite problems when the smooth term is Lipschitz continuous.
- Stepsize Hedging: an Alternative Mechanism for Accelerating Gradient Descent
  This expository article introduces stepsize hedging as a way to accelerate gradient descent without additional terms like momentum.