Clipped AdamW with exponentially weighted accumulation achieves superior global convergence rates for convex stochastic generalized Lipschitz optimization compared to SGD and AdaGrad.
arXiv preprint arXiv:2409.14989 (2024)
2 Pith papers cite this work. Polarity classification is still indexing.
fields
math.OC 2
verdicts
UNVERDICTED 2
representative citing papers
Proposes (L0, L1)-Frank-Wolfe and an adaptive variant, claiming convergence rates superior to classical Frank-Wolfe for (L0, L1)-smooth objectives.
citing papers explorer
- Stochastic Non-Smooth Convex Optimization with Unbounded Gradients
  Clipped AdamW with exponentially weighted accumulation achieves superior global convergence rates for convex stochastic generalized Lipschitz optimization compared to SGD and AdaGrad.
- Frank-Wolfe Algorithms for (L0, L1)-smooth functions
  Proposes (L0, L1)-Frank-Wolfe and an adaptive variant, claiming convergence rates superior to classical Frank-Wolfe for (L0, L1)-smooth objectives.