Convergence Rate of Frank-Wolfe for Non-Convex Objectives
Abstract
We give a simple proof that the Frank-Wolfe algorithm obtains a stationary point at a rate of $O(1/\sqrt{t})$ on non-convex objectives with a Lipschitz continuous gradient. Our analysis is affine invariant and is the first, to the best of our knowledge, to give a similar rate to what was already proven for projected gradient methods (though on slightly different measures of stationarity).
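The stationarity measure in this setting is the Frank-Wolfe gap $g_t = \max_{s \in \mathcal{M}} \langle x_t - s, \nabla f(x_t) \rangle$, which is nonnegative, vanishes exactly at stationary points, and is obtained for free from each call to the linear minimization oracle (LMO). The sketch below illustrates the idea on the unit simplex with an indefinite quadratic objective and a $\gamma_t = 1/\sqrt{t+1}$ step size, one simple choice compatible with the $O(1/\sqrt{t})$ rate; the objective, domain, step-size rule, and function names here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Minimal sketch of Frank-Wolfe on a smooth (possibly non-convex) objective
# over the unit simplex, tracking the Frank-Wolfe gap
#   g_t = max_{s in M} <x_t - s, grad f(x_t)>
# as the stationarity measure. Illustrative assumptions throughout.

def frank_wolfe(grad, x0, num_iters=1000):
    """Run Frank-Wolfe with gamma_t = 1/sqrt(t+1) and return the final
    iterate together with the smallest FW gap observed."""
    x = x0.copy()
    best_gap = np.inf
    for t in range(num_iters):
        g = grad(x)
        # LMO for the simplex: the minimizing vertex is the coordinate
        # with the smallest gradient entry.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Frank-Wolfe gap, a by-product of the LMO call.
        gap = g @ (x - s)
        best_gap = min(best_gap, gap)
        gamma = 1.0 / np.sqrt(t + 1)
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x, best_gap

# Example: non-convex quadratic f(x) = 0.5 x^T A x with an indefinite A
# (hypothetical test problem, not from the paper).
rng = np.random.default_rng(0)
B = rng.standard_normal((10, 10))
A = (B + B.T) / 2  # symmetric and generally indefinite
grad = lambda x: A @ x
x0 = np.full(10, 0.1)  # uniform point on the simplex
x, best_gap = frank_wolfe(grad, x0)
print(f"best FW gap after 1000 iterations: {best_gap:.2e}")
```

Note that no projection is ever needed: on the simplex the LMO reduces to a single argmin over gradient coordinates, and the gap used as the stopping criterion comes out of that same computation.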
Forward citations
Cited by 4 Pith papers
- Fused Gromov-Wasserstein Distance with Feature Selection. Fused Gromov-Wasserstein distances are extended with feature selection via Lasso/Ridge regularization or simplex-constrained weights, yielding theoretical bounds, metric properties, and an alternating minimization algorithm.
- Accelerating LMO-Based Optimization via Implicit Gradient Transport. LMO-IGT achieves O(ε^{-3.5}) iteration complexity for stochastic LMO optimization via implicit gradient transport with a single gradient per step and introduces the regularized support function as a unified stationarity...
- Scalable First-Order Interior Point Trust Region Algorithms for Linearly Constrained Optimization. An approximate IPTR framework for linearly constrained optimization uses low-rank projector updates to cut per-iteration cost while preserving feasibility and convergence guarantees, with experiments showing a 2.48x speedup.
- Model Merging: Foundations and Algorithms. New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.