pith. machine review for the scientific record.

arxiv: 1308.6370 · v1 · submitted 2013-08-29 · 🧮 math.OC

Recognition: unknown

Fast Convergence of Stochastic Gradient Descent under a Strong Growth Condition

Authors on Pith: no claims yet
classification: 🧮 math.OC
keywords: gradient, convergence, function, under, average, linear, norm, rate
Original abstract

We consider optimizing a smooth convex function $f$ that is the average of a set of differentiable functions $f_i$, under the assumption, considered by Solodov [1998] and Tseng [1998], that the norm of each gradient $f_i'$ is bounded by a linear function of the norm of the average gradient $f'$. We show that under these assumptions the basic stochastic gradient method with a sufficiently small constant step-size has an $O(1/k)$ convergence rate, and has a linear convergence rate if $f$ is strongly convex.
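The setting in the abstract is concrete enough to sketch numerically. Below is a minimal illustration, not the paper's code: constant step-size SGD on a least-squares objective $f(x) = \frac{1}{n}\sum_i \frac{1}{2}(a_i^T x - b_i)^2$ with a consistent system ($b = A x^*$), so every $f_i$ is minimized at the same point. There $\|f_i'(x)\| = |a_i^T(x - x^*)|\,\|a_i\|$ is bounded by a multiple of $\|f'(x)\|$, so the strong growth condition holds, and a random $A$ with $n \gg d$ makes $f$ strongly convex, so the linear rate from the abstract should be visible. The names (A, b, x_star) and the step-size choice alpha = 1/L are illustrative assumptions, not values from the paper.

import numpy as np

# Sketch: constant step-size SGD on a consistent least-squares problem,
# a setting where the strong growth condition ||f_i'(x)|| <= B ||f'(x)||
# holds and f is strongly convex. Illustrative only, not the paper's code.
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star                         # consistent: every f_i is minimized at x_star

L = np.max(np.sum(A * A, axis=1))      # max per-example smoothness constant ||a_i||^2
alpha = 1.0 / L                        # a "sufficiently small" constant step-size

x = np.zeros(d)
for k in range(5000):
    i = rng.integers(n)                # sample one component function f_i
    grad_i = (A[i] @ x - b[i]) * A[i]  # gradient of f_i(x) = 0.5*(a_i^T x - b_i)^2
    x -= alpha * grad_i

print(np.linalg.norm(x - x_star))      # tiny: constant-step SGD converges linearly here

Because the system is consistent, every stochastic gradient vanishes at x_star, so there is no variance floor and the constant step-size never has to decay; with an inconsistent b the same loop would plateau at a distance governed by alpha, which is exactly the gap the strong growth condition closes.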

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Scalable Distributed Stochastic Optimization via Bidirectional Compression: Beyond Pessimistic Limits

    math.OC 2026-05 unverdicted novelty 7.0

    Inkheart SGD and M4 use bidirectional compression to achieve time complexities in distributed SGD that improve with worker count n and surpass prior lower bounds under a necessary structural assumption.

  2. Convergence guarantees for stochastic algorithms solving non-unique problems in metric spaces

    math.OC 2026-05 unverdicted novelty 7.0

    A quantitative theorem supplies uniform rates of convergence for stochastic quasi-Fejér monotone sequences in metric spaces by extending a deterministic regularity notion to the stochastic setting and applying it to p...

  3. Stochastic Trust-Region Methods for Over-parameterized Models

    math.OC 2026-04 unverdicted novelty 7.0

    Stochastic trust-region methods achieve O(ε^{-2} log(1/ε)) complexity for unconstrained problems and O(ε^{-4} log(1/ε)) for equality-constrained problems under the strong growth condition, with experiments showing sta...

  4. Shuffling the Data, Stretching the Step-size: Sharper Bias in constant step-size SGD

    math.OC 2026-04 unverdicted novelty 7.0

    Combining random reshuffling and Richardson-Romberg extrapolation yields cubic bias refinement and better MSE for constant-step SGD on structured non-monotone variational inequalities.

  5. VISTA: Decentralized Machine Learning in Adversary Dominated Environments

    cs.LG 2026-05 unverdicted novelty 6.0

    VISTA adaptively tunes consistency thresholds in decentralized SGD so that the system converges asymptotically like standard SGD even when adversaries dominate the worker pool.

  6. Rennala MVR: Improved Time Complexity for Parallel Stochastic Optimization via Momentum-Based Variance Reduction

    math.OC 2026-05 unverdicted novelty 5.0

    Rennala MVR improves time complexity over Rennala SGD for smooth nonconvex stochastic optimization in heterogeneous parallel systems under a mean-squared smoothness assumption.