arxiv: 2606.21354 · v1 · pith:EQALOI3Dnew · submitted 2026-06-19 · 🧮 math.OC

Restart and Adaptive Acceleration in Stochastic Gradient Methods

Pith reviewed 2026-06-26 13:58 UTC · model grok-4.3

classification 🧮 math.OC
keywords restart schemes · stochastic gradient descent · Kurdyka-Łojasiewicz inequality · weakly convex optimization · acceleration · adaptive methods · non-smooth stochastic optimization

The pith

Restart schemes let stochastic gradient methods on weakly convex problems achieve faster convergence by using the Kurdyka-Łojasiewicz exponent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies restart techniques applied to stochastic optimization problems whose objectives are non-smooth and weakly convex yet obey a Kurdyka-Łojasiewicz inequality. It shows that periodic restarts make it possible to exploit the inequality and obtain convergence rates whose improvement depends directly on the value of the KL exponent. Optimal choices of restart intervals produce step sizes that behave like Polyak steps for SGD. The resulting schemes stay effective even when the constants appearing in the inequality are known only approximately, rendering the method nearly adaptive. Numerical tests are provided on toy examples with known exponents and on the training of large language models.

Core claim

In the non-smooth weakly convex stochastic setting, restart schemes allow the Kurdyka-Łojasiewicz inequality to be leveraged so that convergence rates improve explicitly with the KL exponent; optimal restart schedules recover learning rates akin to Polyak steps, and the schemes remain robust to substantial misspecification of the regularity constants.
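
To make the restart mechanism concrete, here is a minimal sketch of a staged stochastic subgradient loop on the toy objective f(x) = ‖x − a‖^q. It is not the paper's algorithm: the doubling inner-loop length, the halved step size, the averaging rule, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def noisy_subgradient(x, a, q, rng, noise):
    """Stochastic subgradient of f(x) = ||x - a||^q for q > 1 (toy oracle)."""
    d = x - a
    r = np.linalg.norm(d)
    g = np.zeros_like(x) if r == 0.0 else q * r ** (q - 2.0) * d
    return g + noise * rng.standard_normal(x.shape)

def restarted_sgd(x0, a, q, stages=5, t0=50, alpha0=0.1, noise=0.1, seed=0):
    """Run SGD in stages; each stage restarts from the previous stage's average."""
    rng = np.random.default_rng(seed)
    x_start = np.asarray(x0, dtype=float)
    for k in range(stages):
        steps = t0 * 2 ** k      # assumed schedule: inner loop doubles at each restart
        alpha = alpha0 / 2 ** k  # assumed schedule: step size halves at each restart
        x = x_start.copy()
        running_sum = np.zeros_like(x)
        for _ in range(steps):
            x = x - alpha * noisy_subgradient(x, a, q, rng, noise)
            running_sum += x
        x_start = running_sum / steps  # restart point: average iterate of the stage
    return x_start

def gap(x, a, q):
    """Primal gap f(x) - f*, with f* = 0 for this toy objective."""
    return np.linalg.norm(x - a) ** q
```

Calling `restarted_sgd(np.full(5, 5.0), np.zeros(5), q=2.0)` and comparing `gap` before and after gives a quick sanity check that the staged loop makes progress; it says nothing about the rates claimed in the paper.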

What carries the argument

Restart schemes that periodically reset the iterate to exploit the Kurdyka-Łojasiewicz inequality for accelerated rates in stochastic gradient methods.

If this is right

  • Convergence rates improve explicitly with the value of the KL exponent.
  • Optimal restart intervals produce step sizes comparable to Polyak steps for SGD.
  • The schemes remain effective under significant errors in the estimated KL constants.
  • The approach is applicable to neural network training, including language models (the reported experiments use nanoGPT and ResNet18).
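
For readers unfamiliar with Polyak steps, the sketch below shows the classical rule, step = (f(x) − f*) / ‖g‖², with an extra multiplicative factor `zeta` that mimics a misspecified learning rate in the spirit of the factor ζ > 0 mentioned in the text attached to Figure 1. It is the textbook deterministic rule, not the paper's rule (9); the helper names and the Euclidean-norm test function are assumptions.

```python
import numpy as np

def polyak_step(f_x, f_star, g, zeta=1.0):
    """Classical Polyak step (f(x) - f*) / ||g||^2, scaled by a factor zeta."""
    g_norm_sq = float(np.dot(g, g))
    if g_norm_sq == 0.0:
        return 0.0  # zero subgradient: nothing to do at a stationary point
    return zeta * (f_x - f_star) / g_norm_sq

def subgradient_method(f, subgrad, x0, f_star, steps, zeta=1.0):
    """Subgradient method driven by the (scaled) Polyak step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = subgrad(x)
        x = x - polyak_step(f(x), f_star, g, zeta) * g
    return x

def euclid_norm(x):
    """Toy sharp objective f(x) = ||x||, minimized at 0 with f* = 0."""
    return float(np.linalg.norm(x))

def euclid_subgrad(x):
    """A subgradient of ||x||: x / ||x|| away from 0, and 0 at the origin."""
    r = np.linalg.norm(x)
    return np.zeros_like(x) if r == 0.0 else x / r
```

On this toy objective the exact step (`zeta=1.0`) lands on the minimizer in one iteration, while `zeta=0.5` halves the distance at every iteration, so a scaling error slows convergence without breaking it.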

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same restart logic might be tested on other regularity conditions that replace the KL inequality.
  • Approximate estimation of the exponent could be combined with online restart adjustment in large-scale training.
  • Robustness to misspecification suggests restarts may serve as a practical substitute for exact knowledge of curvature parameters.

Load-bearing premise

The objective functions satisfy a Kurdyka-Łojasiewicz inequality with some exponent in the non-smooth weakly convex stochastic setting.
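
As a reminder of what the premise asks for, the block below states one common normalization of the KL inequality and works out the exponent for the toy objective ‖x − a‖^q that appears in the Figure 2 caption. The normalization is an assumption; the paper may define θ differently, in which case the exponent changes accordingly.

```latex
% One common normalization of the KL inequality (an assumption; the paper's may differ):
\[
  \operatorname{dist}\bigl(0, \partial f(x)\bigr) \;\ge\; c\,\bigl(f(x) - f^\star\bigr)^{\theta},
  \qquad \theta \in [0, 1),\; c > 0 .
\]
% Worked example: f(x) = \|x - a\|^q with q > 1, so f^\star = 0.
\[
  \|\nabla f(x)\| = q\,\|x - a\|^{q-1} = q\,\bigl(f(x)\bigr)^{(q-1)/q}
  \quad\Longrightarrow\quad \theta = 1 - \tfrac{1}{q},\; c = q .
\]
```

Under this convention q = 2 gives θ = 1/2, the familiar Polyak-Łojasiewicz case.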

What would settle it

A controlled experiment on a function known to satisfy the KL inequality in which the observed convergence rate after restarts fails to improve with the exponent or collapses under moderate misspecification of the constants.
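
One way to run such a check is to fit the slope of the gap on a log-log scale and compare it with the reference curve T ↦ T^(−1/(4θ−1)) quoted in the Figure 2 caption. The sketch below does only that comparison; the function names are hypothetical, and the gap measurements would have to come from an actual experiment.

```python
import numpy as np

def empirical_rate(iterations, gaps):
    """Least-squares slope of log(gap) against log(T)."""
    log_t = np.log(np.asarray(iterations, dtype=float))
    log_gap = np.log(np.asarray(gaps, dtype=float))
    slope, _intercept = np.polyfit(log_t, log_gap, deg=1)
    return float(slope)

def predicted_rate(theta):
    """Slope of the reference curve T -> T^(-1/(4*theta - 1)); needs theta > 1/4."""
    if theta <= 0.25:
        raise ValueError("the reference exponent is only defined for theta > 1/4")
    return -1.0 / (4.0 * theta - 1.0)
```

A fitted slope that stays far from `predicted_rate(theta)` across several values of θ would count as evidence against the claimed dependence on the exponent.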

Figures

Figures reproduced from arXiv: 2606.21354 by Alexandre d'Aspremont, Ali Elhishi, Christophe Roux.

Figure 1. Adaptation to the number of restarts τN∗: optimal bound, second-order expansion and τ^(1/(4θ−1)).
Section 4.3, Robustness with respect to learning rate misspecification: In this section, we establish the complexity bound on the primal gap, in the case where the learning rate is misspecified with respect to the definition in (9): we consider a factor ζ > 0 such that the step size becomes α̂ = ζ · (f_{1/(2ρ)}(x_k) − f …

Figure 2. Moreau envelope gap for the best exponential restart schedule using the learning rate defined in (9). The dashed lines correspond to the scaled plot T ↦ T^(−1/(4θ−1)), i.e. the theoretical rates. Left: objectives of the form f_1(x) = ‖x − a‖^q for several values of q. Right: objectives of the form F_{p,τ}(u, z) = 2τ(|u| − u²/4 + ‖z‖^p/p) + ι_C(u, z) for τ = 1 and several values of p.

Figure 3. Final optimality gap for the Moreau envelope after running several constant restart schemes. Left: objective of the form f_2(x) = Σ_i ‖x − a_i‖^2. Right: objective of the form F(u, z) = 2(|u| − u²/4 + ‖z‖^2.5/2.5) + ι_C(u, z).

Figure 4. Final optimality gap for the Moreau envelope, after running several constant and exponential restart schemes. The x-axis indicates the first inner loop's length, which would stay the same in the constant restart scheme, and increase exponentially in the exponential restart scheme. Left: objective of the form f_2(x) = Σ_i ‖x − a_i‖^2. Right: objective of the form F(u, z) = 2(|u| − u²/4 + ‖z‖^2.5/2.5) + ι_C(u…

Figure 5. Final training and validation loss as a function of the number of restarts for nanoGPT trained on the works of Shakespeare. The parameter a is a multiplicative stepsize factor, which acts as an empirical estimate of the unknown theoretical scaling.
Figures 5 and 6 show the final loss as a function of the number of restarts. In both experiments, the restart frequency has a visible effect on performance, an…

Figure 6. Final training and validation loss as a function of the number of restarts for ResNet18 trained on CIFAR10. Here a is a multiplicative stepsize factor, which acts as an empirical estimate of the unknown theoretical scaling.

Original abstract

We study restart schemes in stochastic optimization problems for non-smooth and weakly convex functions that satisfy a Kurdyka-Łojasiewicz (KL) inequality. We show that using restarts allows us to leverage the KL inequalities to achieve improved rates of convergence, with acceleration depending explicitly on the KL exponent. Furthermore, optimal restart schedules lead to learning rates akin to Polyak steps for SGD. While regularity constants such as the KL exponent are typically unknown in practice, we prove that restart schemes are robust to a significant misspecification of these constants, hence nearly adaptive. We detail numerical experiments on both toy problems, where the KL exponent is controlled, and training of Large Language Models (LLMs).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript studies restart schemes for stochastic gradient methods in non-smooth weakly convex optimization problems that satisfy a Kurdyka-Łojasiewicz (KL) inequality. It claims that restarts enable leveraging the KL property to obtain improved convergence rates depending explicitly on the KL exponent, that optimal restart schedules produce learning rates akin to Polyak steps for SGD, and that the schemes remain robust to significant misspecification of the KL constants (hence nearly adaptive). Supporting numerical experiments are presented on toy problems with controlled KL exponents and on LLM training.

Significance. If the central claims hold, the work provides a theoretically grounded mechanism for acceleration in a broad class of stochastic problems via restarts, with explicit dependence on problem geometry through the KL exponent and demonstrated robustness to unknown constants. The combination of analysis for the non-smooth weakly convex stochastic setting and experiments reaching LLM training indicates potential relevance for practical machine-learning optimization.

minor comments (3)
  1. Notation for the KL exponent and related constants should be made uniform between the abstract, §2 (problem setting), and the statements of the main theorems to avoid reader confusion.
  2. The description of the stochastic noise model and how it is absorbed into the KL-based analysis (mentioned in the abstract and §3) would benefit from an explicit remark on the required moment assumptions.
  3. In the experimental section, the toy-problem figures should include error bars or multiple runs to illustrate variability, consistent with the LLM-training plots.
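
The multiple-run reporting requested in minor comment 3 can be sketched as follows: repeat a run over several seeds and report the mean and sample standard deviation of the final gap, which is what an error bar would display. The toy run below (noisy gradient descent on ‖x‖²) and both function names are hypothetical stand-ins, not the paper's experiments.

```python
import numpy as np

def final_gap_one_run(seed, steps=200, alpha=0.05, noise=0.1):
    """Hypothetical toy run: SGD on f(x) = ||x||^2 with Gaussian gradient noise."""
    rng = np.random.default_rng(seed)
    x = np.full(5, 5.0)
    for _ in range(steps):
        x = x - alpha * (2.0 * x + noise * rng.standard_normal(x.shape))
    return float(np.dot(x, x))  # f(x) - f*, since f* = 0

def summarize_runs(run_fn, seeds):
    """Mean and sample standard deviation of a scalar metric across seeds."""
    values = np.array([run_fn(s) for s in seeds], dtype=float)
    return values.mean(), values.std(ddof=1)
```

For example, `summarize_runs(final_gap_one_run, range(10))` returns the pair that would be plotted as a point with its error bar.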

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript, the recognition of its potential relevance, and the recommendation of minor revision. No specific major comments are listed in the report, so we have no individual points requiring detailed rebuttal at this stage. We remain available to incorporate any minor suggestions or clarifications that may arise.

Circularity Check

0 steps flagged

No significant circularity

The paper states the Kurdyka-Łojasiewicz inequality (with explicit exponent) upfront as the defining property of the problem class rather than deriving it internally. All central claims—improved rates via restarts depending on the KL exponent, Polyak-like steps from optimal restart schedules, and robustness to misspecification—are derived analytically from this stated assumption in the non-smooth weakly convex stochastic setting. No load-bearing steps reduce by construction to fitted parameters, self-definitions, or self-citation chains; the derivation chain remains self-contained against the given regularity conditions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the assumption that the objective satisfies a KL inequality with a fixed but possibly unknown exponent; this is a domain assumption imported from prior optimization literature rather than derived here. No free parameters are fitted inside the derivation itself, and no new entities are postulated.

axioms (1)
  • domain assumption: The objective function satisfies a Kurdyka-Łojasiewicz inequality with some exponent θ in the non-smooth weakly convex stochastic setting.
    Invoked throughout the abstract to obtain the accelerated rates and robustness; without this the restart analysis does not apply.

pith-pipeline@v0.9.1-grok · 5650 in / 1357 out tokens · 20327 ms · 2026-06-26T13:58:42.469380+00:00 · methodology
