Accelerated Convex Optimization via Hamiltonian Dynamics with Deterministic Integration Time

Andre Wibisono; Ashia Wilson; Qiang Fu; Siddharth Mitra; Vishwak Srinivasan; Xiuyuan Wang

arxiv: 2606.17260 · v1 · pith:VISRCN3Qnew · submitted 2026-06-15 · 🧮 math.OC · cs.LG· stat.ML

Accelerated Convex Optimization via Hamiltonian Dynamics with Deterministic Integration Time

Xiuyuan Wang , Vishwak Srinivasan , Qiang Fu , Siddharth Mitra , Ashia Wilson , Andre Wibisono This is my paper

Pith reviewed 2026-06-27 02:12 UTC · model grok-4.3

classification 🧮 math.OC cs.LGstat.ML

keywords Hamiltonian dynamicsconvex optimizationaccelerated convergencedeterministic ratesfirst-order methodssmooth convex functionscontinuous-time algorithms

0 comments

The pith

Averaging Hamiltonian flow trajectories yields deterministic accelerated convergence for smooth convex optimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that Hamiltonian dynamics can be turned into optimization algorithms that reach accelerated rates in a fully deterministic way for smooth convex problems. It does so by proving contraction on the average of the flow trajectories rather than at their individual endpoints. This removes the restriction to quadratic objectives or to convergence only in expectation that limited earlier Hamiltonian approaches. The authors first analyze an idealized continuous-time version and then give discrete-time versions that match the optimal first-order complexity of Nesterov acceleration.

Core claim

Exploiting contraction of averaged Hamiltonian flow trajectories rather than requiring contraction at trajectory endpoints suffices to obtain deterministic accelerated convergence guarantees for smooth convex optimization; the resulting continuous-time idealization and its practical discrete-time implementations achieve optimal first-order complexity.

What carries the argument

Averaged Hamiltonian flow trajectories whose contraction properties drive the accelerated rate.

If this is right

An idealized continuous-time Hamiltonian dynamics algorithm converges at the accelerated rate.
Practical discrete-time implementations achieve the optimal first-order iteration complexity.
The approach extends prior Hamiltonian methods beyond quadratic objectives and beyond convergence in expectation.
Hamiltonian dynamics becomes a usable primitive for deterministic accelerated convex optimization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The averaging idea might transfer to other continuous-time dynamics used in optimization, such as gradient flows with momentum.
It could simplify the design of accelerated methods when only noisy or partial trajectory information is available.
Similar averaging arguments might remove randomness requirements in related stochastic or randomized algorithms.

Load-bearing premise

Contraction on the average of the trajectories is enough to guarantee the accelerated rate even if individual trajectories do not contract at their endpoints.

What would settle it

A smooth convex function on which the averaged Hamiltonian flow contracts yet the optimization error fails to decay at the accelerated rate.

Figures

Figures reproduced from arXiv: 2606.17260 by Andre Wibisono, Ashia Wilson, Qiang Fu, Siddharth Mitra, Vishwak Srinivasan, Xiuyuan Wang.

**Figure 2.** Figure 2: Evolution of function values with time t. We plot f avg and f s−avg because the former is the quantity that we bound; see Remark 1, and the latter is the function value analogue of Xs−avg. We note two key observations from this plot: (a) the function values along the aggregated position decay to 0, and (b) f avg(X0;t), f s−avg(X0;t) oscillates about 1 2 f(X0) with varying amplitudes. The first observat… view at source ↗

**Figure 3.** Figure 3: The dotted lines are the boundaries of the [PITH_FULL_IMAGE:figures/full_fig_p031_3.png] view at source ↗

**Figure 4.** Figure 4: Evolution of function values with time t. Here, the dotted line represents 1 2 f(X0). We see that the time averages of the function values f avg(X0;t) and f s−avg(X0;t) oscillates about a value that is slightly higher than this line with varying amplitudes. This does not invalidate the result of Corollary 1 which suggests the level 2 3 f(X0) ≈ 0.48. Consistent with the variation shows in [PITH_FULL_IMAGE… view at source ↗

**Figure 5.** Figure 5: Comparison between GD ( ), AGD ( ), and dHFA-eg ( • ). γk is set to 1− √ηα 1+√ηα and for dHFA-eg, λ = 0 and Nk = ⌈ 2 η √ α ⌉ for all iterations k according to Corollary 4. The step size η to set to 1 L for GD and AGD, and to √ 1 L for dHFA-eg for the linear regression problem. For the logistic regression problem, we performed grid search on a collection of 25 uniformly log-spaced values from 10−5 to 1 to d… view at source ↗

**Figure 6.** Figure 6: Comparison between averaging schemes. The plots depict three curves per setting: [PITH_FULL_IMAGE:figures/full_fig_p051_6.png] view at source ↗

**Figure 7.** Figure 7: Comparison between integrators. The plots depict three curves per setting: [PITH_FULL_IMAGE:figures/full_fig_p051_7.png] view at source ↗

read the original abstract

We develop Hamiltonian dynamics-based algorithms for smooth convex optimization that achieve accelerated rates of convergence. By exploiting contraction of averaged Hamiltonian flow trajectories rather than requiring contraction at trajectory endpoints, we show that Hamiltonian dynamics-based optimization methods admit deterministic and accelerated convergence guarantees, extending prior work that is limited to quadratic objectives or holds only in expectation. We analyze an idealized continuous-time algorithm and derive practical discrete-time implementations with optimal first-order complexity, thereby establishing Hamiltonian dynamics as a useful algorithmic primitive for deterministic accelerated convex optimization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper claims deterministic accelerated rates for Hamiltonian optimization by contracting averaged trajectories, but the discretization step is the part that needs the most scrutiny.

read the letter

The main point is that this work gets deterministic accelerated convergence guarantees for smooth convex optimization by exploiting contraction on averaged Hamiltonian flow trajectories instead of requiring it at the endpoints. That lets them move past the quadratic or expectation-only limits in earlier results.

They lay out the continuous-time analysis cleanly and then produce discrete implementations that hit the optimal first-order complexity. The averaged-trajectory device is a sensible technical step for securing determinism while keeping acceleration, and the paper treats the continuous case as an idealized primitive before discretizing.

The soft spot is exactly the one the stress-test flags. The claim depends on the discrete updates in section 4 preserving the averaged contraction without error terms that would force the rates back to expectation only. If those bounds are tight and explicit, the deterministic guarantee holds; if they are loose or absent, the main selling point weakens. Nothing else in the abstract suggests circularity or missing citations.

This is for readers already working on continuous-time views of accelerated methods or looking for deterministic first-order algorithms. Someone building reliable training pipelines could find the perspective useful as a starting point. The work is concrete enough to deserve a serious referee so the discretization details can be checked directly.

Referee Report

2 major / 1 minor

Summary. The manuscript develops Hamiltonian dynamics-based algorithms for smooth convex optimization that achieve accelerated rates of convergence. By exploiting contraction of averaged Hamiltonian flow trajectories rather than requiring contraction at trajectory endpoints, the authors claim that these methods admit deterministic and accelerated convergence guarantees, extending prior work limited to quadratic objectives or holding only in expectation. The paper analyzes an idealized continuous-time algorithm and derives practical discrete-time implementations with optimal first-order complexity.

Significance. If the central claims hold, the work would establish Hamiltonian dynamics as a useful algorithmic primitive for deterministic accelerated convex optimization, providing a deterministic alternative to expectation-based analyses and extending beyond quadratic cases.

major comments (2)

[§3] §3: The claim that contraction of averaged Hamiltonian flow trajectories suffices to obtain deterministic accelerated rates requires an explicit Lyapunov function or contraction mapping derivation showing the rate; the high-level description supplies no such supporting argument or bound.
[§4] §4: The discretization step must include explicit error bounds demonstrating that the averaged-trajectory contraction property transfers to the discrete endpoints used for the optimization iterates without introducing bias or variance terms that revert the guarantee to expectation-only; no such bounds are indicated.

minor comments (1)

The abstract would benefit from stating the precise accelerated rate (e.g., O(1/k²)) achieved by the discrete implementations to support the optimality claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The two major comments identify areas where the derivations can be presented more explicitly. We will revise the manuscript to address both points directly.

read point-by-point responses

Referee: [§3] The claim that contraction of averaged Hamiltonian flow trajectories suffices to obtain deterministic accelerated rates requires an explicit Lyapunov function or contraction mapping derivation showing the rate; the high-level description supplies no such supporting argument or bound.

Authors: We agree that an explicit derivation strengthens the claim. In the revision we will add a self-contained subsection in §3 that constructs the Lyapunov function V(t) = t²(f(x(t)) − f*) + ‖y(t) − x*‖² and computes its derivative along the averaged flow, yielding dV/dt ≤ −c t (f(x) − f*) and the deterministic rate O(1/t²) without expectation. revision: yes
Referee: [§4] The discretization step must include explicit error bounds demonstrating that the averaged-trajectory contraction property transfers to the discrete endpoints used for the optimization iterates without introducing bias or variance terms that revert the guarantee to expectation-only; no such bounds are indicated.

Authors: We accept the request for explicit bounds. The revised §4 will include a discretization-error lemma that bounds the difference between the continuous averaged trajectory and the discrete iterates by O(h) (step-size) terms; these additive errors are absorbed into the Lyapunov analysis without introducing stochastic bias, preserving the deterministic accelerated guarantee. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation self-contained against external benchmarks

full rationale

The abstract and description present a continuous-time Hamiltonian analysis followed by discrete implementations achieving optimal rates, with the key step being contraction of averaged trajectories rather than endpoints. No equations, self-citations, fitted parameters, or ansatzes are provided in the visible text that would allow reduction of any claimed rate to an input by construction. The approach extends prior work on quadratics or stochastic settings without invoking load-bearing self-citations or renaming known results. This is the standard case of an independent derivation whose validity rests on external verification of the contraction and discretization arguments rather than internal redefinition.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No details on free parameters, axioms, or invented entities are present in the abstract.

pith-pipeline@v0.9.1-grok · 5620 in / 974 out tokens · 30402 ms · 2026-06-27T02:12:40.138283+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

90 extracted references · 2 canonical work pages

[1]

arXiv preprint arXiv:2507.13259 , year=

On Accelerated Mixing of the No-U-turn Sampler , author=. arXiv preprint arXiv:2507.13259 , year=

arXiv
[2]

Electronic Communications in Probability , VOLUME =

Rudelson, Mark and Vershynin, Roman , TITLE =. Electronic Communications in Probability , VOLUME =. 2013 , PAGES =

2013
[3]

, title =

Diakonikolas, Jelena and Jordan, Michael I. , title =. SIAM Journal on Optimization , volume =
[4]

2021 , eprint=

A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip , author=. 2021 , eprint=

2021
[5]

Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time , volume =

Fu, Qiang and Wibisono, Andre , booktitle =. Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time , volume =
[6]

Virtanen, Pauli and Gommers, Ralf and Oliphant, Travis E and Haberland, Matt and Reddy, Tyler and Cournapeau, David and Burovski, Evgeni and Peterson, Pearu and Weckesser, Warren and Bright, Jonathan and others , journal=. Sci. 2020 , publisher=

2020
[7]

Jun-Kun Wang , year=
[8]

2023 , eprint=

Accelerating Hamiltonian Monte Carlo via Chebyshev Integration Time , author=. 2023 , eprint=

2023
[9]

Theory of Computing , volume=

Optimal convergence rate of Hamiltonian Monte Carlo for strongly logconcave distributions , author=. Theory of Computing , volume=. 2022 , publisher=

2022
[10]

2019 , eprint=

Optimal Convergence Rate of Hamiltonian Monte Carlo for Strongly Logconcave Distributions , author=. 2019 , eprint=

2019
[11]

The Annals of Applied Probability , publisher=

Bou-Rabee, Nawaf and Sanz-Serna, Jesús María , year=. The Annals of Applied Probability , publisher=
[12]

Oberwolfach Reports , volume=

Geometric numerical integration , author=. Oberwolfach Reports , volume=
[13]

Accelerating Rescaled Gradient Descent: Fast Optimization of Smooth Functions , volume =

Wilson, Ashia and Mackey, Lester and Wibisono, Andre , booktitle =. Accelerating Rescaled Gradient Descent: Fast Optimization of Smooth Functions , volume =
[14]

Weijie Su and Stephen Boyd and Emmanuel J. Cand. Journal of Machine Learning Research , year =
[15]

2018 , eprint=

Hamiltonian Descent Methods , author=. 2018 , eprint=

2018
[16]

Vempala and Andre Wibisono , title = "

Santosh S. Vempala and Andre Wibisono , title = ". Geometric Aspects of Functional Analysis , publisher =. 2023 , pages =

2023
[17]

The variational formulation of the

Jordan, Richard and Kinderlehrer, David and Otto, Felix , journal=. The variational formulation of the. 1998 , publisher=

1998
[18]

Proceedings of the 31st Conference On Learning Theory , pages =

Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem , author =. Proceedings of the 31st Conference On Learning Theory , pages =. 2018 , volume =

2018
[19]

Roberts and Richard L

Gareth O. Roberts and Richard L. Tweedie , journal =. Exponential convergence of Langevin distributions and their discrete approximations
[20]

Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent

Dalalyan, Arnak , booktitle=. Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent. 2017 , organization=

2017
[21]

Introductory Lectures on Convex Optimization: A Basic Course

Nesterov, Yurii , year=. Introductory Lectures on Convex Optimization: A Basic Course
[22]

2004 , publisher=

Convex optimization , author=. 2004 , publisher=

2004
[23]

Hybrid Monte Carlo

Duane, Simon and Kennedy, Anthony D and Pendleton, Brian J and Roweth, Duncan , journal=. Hybrid Monte Carlo. 1987 , publisher=

1987
[24]

MCMC using Hamiltonian dynamics

Neal, Radford M , journal=". MCMC using Hamiltonian dynamics. 2011 , publisher=

2011
[25]

Stan: A probabilistic programming language

Carpenter, Bob and Gelman, Andrew and Hoffman, Matthew D and Lee, Daniel and Goodrich, Ben and Betancourt, Michael and Brubaker, Marcus and Guo, Jiqiang and Li, Peter and Riddell, Allen , journal=. Stan: A probabilistic programming language
[26]

The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo

Hoffman, Matthew D and Gelman, Andrew , journal=. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo
[27]

On the dissipation of ideal Hamiltonian Monte Carlo sampler

Jiang, Qijia , journal=. On the dissipation of ideal Hamiltonian Monte Carlo sampler. 2023 , publisher=

2023
[28]

Hamiltonian Monte Carlo for efficient Gaussian sampling: long and random steps

Apers, Simon and Gribling, Sander and Szil. Hamiltonian Monte Carlo for efficient Gaussian sampling: long and random steps. Journal of Machine Learning Research , volume=
[29]

2019 , organization=

Teel, Andrew R and Poveda, Jorge I and Le, Justin , booktitle=. 2019 , organization=

2019
[30]

2021 , publisher=

Diakonikolas, Jelena and Jordan, Michael I , journal=. 2021 , publisher=

2021
[31]

The exact information-based complexity of smooth convex minimization , journal =

Yoel Drori , keywords =. The exact information-based complexity of smooth convex minimization , journal =. 2017 , issn =. doi:https://doi.org/10.1016/j.jco.2016.11.001 , url =

work page doi:10.1016/j.jco.2016.11.001 2017
[32]

Kim, Donghwan and Fessler, Jeffrey A. , year=. Optimized first-order methods for smooth convex minimization , volume=. Mathematical Programming , publisher=. doi:10.1007/s10107-015-0949-3 , number=

work page doi:10.1007/s10107-015-0949-3
[33]

Nesterov, Yurii , booktitle=
[34]

2018 , publisher=

Lectures on convex optimization , author=. 2018 , publisher=

2018
[35]

USSR Computational Mathematics and Mathematical Physics , volume=

Some methods of speeding up the convergence of iteration methods , author=. USSR Computational Mathematics and Mathematical Physics , volume=. 1964 , publisher=

1964
[36]

Proceedings of the National Academy of Sciences , volume=

A variational perspective on accelerated methods in optimization , author=. Proceedings of the National Academy of Sciences , volume=. 2016 , publisher=

2016
[37]

Wilson, Ashia C and Recht, Ben and Jordan, Michael I , journal=
[38]

De Luca, G Bruno and Gatti, Alice and Silverstein, Eva , journal=
[39]

Kingma, Diederik P and Ba, Jimmy , booktitle=
[40]

International Conference on Machine Learning , pages=

Continuous-time analysis of accelerated gradient methods via conservation laws in dilated coordinate systems , author=. International Conference on Machine Learning , pages=. 2022 , organization=

2022
[41]

Nesterov, Yurii , title =
[42]

, title =

Dalalyan, Arnak S. , title =. Journal of the Royal Statistical Society: Series B (Statistical Methodology) , volume =
[43]

Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms , volume =

Even, Mathieu and Berthier, Rapha\". Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms , volume =. Advances in Neural Information Processing Systems , pages =
[44]

Chatterji and Xiang Cheng and Nicolas Flammarion and Peter L

Yi-An Ma and Niladri S. Chatterji and Xiang Cheng and Nicolas Flammarion and Peter L. Bartlett and Michael I. Jordan , title =. Bernoulli , number =
[45]

Journal of Machine Learning Research , volume=

Durmus, Alain and Majewski, Szymon and Miasojedow, B. Journal of Machine Learning Research , volume=
[46]

Proceedings of Thirty Fourth Conference on Learning Theory , pages =

Structured Logconcave Sampling with a Restricted Gaussian Oracle , author =. Proceedings of Thirty Fourth Conference on Learning Theory , pages =. 2021 , volume =

2021
[47]

Wright , title =

Shi Chen and Qin Li and Oliver Tse and Stephen J. Wright , title =. Journal of Machine Learning Research , year =
[48]

International Conference on Machine Learning , pages=

Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein space , author=. International Conference on Machine Learning , pages=. 2023 , organization=

2023
[49]

Cauchy, Augustin , journal =. M
[50]

Betancourt, Michael , journal=
[51]

Proceedings of Thirty Seventh Conference on Learning Theory , pages =

Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency , author =. Proceedings of Thirty Seventh Conference on Learning Theory , pages =. 2024 , volume =

2024
[52]

The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression , author=. The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=
[53]

Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates , volume =

Salim, Adil and Kovalev, Dmitry and Richtarik, Peter , booktitle =. Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates , volume =
[54]

Mixing time guarantees for unadjusted Hamiltonian Monte Carlo

Bou-Rabee, Nawaf and Eberle, Andreas , journal=. Mixing time guarantees for unadjusted Hamiltonian Monte Carlo. 2023 , publisher=

2023
[55]

Convergence of unadjusted Hamiltonian Monte Carlo for mean-field models

Bou-Rabee, Nawaf and Schuh, Katharina , journal=. Convergence of unadjusted Hamiltonian Monte Carlo for mean-field models. 2023 , publisher=

2023
[56]

Tail-Sensitive KL and R\'enyi Convergence of Unadjusted Hamiltonian Monte Carlo via One-Shot Couplings

Bou-Rabee, Nawaf and Mitra, Siddharth and Wibisono, Andre , journal=. Tail-Sensitive KL and R\'enyi Convergence of Unadjusted Hamiltonian Monte Carlo via One-Shot Couplings
[57]

Bou-Rabee, Nawaf and Sanz-Serna, J. M. , year=. Acta Numerica , publisher=
[58]

2006 , PAGES =

Hairer, Ernst and Lubich, Christian and Wanner, Gerhard , TITLE =. 2006 , PAGES =

2006
[59]

Wasserstein

Zhang, Kelvin Shuangjian and Peyr. Wasserstein. Proceedings of Thirty Third Conference on Learning Theory , pages =. 2020 , volume =

2020
[60]

Proceedings of The 36th International Conference on Algorithmic Learning Theory , pages =

High-accuracy sampling from constrained spaces with the Metropolis-adjusted Preconditioned Langevin Algorithm , author =. Proceedings of The 36th International Conference on Algorithmic Learning Theory , pages =. 2025 , volume =

2025
[61]

Simulating Hamiltonian Dynamics , publisher=

Leimkuhler, Benedict and Reich, Sebastian , year=. Simulating Hamiltonian Dynamics , publisher=
[62]

Journal of the ACM , volume=

Acceleration by stepsize hedging: Multi-step descent and the silver stepsize schedule , author=. Journal of the ACM , volume=. 2025 , publisher=

2025
[63]

Mathematical Programming , volume=

Acceleration by stepsize hedging: Silver Stepsize Schedule for smooth convex optimization , author=. Mathematical Programming , volume=. 2025 , publisher=

2025
[64]

Advances in Neural Information Processing Systems , volume=

Accelerated mirror descent in continuous and discrete time , author=. Advances in Neural Information Processing Systems , volume=
[65]

Dissipativity Theory for

Bin Hu and Laurent Lessard , booktitle =. Dissipativity Theory for. 2017 , volume =

2017
[66]

arXiv preprint arXiv:1809.05042 , year=

Hamiltonian descent methods , author=. arXiv preprint arXiv:1809.05042 , year=

Pith/arXiv arXiv
[67]

A Dynamical Systems Perspective on

Muehlebach, Michael and Jordan, Michael , booktitle =. A Dynamical Systems Perspective on. 2019 , volume =

2019
[68]

Advances in Neural Information Processing Systems , volume=

Hamiltonian descent for composite objectives , author=. Advances in Neural Information Processing Systems , volume=
[69]

Conference on Learning Theory , pages=

Near-optimal methods for minimizing star-convex functions and beyond , author=. Conference on Learning Theory , pages=. 2020 , organization=

2020
[70]

Mathematical Programming , pages=

Understanding the acceleration phenomenon via high-resolution differential equations , author=. Mathematical Programming , pages=. 2022 , publisher=

2022
[71]

International Conference on Machine Learning , pages=

Accelerated stochastic optimization methods under quasar-convexity , author=. International Conference on Machine Learning , pages=. 2023 , organization=

2023
[72]

International Conference on Learning and Representations , year=

Continuized acceleration for quasar convex functions in non-convex optimization , author=. International Conference on Learning and Representations , year=
[73]

2022 , publisher=

Attouch, Hedy and Chbani, Zaki and Fadili, Jalal and Riahi, Hassan , journal=. 2022 , publisher=

2022
[74]

Foundations and Trends in Optimization , volume=

Acceleration methods , author=. Foundations and Trends in Optimization , volume=. 2021 , publisher=

2021
[75]

SIAM Journal on Optimization , volume =

Grimmer, Benjamin , title =. SIAM Journal on Optimization , volume =
[76]

Mathematical Programming , volume=

Performance of first-order methods for smooth convex minimization: a novel approach , author=. Mathematical Programming , volume=. 2014 , publisher=

2014
[77]

A geometric alternative to

Bubeck, S. A geometric alternative to. arXiv preprint arXiv:1506.08187 , year=

Pith/arXiv arXiv
[78]

SIAM Journal on Optimization , volume=

An optimal first order method based on optimal quadratic averaging , author=. SIAM Journal on Optimization , volume=. 2018 , publisher=

2018
[79]

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak- ojasiewicz Condition

Karimi, Hamed and Nutini, Julie and Schmidt, Mark. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak- ojasiewicz Condition. Machine Learning and Knowledge Discovery in Databases. 2016

2016
[80]

arXiv preprint arXiv:1303.4645 , year=

Gradient methods for convex minimization: better rates under weaker conditions , author=. arXiv preprint arXiv:1303.4645 , year=

Pith/arXiv arXiv

Showing first 80 references.

[1] [1]

arXiv preprint arXiv:2507.13259 , year=

On Accelerated Mixing of the No-U-turn Sampler , author=. arXiv preprint arXiv:2507.13259 , year=

arXiv

[2] [2]

Electronic Communications in Probability , VOLUME =

Rudelson, Mark and Vershynin, Roman , TITLE =. Electronic Communications in Probability , VOLUME =. 2013 , PAGES =

2013

[3] [3]

, title =

Diakonikolas, Jelena and Jordan, Michael I. , title =. SIAM Journal on Optimization , volume =

[4] [4]

2021 , eprint=

A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip , author=. 2021 , eprint=

2021

[5] [5]

Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time , volume =

Fu, Qiang and Wibisono, Andre , booktitle =. Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time , volume =

[6] [6]

Virtanen, Pauli and Gommers, Ralf and Oliphant, Travis E and Haberland, Matt and Reddy, Tyler and Cournapeau, David and Burovski, Evgeni and Peterson, Pearu and Weckesser, Warren and Bright, Jonathan and others , journal=. Sci. 2020 , publisher=

2020

[7] [7]

Jun-Kun Wang , year=

[8] [8]

2023 , eprint=

Accelerating Hamiltonian Monte Carlo via Chebyshev Integration Time , author=. 2023 , eprint=

2023

[9] [9]

Theory of Computing , volume=

Optimal convergence rate of Hamiltonian Monte Carlo for strongly logconcave distributions , author=. Theory of Computing , volume=. 2022 , publisher=

2022

[10] [10]

2019 , eprint=

Optimal Convergence Rate of Hamiltonian Monte Carlo for Strongly Logconcave Distributions , author=. 2019 , eprint=

2019

[11] [11]

The Annals of Applied Probability , publisher=

Bou-Rabee, Nawaf and Sanz-Serna, Jesús María , year=. The Annals of Applied Probability , publisher=

[12] [12]

Oberwolfach Reports , volume=

Geometric numerical integration , author=. Oberwolfach Reports , volume=

[13] [13]

Accelerating Rescaled Gradient Descent: Fast Optimization of Smooth Functions , volume =

Wilson, Ashia and Mackey, Lester and Wibisono, Andre , booktitle =. Accelerating Rescaled Gradient Descent: Fast Optimization of Smooth Functions , volume =

[14] [14]

Weijie Su and Stephen Boyd and Emmanuel J. Cand. Journal of Machine Learning Research , year =

[15] [15]

2018 , eprint=

Hamiltonian Descent Methods , author=. 2018 , eprint=

2018

[16] [16]

Vempala and Andre Wibisono , title = "

Santosh S. Vempala and Andre Wibisono , title = ". Geometric Aspects of Functional Analysis , publisher =. 2023 , pages =

2023

[17] [17]

The variational formulation of the

Jordan, Richard and Kinderlehrer, David and Otto, Felix , journal=. The variational formulation of the. 1998 , publisher=

1998

[18] [18]

Proceedings of the 31st Conference On Learning Theory , pages =

Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem , author =. Proceedings of the 31st Conference On Learning Theory , pages =. 2018 , volume =

2018

[19] [19]

Roberts and Richard L

Gareth O. Roberts and Richard L. Tweedie , journal =. Exponential convergence of Langevin distributions and their discrete approximations

[20] [20]

Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent

Dalalyan, Arnak , booktitle=. Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent. 2017 , organization=

2017

[21] [21]

Introductory Lectures on Convex Optimization: A Basic Course

Nesterov, Yurii , year=. Introductory Lectures on Convex Optimization: A Basic Course

[22] [22]

2004 , publisher=

Convex optimization , author=. 2004 , publisher=

2004

[23] [23]

Hybrid Monte Carlo

Duane, Simon and Kennedy, Anthony D and Pendleton, Brian J and Roweth, Duncan , journal=. Hybrid Monte Carlo. 1987 , publisher=

1987

[24] [24]

MCMC using Hamiltonian dynamics

Neal, Radford M , journal=". MCMC using Hamiltonian dynamics. 2011 , publisher=

2011

[25] [25]

Stan: A probabilistic programming language

Carpenter, Bob and Gelman, Andrew and Hoffman, Matthew D and Lee, Daniel and Goodrich, Ben and Betancourt, Michael and Brubaker, Marcus and Guo, Jiqiang and Li, Peter and Riddell, Allen , journal=. Stan: A probabilistic programming language

[26] [26]

The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo

Hoffman, Matthew D and Gelman, Andrew , journal=. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo

[27] [27]

On the dissipation of ideal Hamiltonian Monte Carlo sampler

Jiang, Qijia , journal=. On the dissipation of ideal Hamiltonian Monte Carlo sampler. 2023 , publisher=

2023

[28] [28]

Hamiltonian Monte Carlo for efficient Gaussian sampling: long and random steps

Apers, Simon and Gribling, Sander and Szil. Hamiltonian Monte Carlo for efficient Gaussian sampling: long and random steps. Journal of Machine Learning Research , volume=

[29] [29]

2019 , organization=

Teel, Andrew R and Poveda, Jorge I and Le, Justin , booktitle=. 2019 , organization=

2019

[30] [30]

2021 , publisher=

Diakonikolas, Jelena and Jordan, Michael I , journal=. 2021 , publisher=

2021

[31] [31]

The exact information-based complexity of smooth convex minimization , journal =

Yoel Drori , keywords =. The exact information-based complexity of smooth convex minimization , journal =. 2017 , issn =. doi:https://doi.org/10.1016/j.jco.2016.11.001 , url =

work page doi:10.1016/j.jco.2016.11.001 2017

[32] [32]

Kim, Donghwan and Fessler, Jeffrey A. , year=. Optimized first-order methods for smooth convex minimization , volume=. Mathematical Programming , publisher=. doi:10.1007/s10107-015-0949-3 , number=

work page doi:10.1007/s10107-015-0949-3

[33] [33]

Nesterov, Yurii , booktitle=

[34] [34]

2018 , publisher=

Lectures on convex optimization , author=. 2018 , publisher=

2018

[35] [35]

USSR Computational Mathematics and Mathematical Physics , volume=

Some methods of speeding up the convergence of iteration methods , author=. USSR Computational Mathematics and Mathematical Physics , volume=. 1964 , publisher=

1964

[36] [36]

Proceedings of the National Academy of Sciences , volume=

A variational perspective on accelerated methods in optimization , author=. Proceedings of the National Academy of Sciences , volume=. 2016 , publisher=

2016

[37] [37]

Wilson, Ashia C and Recht, Ben and Jordan, Michael I , journal=

[38] [38]

De Luca, G Bruno and Gatti, Alice and Silverstein, Eva , journal=

[39] [39]

Kingma, Diederik P and Ba, Jimmy , booktitle=

[40] [40]

International Conference on Machine Learning , pages=

Continuous-time analysis of accelerated gradient methods via conservation laws in dilated coordinate systems , author=. International Conference on Machine Learning , pages=. 2022 , organization=

2022

[41] [41]

Nesterov, Yurii , title =

[42] [42]

, title =

Dalalyan, Arnak S. , title =. Journal of the Royal Statistical Society: Series B (Statistical Methodology) , volume =

[43] [43]

Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms , volume =

Even, Mathieu and Berthier, Rapha\". Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms , volume =. Advances in Neural Information Processing Systems , pages =

[44] [44]

Chatterji and Xiang Cheng and Nicolas Flammarion and Peter L

Yi-An Ma and Niladri S. Chatterji and Xiang Cheng and Nicolas Flammarion and Peter L. Bartlett and Michael I. Jordan , title =. Bernoulli , number =

[45] [45]

Journal of Machine Learning Research , volume=

Durmus, Alain and Majewski, Szymon and Miasojedow, B. Journal of Machine Learning Research , volume=

[46] [46]

Proceedings of Thirty Fourth Conference on Learning Theory , pages =

Structured Logconcave Sampling with a Restricted Gaussian Oracle , author =. Proceedings of Thirty Fourth Conference on Learning Theory , pages =. 2021 , volume =

2021

[47] [47]

Wright , title =

Shi Chen and Qin Li and Oliver Tse and Stephen J. Wright , title =. Journal of Machine Learning Research , year =

[48] [48]

International Conference on Machine Learning , pages=

Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein space , author=. International Conference on Machine Learning , pages=. 2023 , organization=

2023

[49] [49]

Cauchy, Augustin , journal =. M

[50] [50]

Betancourt, Michael , journal=

[51] [51]

Proceedings of Thirty Seventh Conference on Learning Theory , pages =

Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency , author =. Proceedings of Thirty Seventh Conference on Learning Theory , pages =. 2024 , volume =

2024

[52] [52]

The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression , author=. The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

[53] [53]

Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates , volume =

Salim, Adil and Kovalev, Dmitry and Richtarik, Peter , booktitle =. Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates , volume =

[54] [54]

Mixing time guarantees for unadjusted Hamiltonian Monte Carlo

Bou-Rabee, Nawaf and Eberle, Andreas , journal=. Mixing time guarantees for unadjusted Hamiltonian Monte Carlo. 2023 , publisher=

2023

[55] [55]

Convergence of unadjusted Hamiltonian Monte Carlo for mean-field models

Bou-Rabee, Nawaf and Schuh, Katharina , journal=. Convergence of unadjusted Hamiltonian Monte Carlo for mean-field models. 2023 , publisher=

2023

[56] [56]

Tail-Sensitive KL and R\'enyi Convergence of Unadjusted Hamiltonian Monte Carlo via One-Shot Couplings

Bou-Rabee, Nawaf and Mitra, Siddharth and Wibisono, Andre , journal=. Tail-Sensitive KL and R\'enyi Convergence of Unadjusted Hamiltonian Monte Carlo via One-Shot Couplings

[57] [57]

Bou-Rabee, Nawaf and Sanz-Serna, J. M. , year=. Acta Numerica , publisher=

[58] [58]

2006 , PAGES =

Hairer, Ernst and Lubich, Christian and Wanner, Gerhard , TITLE =. 2006 , PAGES =

2006

[59] [59]

Wasserstein

Zhang, Kelvin Shuangjian and Peyr. Wasserstein. Proceedings of Thirty Third Conference on Learning Theory , pages =. 2020 , volume =

2020

[60] [60]

Proceedings of The 36th International Conference on Algorithmic Learning Theory , pages =

High-accuracy sampling from constrained spaces with the Metropolis-adjusted Preconditioned Langevin Algorithm , author =. Proceedings of The 36th International Conference on Algorithmic Learning Theory , pages =. 2025 , volume =

2025

[61] [61]

Simulating Hamiltonian Dynamics , publisher=

Leimkuhler, Benedict and Reich, Sebastian , year=. Simulating Hamiltonian Dynamics , publisher=

[62] [62]

Journal of the ACM , volume=

Acceleration by stepsize hedging: Multi-step descent and the silver stepsize schedule , author=. Journal of the ACM , volume=. 2025 , publisher=

2025

[63] [63]

Mathematical Programming , volume=

Acceleration by stepsize hedging: Silver Stepsize Schedule for smooth convex optimization , author=. Mathematical Programming , volume=. 2025 , publisher=

2025

[64] [64]

Advances in Neural Information Processing Systems , volume=

Accelerated mirror descent in continuous and discrete time , author=. Advances in Neural Information Processing Systems , volume=

[65] [65]

Dissipativity Theory for

Bin Hu and Laurent Lessard , booktitle =. Dissipativity Theory for. 2017 , volume =

2017

[66] [66]

arXiv preprint arXiv:1809.05042 , year=

Hamiltonian descent methods , author=. arXiv preprint arXiv:1809.05042 , year=

Pith/arXiv arXiv

[67] [67]

A Dynamical Systems Perspective on

Muehlebach, Michael and Jordan, Michael , booktitle =. A Dynamical Systems Perspective on. 2019 , volume =

2019

[68] [68]

Advances in Neural Information Processing Systems , volume=

Hamiltonian descent for composite objectives , author=. Advances in Neural Information Processing Systems , volume=

[69] [69]

Conference on Learning Theory , pages=

Near-optimal methods for minimizing star-convex functions and beyond , author=. Conference on Learning Theory , pages=. 2020 , organization=

2020

[70] [70]

Mathematical Programming , pages=

Understanding the acceleration phenomenon via high-resolution differential equations , author=. Mathematical Programming , pages=. 2022 , publisher=

2022

[71] [71]

International Conference on Machine Learning , pages=

Accelerated stochastic optimization methods under quasar-convexity , author=. International Conference on Machine Learning , pages=. 2023 , organization=

2023

[72] [72]

International Conference on Learning and Representations , year=

Continuized acceleration for quasar convex functions in non-convex optimization , author=. International Conference on Learning and Representations , year=

[73] [73]

2022 , publisher=

Attouch, Hedy and Chbani, Zaki and Fadili, Jalal and Riahi, Hassan , journal=. 2022 , publisher=

2022

[74] [74]

Foundations and Trends in Optimization , volume=

Acceleration methods , author=. Foundations and Trends in Optimization , volume=. 2021 , publisher=

2021

[75] [75]

SIAM Journal on Optimization , volume =

Grimmer, Benjamin , title =. SIAM Journal on Optimization , volume =

[76] [76]

Mathematical Programming , volume=

Performance of first-order methods for smooth convex minimization: a novel approach , author=. Mathematical Programming , volume=. 2014 , publisher=

2014

[77] [77]

A geometric alternative to

Bubeck, S. A geometric alternative to. arXiv preprint arXiv:1506.08187 , year=

Pith/arXiv arXiv

[78] [78]

SIAM Journal on Optimization , volume=

An optimal first order method based on optimal quadratic averaging , author=. SIAM Journal on Optimization , volume=. 2018 , publisher=

2018

[79] [79]

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak- ojasiewicz Condition

Karimi, Hamed and Nutini, Julie and Schmidt, Mark. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak- ojasiewicz Condition. Machine Learning and Knowledge Discovery in Databases. 2016

2016

[80] [80]

arXiv preprint arXiv:1303.4645 , year=

Gradient methods for convex minimization: better rates under weaker conditions , author=. arXiv preprint arXiv:1303.4645 , year=

Pith/arXiv arXiv