Recognition: no theorem link
The Grimmer-Shu-Wang Certificate and the Drori-Teboulle Minimax Nonnegative Constant-Stepsize Bound for N >= 3
Pith reviewed 2026-05-13 02:37 UTC · model grok-4.3
The pith
For every number of steps N at least 3, positive vectors exist that satisfy the equations of the strengthened low-rank certificate for the worst-case analysis of gradient descent with constant stepsize.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that for every N greater than or equal to 3, letting rho_N be the unique number in (0,1) solving rho_N to the power 2N times (2N rho_N plus 2N plus 1) equals 1, the certificate equations admit positive vectors a, b, c, d satisfying all residual equations. This is shown via a reduced residual system in the d variables, a simplex existence argument for a positive reduced zero, a terminal residual completion identity, and a tail-square convolution argument proving the cumulative margins that force positivity of the certificate coefficients. As a result, the certificate exists and yields the upper bound for the minimax problem.
What carries the argument
The reduced residual system in the d variables, combined with the terminal residual completion identity and the tail-square convolution that proves the cumulative margins ensuring positivity.
If this is right
- The certificate yields the upper bound on the minimax nonnegative constant-stepsize value for gradient descent.
- Combined with the one-dimensional quadratic and Huber lower-bound examples, this establishes the exact minimax value for every N >= 3.
- The exact value is rho_N, the unique solution to the equation rho_N to the power 2N times (2N rho_N plus 2N plus 1) equals 1.
- The construction works uniformly for all horizons of three steps or more.
Where Pith is reading between the lines
- If the construction generalizes, similar certificates might certify exact bounds for other optimization methods or stepsize rules.
- The tail-square convolution technique could apply to proving positivity in other systems of inequalities arising in performance estimation.
- With the certificate in hand, one could compute the exact rate numerically for specific large N by solving for rho_N rather than the full problem.
- The existence for all N suggests the bound is tight without needing case-by-case verification beyond the given proof structure.
Load-bearing premise
The reduced residual system has a positive zero inside the simplex for the specific rho_N defined by the given polynomial equation.
What would settle it
An explicit computation for some N >= 3 where the tail-square convolution yields a negative cumulative margin, violating positivity of one of the vectors a, b, c, or d.
read the original abstract
We prove, for every horizon N >= 3, the existence of the strengthened low-rank performance-estimation certificate proposed by Grimmer, Shu, and Wang for the Drori-Teboulle minimax nonnegative constant-stepsize problem for gradient descent. Let rho_N in (0,1) be the unique solution of rho_N^{2N}(2N rho_N+2N+1)=1. We show that the GSW certificate equations admit positive vectors a, b, c, d satisfying all residual equations. The proof proceeds through a reduced residual system in the variables d, a simplex existence argument for a positive reduced zero, a terminal residual completion identity, and a tail-square convolution argument proving the cumulative margins that force positivity of the certificate coefficients. Consequently, the GSW low-rank PEP certificate exists for every N >= 3 and yields the Drori-Teboulle upper bound. Together with the one-dimensional quadratic and Huber lower-bound examples, this establishes the Drori-Teboulle minimax nonnegative constant-stepsize value for gradient descent for every N >= 3.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves that for every integer N ≥ 3 the strengthened low-rank GSW performance-estimation certificate exists for the Drori-Teboulle minimax nonnegative constant-stepsize problem for gradient descent. It defines ρ_N ∈ (0,1) as the unique solution of the scalar equation ρ_N^{2N}(2N ρ_N + 2N + 1) = 1 and shows that the GSW residual equations admit strictly positive vectors a, b, c, d by reducing the system to one in the vector d, invoking a simplex-existence argument to obtain a positive reduced zero, applying a terminal residual completion identity, and using a tail-square convolution to establish cumulative margins that force positivity of all certificate coefficients. The resulting certificate yields the Drori-Teboulle upper bound, which together with the one-dimensional quadratic and Huber lower-bound constructions determines the exact minimax value for every N ≥ 3.
Significance. If the claimed existence holds, the result is significant: it supplies an explicit, constructive, parameter-free certificate that closes the finite-horizon gap in the Drori-Teboulle problem for all N ≥ 3. The proof chain (reduced residual system, simplex argument, terminal identity, tail-square convolution) is analytic rather than numerical and directly produces the matching upper bound, thereby completing the characterization of the minimax nonnegative constant-stepsize value for gradient descent. This strengthens the performance-estimation framework by furnishing a verifiable low-rank certificate whose construction is independent of the target bound.
minor comments (2)
- [tail-square convolution argument] The tail-square convolution argument is central to positivity; a brief explicit formula for the convolution together with a one-line verification for N=3 would improve readability without lengthening the proof.
- [simplex existence argument] The simplex-existence step for the positive reduced zero would benefit from a short reference to the precise theorem or corollary invoked (e.g., Brouwer or a variant of the intermediate-value theorem on the simplex).
Simulated Author's Rebuttal
We thank the referee for the careful reading of the manuscript and the positive recommendation to accept. The referee's summary accurately captures the main result: the existence of the strengthened low-rank GSW certificate for every N ≥ 3, together with the explicit construction via the reduced residual system, simplex argument, terminal identity, and tail-square convolution that yields the Drori-Teboulle upper bound.
Circularity Check
Derivation is self-contained via independent analytic arguments
full rationale
The paper constructs the GSW certificate existence for every N >= 3 through an explicit chain: reduction of residual equations to a system in vector d, a simplex-existence argument for a positive reduced zero, a terminal residual completion identity, and a tail-square convolution proving cumulative margins that enforce positivity of a, b, c, d. rho_N is defined directly as the unique root of the explicit equation rho_N^{2N}(2N rho_N + 2N + 1) = 1 chosen to close the identities; the positivity proof then holds analytically for this choice without data-fitting or redefinition of the target bound. No load-bearing self-citations appear, the lower-bound examples (quadratic and Huber) are independent, and the central claim does not reduce to its inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption For every integer N >= 3 the equation rho_N^{2N}(2N rho_N + 2N + 1) = 1 admits a unique solution rho_N in (0,1).
- standard math Standard real-analysis results on existence of positive solutions to linear systems and positivity preservation under convolution.
Reference graph
Works this paper leans on
-
[1]
Y. Drori and M. Teboulle. Performance of first-order methods for smooth convex minimization: a novel approach.Mathematical Programming, 145(1–2):451–482, 2014
work page 2014
-
[2]
A. S. Nemirovski and D. B. Yudin.Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience, 1983
work page 1983
-
[3]
B. T. Polyak.Introduction to Optimization. Optimization Software, Inc., 1987
work page 1987
-
[4]
Y. E. Nesterov. A method of solving a convex programming problem with convergence rate O(1/k2).Soviet Mathematics Doklady, 27(2):372–376, 1983
work page 1983
-
[5]
Nesterov.Introductory Lectures on Convex Optimization: A Basic Course
Y. Nesterov.Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Aca- demic Publishers, 2004
work page 2004
-
[6]
A. B. Taylor, J. M. Hendrickx, and F. Glineur. Smooth strongly convex interpolation and exact worst-case performance of first-order methods.Mathematical Programming, 161:307–345, 2017
work page 2017
- [7]
-
[8]
L. Lessard, B. Recht, and A. Packard. Analysis and design of optimization algorithms via integral quadratic constraints.SIAM Journal on Optimization, 26(1):57–95, 2016
work page 2016
-
[9]
B. Grimmer, K. Shu, and A. L. Wang. A strengthened conjecture on the minimax optimal constant stepsize for gradient descent. arXiv:2407.11739, 2024. 33
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.