arxiv: 2605.11421 · v1 · submitted 2026-05-12 · 🧮 math.OC

Recognition: no theorem link

The Grimmer-Shu-Wang Certificate and the Drori-Teboulle Minimax Nonnegative Constant-Stepsize Bound for N >= 3

Lixing Zhang

Authors on Pith no claims yet

Pith reviewed 2026-05-13 02:37 UTC · model grok-4.3

classification 🧮 math.OC

keywords gradient descentperformance estimationminimax analysisconstant stepsizelow-rank certificateresidual systemworst-case boundsconvergence rate

0 comments

The pith

For every number of steps N at least 3, positive vectors exist that satisfy the equations of the strengthened low-rank certificate for the worst-case analysis of gradient descent with constant stepsize.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that the strengthened low-rank performance-estimation certificate can be built for the problem of determining the minimax value of gradient descent performance under nonnegative constant stepsizes, for any horizon length N of three or more. It constructs the certificate by finding positive coefficients a, b, c, d that make all residual equations hold. A sympathetic reader would care because this confirms the upper bound on the worst-case contraction factor, and when combined with matching lower bounds from simple function examples, gives the exact value of that minimax quantity. The proof relies on reducing the system, an existence argument on a simplex, and verifying positivity through convolution of squared tails.

Core claim

The central claim is that for every N greater than or equal to 3, letting rho_N be the unique number in (0,1) solving rho_N to the power 2N times (2N rho_N plus 2N plus 1) equals 1, the certificate equations admit positive vectors a, b, c, d satisfying all residual equations. This is shown via a reduced residual system in the d variables, a simplex existence argument for a positive reduced zero, a terminal residual completion identity, and a tail-square convolution argument proving the cumulative margins that force positivity of the certificate coefficients. As a result, the certificate exists and yields the upper bound for the minimax problem.

What carries the argument

The reduced residual system in the d variables, combined with the terminal residual completion identity and the tail-square convolution that proves the cumulative margins ensuring positivity.

If this is right

The certificate yields the upper bound on the minimax nonnegative constant-stepsize value for gradient descent.
Combined with the one-dimensional quadratic and Huber lower-bound examples, this establishes the exact minimax value for every N >= 3.
The exact value is rho_N, the unique solution to the equation rho_N to the power 2N times (2N rho_N plus 2N plus 1) equals 1.
The construction works uniformly for all horizons of three steps or more.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the construction generalizes, similar certificates might certify exact bounds for other optimization methods or stepsize rules.
The tail-square convolution technique could apply to proving positivity in other systems of inequalities arising in performance estimation.
With the certificate in hand, one could compute the exact rate numerically for specific large N by solving for rho_N rather than the full problem.
The existence for all N suggests the bound is tight without needing case-by-case verification beyond the given proof structure.

Load-bearing premise

The reduced residual system has a positive zero inside the simplex for the specific rho_N defined by the given polynomial equation.

What would settle it

An explicit computation for some N >= 3 where the tail-square convolution yields a negative cumulative margin, violating positivity of one of the vectors a, b, c, or d.

read the original abstract

We prove, for every horizon N >= 3, the existence of the strengthened low-rank performance-estimation certificate proposed by Grimmer, Shu, and Wang for the Drori-Teboulle minimax nonnegative constant-stepsize problem for gradient descent. Let rho_N in (0,1) be the unique solution of rho_N^{2N}(2N rho_N+2N+1)=1. We show that the GSW certificate equations admit positive vectors a, b, c, d satisfying all residual equations. The proof proceeds through a reduced residual system in the variables d, a simplex existence argument for a positive reduced zero, a terminal residual completion identity, and a tail-square convolution argument proving the cumulative margins that force positivity of the certificate coefficients. Consequently, the GSW low-rank PEP certificate exists for every N >= 3 and yields the Drori-Teboulle upper bound. Together with the one-dimensional quadratic and Huber lower-bound examples, this establishes the Drori-Teboulle minimax nonnegative constant-stepsize value for gradient descent for every N >= 3.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Zhang closes the gap by proving the GSW certificate exists for every N >= 3, which pins down the Drori-Teboulle minimax bound for constant-stepsize gradient descent.

read the letter

This paper proves that the strengthened low-rank GSW certificate exists for every finite horizon N at least 3. That finishes the upper bound side of the Drori-Teboulle characterization for nonnegative constant-stepsize gradient descent, once the one-dimensional quadratic and Huber lower bounds are already in place. The result is specific and directly useful inside the performance estimation literature for first-order methods. The proof is constructive. It reduces the full set of residual equations to a smaller system in the vector d, invokes a simplex-existence argument to guarantee a positive reduced zero, applies a terminal residual completion identity, and then uses a tail-square convolution to show that the cumulative margins stay positive and force a, b, c, d all positive. Rho_N is defined explicitly as the unique root in (0,1) of the equation rho_N to the 2N times (2N rho_N plus 2N plus 1) equals 1, chosen so the identities close without fitting. The steps avoid obvious circularity and build from the residual equations plus independent analytic arguments. The load-bearing pieces are the simplex argument and the convolution for the margins. If those hold for arbitrary N, the claim follows. The write-up presents them as general, and the stress-test description shows no internal inconsistency or missing case. A referee would still want explicit verification for a few small N to confirm the reductions do not hide edge behavior. This work is for readers already tracking the Drori-Teboulle line and the recent GSW certificates. It supplies the missing general-N existence result without introducing new free parameters or invented entities. I would send it to peer review; the claim is narrow enough to be checked and the proof chain is laid out clearly enough to be examined.

Referee Report

0 major / 2 minor

Summary. The manuscript proves that for every integer N ≥ 3 the strengthened low-rank GSW performance-estimation certificate exists for the Drori-Teboulle minimax nonnegative constant-stepsize problem for gradient descent. It defines ρ_N ∈ (0,1) as the unique solution of the scalar equation ρ_N^{2N}(2N ρ_N + 2N + 1) = 1 and shows that the GSW residual equations admit strictly positive vectors a, b, c, d by reducing the system to one in the vector d, invoking a simplex-existence argument to obtain a positive reduced zero, applying a terminal residual completion identity, and using a tail-square convolution to establish cumulative margins that force positivity of all certificate coefficients. The resulting certificate yields the Drori-Teboulle upper bound, which together with the one-dimensional quadratic and Huber lower-bound constructions determines the exact minimax value for every N ≥ 3.

Significance. If the claimed existence holds, the result is significant: it supplies an explicit, constructive, parameter-free certificate that closes the finite-horizon gap in the Drori-Teboulle problem for all N ≥ 3. The proof chain (reduced residual system, simplex argument, terminal identity, tail-square convolution) is analytic rather than numerical and directly produces the matching upper bound, thereby completing the characterization of the minimax nonnegative constant-stepsize value for gradient descent. This strengthens the performance-estimation framework by furnishing a verifiable low-rank certificate whose construction is independent of the target bound.

minor comments (2)

[tail-square convolution argument] The tail-square convolution argument is central to positivity; a brief explicit formula for the convolution together with a one-line verification for N=3 would improve readability without lengthening the proof.
[simplex existence argument] The simplex-existence step for the positive reduced zero would benefit from a short reference to the precise theorem or corollary invoked (e.g., Brouwer or a variant of the intermediate-value theorem on the simplex).

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading of the manuscript and the positive recommendation to accept. The referee's summary accurately captures the main result: the existence of the strengthened low-rank GSW certificate for every N ≥ 3, together with the explicit construction via the reduced residual system, simplex argument, terminal identity, and tail-square convolution that yields the Drori-Teboulle upper bound.

Circularity Check

0 steps flagged

Derivation is self-contained via independent analytic arguments

full rationale

The paper constructs the GSW certificate existence for every N >= 3 through an explicit chain: reduction of residual equations to a system in vector d, a simplex-existence argument for a positive reduced zero, a terminal residual completion identity, and a tail-square convolution proving cumulative margins that enforce positivity of a, b, c, d. rho_N is defined directly as the unique root of the explicit equation rho_N^{2N}(2N rho_N + 2N + 1) = 1 chosen to close the identities; the positivity proof then holds analytically for this choice without data-fitting or redefinition of the target bound. No load-bearing self-citations appear, the lower-bound examples (quadratic and Huber) are independent, and the central claim does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claim rests on the existence of a unique positive rho_N solving the given polynomial-like equation and on the certificate residual system from prior GSW work; no free parameters or new entities are introduced.

axioms (2)

domain assumption For every integer N >= 3 the equation rho_N^{2N}(2N rho_N + 2N + 1) = 1 admits a unique solution rho_N in (0,1).
Stated directly in the abstract as the definition of the bound value.
standard math Standard real-analysis results on existence of positive solutions to linear systems and positivity preservation under convolution.
Invoked implicitly by the simplex argument and tail-square convolution steps.

pith-pipeline@v0.9.0 · 5500 in / 1418 out tokens · 75309 ms · 2026-05-13T02:37:04.909515+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages

[1]

Drori and M

Y. Drori and M. Teboulle. Performance of first-order methods for smooth convex minimization: a novel approach.Mathematical Programming, 145(1–2):451–482, 2014

work page 2014
[2]

A. S. Nemirovski and D. B. Yudin.Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience, 1983

work page 1983
[3]

B. T. Polyak.Introduction to Optimization. Optimization Software, Inc., 1987

work page 1987
[4]

Y. E. Nesterov. A method of solving a convex programming problem with convergence rate O(1/k2).Soviet Mathematics Doklady, 27(2):372–376, 1983

work page 1983
[5]

Nesterov.Introductory Lectures on Convex Optimization: A Basic Course

Y. Nesterov.Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Aca- demic Publishers, 2004

work page 2004
[6]

A. B. Taylor, J. M. Hendrickx, and F. Glineur. Smooth strongly convex interpolation and exact worst-case performance of first-order methods.Mathematical Programming, 161:307–345, 2017

work page 2017
[7]

Kim and J

D. Kim and J. A. Fessler. Optimized first-order methods for smooth convex minimization. Mathematical Programming, 159(1–2):81–107, 2016

work page 2016
[8]

Lessard, B

L. Lessard, B. Recht, and A. Packard. Analysis and design of optimization algorithms via integral quadratic constraints.SIAM Journal on Optimization, 26(1):57–95, 2016

work page 2016
[9]

Grimmer, K

B. Grimmer, K. Shu, and A. L. Wang. A strengthened conjecture on the minimax optimal constant stepsize for gradient descent. arXiv:2407.11739, 2024. 33

work page arXiv 2024