The Method of Ellipcenters for Strongly Convex Functions

Yunier Bello-Cruz

arxiv: 2604.21132 · v2 · submitted 2026-04-22 · 🧮 math.OC

The Method of Ellipcenters for Strongly Convex Functions

Yunier Bello-Cruz This is my paper

Pith reviewed 2026-05-12 01:08 UTC · model grok-4.3

classification 🧮 math.OC

keywords Method of Ellipcentersstrongly convex optimizationLipschitz gradientconvergence rategradient descentexact line searchlevel-set symmetry

0 comments

The pith

The Method of Ellipcenters matches the convergence rate of gradient descent with exact line search for strongly convex functions with Lipschitz gradients.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends the Method of Ellipcenters from quadratic problems to the wider class of strongly convex functions whose gradients are Lipschitz continuous. It establishes that the two-gradient-per-iteration procedure, which evaluates the gradient at the current point and at a second point lying on the same level set, produces exactly the same contraction factor per step as gradient descent with an exact line search. When the two gradient vectors are linearly independent, a simple midpoint construction that uses the symmetry of the level set yields an additional strict decrease in the objective value, and this improvement becomes uniform across iterations provided the angle between the gradients remains bounded away from zero. In two dimensions the method reaches the exact minimizer after at most two steps. The claims are illustrated by numerical runs on regularized logistic regression.

Core claim

The Method of Ellipcenters evaluates the gradient once at the current iterate and once at a companion point chosen on the same level set. For any strongly convex function with Lipschitz gradient, this pair of directions produces a reduction in the objective that is at least as large as the reduction obtained by gradient descent with an optimally chosen step length. When the two gradients are linearly independent, the midpoint of the two points lies strictly inside the level set and therefore gives a further decrease; if the angle between the gradients is bounded away from zero, the extra decrease is uniform. In two dimensions the procedure terminates after at most two iterations.

What carries the argument

Method of Ellipcenters (ME), which performs two gradient evaluations per iteration at points sharing the same level-set value and uses their linear independence together with level-set symmetry to obtain a midpoint improvement.

If this is right

ME supplies an implementable surrogate for exact line search that never requires a one-dimensional minimization.
In two-dimensional problems the method reaches the solution after a fixed number of steps independent of starting point.
Whenever the angle between successive gradient pairs stays bounded away from zero, the extra midpoint reduction accumulates into a strictly better global rate.
The same two-gradient construction applies directly to regularized logistic regression and other smooth strongly convex models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The level-set midpoint argument could be reused inside other first-order methods that already compute two gradients per step.
The uniform-angle condition might be replaced by a weaker average-angle condition that still guarantees a positive fraction of the extra decrease.
In high dimensions the probability that two random gradients are nearly collinear may be small, suggesting that the midpoint improvement occurs most of the time in practice.

Load-bearing premise

The objective functions are strongly convex and have Lipschitz continuous gradients.

What would settle it

A concrete strongly convex function with Lipschitz gradient on which the Method of Ellipcenters either fails to match the per-step contraction of exact-line-search gradient descent or requires more than two iterations in two dimensions.

Figures

Figures reproduced from arXiv: 2604.21132 by Yunier Bello-Cruz.

**Figure 1.** Figure 1: Convergence on (15) with n = 1500, m = 750, and κ = 100. ME uses fewer gradient evaluations than GD-exact, consistent with the improved rate of Theorem 4.4. Fast-GD converges non-monotonically due to the Nesterov momentum and requires knowledge of L. 8 Concluding Remarks We have extended ME from the quadratic setting of [2] to L-smooth µ-strongly convex functions. The companion point y k exists and is uniq… view at source ↗

read the original abstract

The Method of Ellipcenters (ME), introduced in~\cite{ME2025} for strongly convex quadratic minimization, uses two gradient evaluations per iteration: one at the current iterate and one at a companion point on the same level set. We extend ME to the broader class of strongly convex functions with Lipschitz continuous gradient, and prove that ME matches the convergence rate of gradient descent with exact line search on this class. When the two gradient directions are linearly independent, a midpoint argument exploiting the level-set symmetry yields a further per-step improvement, which is global when the angle between the two gradients is uniformly bounded away from zero. ME also converges in at most two steps in dimension two. Numerical experiments on regularized logistic regression confirm the theoretical predictions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This extends the ellipcenter method from quadratics to strongly convex functions with Lipschitz gradients and shows it matches the linear rate of exact-line-search gradient descent, plus a conditional midpoint gain and two-step termination in 2D.

read the letter

The paper takes the ellipcenter method, previously limited to quadratics, and carries it over to the standard class of strongly convex functions with Lipschitz gradients. It proves the convergence rate matches gradient descent with exact line search, which follows from the usual linear-rate analysis on that class. When the two gradients are linearly independent, a midpoint step that exploits level-set symmetry gives an extra per-iteration reduction; this is global only if the angle between them stays bounded away from zero. In dimension two the method terminates in at most two steps. The logistic-regression experiments line up with the predicted rates.

Referee Report

0 major / 2 minor

Summary. The manuscript extends the Method of Ellipcenters (ME), previously introduced for quadratic objectives, to the class of strongly convex functions with Lipschitz continuous gradients. It proves that ME achieves the same linear convergence rate as gradient descent with exact line search. When the two gradient directions are linearly independent, a midpoint argument based on level-set symmetry yields an additional per-step improvement that is global under a uniform lower bound on the angle between gradients. The method is also shown to terminate in at most two steps in dimension two. These theoretical claims are illustrated by numerical experiments on regularized logistic regression.

Significance. If the proofs are correct, the work supplies a first-order method whose convergence rate matches a standard benchmark while incorporating a symmetry-based refinement that can accelerate progress when gradients are independent. The finite-termination result in two dimensions and the global improvement under an angle condition are attractive features. The numerical validation on logistic regression provides concrete support for the analysis. The extension from the quadratic case in the cited ME2025 work is a direct and useful generalization.

minor comments (2)

Abstract: the claim that ME 'matches the convergence rate' would be clearer if it briefly recalled the explicit rate (in terms of the condition number) or cited the standard result for exact-line-search gradient descent on this function class.
The manuscript should include a short remark on how the Lipschitz-gradient assumption is used in the level-set symmetry argument, to make the midpoint improvement transparent to readers familiar with standard analyses.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading and the positive assessment of our manuscript. We appreciate the recommendation for minor revision. The referee summary and significance section accurately describe the contributions, including the extension of the Method of Ellipcenters, the convergence rate matching gradient descent with exact line search, the additional improvement when gradients are independent, the finite termination in 2D, and the numerical experiments. As no specific major comments were raised, we do not have any points requiring response or revision at this stage. Should the referee have any minor suggestions or clarifications, we are happy to incorporate them in a revised version.

Circularity Check

0 steps flagged

Minor self-citation to prior work on quadratics; central extension and rate proof remain independently derived

full rationale

The paper cites ME2025 solely for the original definition of the ellipcenter method on quadratics. The extension to strongly convex functions with Lipschitz gradients, the proof that ME matches the linear convergence rate of exact-line-search gradient descent, the midpoint improvement when gradients are linearly independent, and the two-step termination in dimension two are all obtained directly from the level-set symmetry and standard properties of the function class under the paper's maintained assumptions. No self-definitional reduction, fitted input renamed as prediction, or load-bearing self-citation chain appears in the derivation; the central claims retain independent content.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The analysis rests on the standard assumptions of strong convexity and Lipschitz continuity of the gradient, which are common in the field and not invented for this paper.

axioms (2)

domain assumption The objective function is strongly convex
Invoked to define the function class and enable convergence analysis.
domain assumption The gradient is Lipschitz continuous
Required to bound the convergence rate and match it to gradient descent with line search.

pith-pipeline@v0.9.0 · 5412 in / 1417 out tokens · 81080 ms · 2026-05-12T01:08:01.448177+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

midpoint argument exploiting the level-set symmetry yields a further per-step improvement... f(y_k)=f(x_k)
IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean J_uniquely_calibrated_via_higher_derivative unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Baillon–Haddad descent... orthogonality of ∇f(x_{k+1}) to the step

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

5 extracted references · 5 canonical work pages

[1]

and Borwein, J

Barzilai, J. and Borwein, J. M. Two-point step size gradient methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988

work page 1988
[2]

F., and Guigues, V

Behling, R., Corrˆ ea Aquines, R., Zanatta, E. F., and Guigues, V. Introducing the method of ellipcenters, a new first-order technique for unconstrained optimization.Preprint, arXiv, 2025

work page 2025
[3]

A method of solving a convex programming problem with convergence rate O(1/k2).Soviet Mathematics Doklady, 27(2):372–376, 1983

Nesterov, Y. A method of solving a convex programming problem with convergence rate O(1/k2).Soviet Mathematics Doklady, 27(2):372–376, 1983

work page 1983
[4]

Kluwer Aca- demic Publishers, Dordrecht, 2004

Nesterov, Y.Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Aca- demic Publishers, Dordrecht, 2004

work page 2004
[5]

Polyak, B. T. Gradient methods for the minimisation of functionals.USSR Computational Mathematics and Mathematical Physics, 3(4):864–878, 1963. 11

work page 1963

[1] [1]

and Borwein, J

Barzilai, J. and Borwein, J. M. Two-point step size gradient methods.IMA Journal of Numerical Analysis, 8(1):141–148, 1988

work page 1988

[2] [2]

F., and Guigues, V

Behling, R., Corrˆ ea Aquines, R., Zanatta, E. F., and Guigues, V. Introducing the method of ellipcenters, a new first-order technique for unconstrained optimization.Preprint, arXiv, 2025

work page 2025

[3] [3]

A method of solving a convex programming problem with convergence rate O(1/k2).Soviet Mathematics Doklady, 27(2):372–376, 1983

Nesterov, Y. A method of solving a convex programming problem with convergence rate O(1/k2).Soviet Mathematics Doklady, 27(2):372–376, 1983

work page 1983

[4] [4]

Kluwer Aca- demic Publishers, Dordrecht, 2004

Nesterov, Y.Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Aca- demic Publishers, Dordrecht, 2004

work page 2004

[5] [5]

Polyak, B. T. Gradient methods for the minimisation of functionals.USSR Computational Mathematics and Mathematical Physics, 3(4):864–878, 1963. 11

work page 1963