pith. machine review for the scientific record.

arxiv: 2605.12819 · v1 · submitted 2026-05-12 · 🧮 math.OC · cs.NA · math.NA

Recognition: unknown

Accuracy and Relationships of Quadratic Models in Derivative-free Optimization

Authors on Pith: no claims yet

Pith reviewed 2026-05-14 19:38 UTC · model grok-4.3

classification 🧮 math.OC · cs.NA · math.NA
keywords derivative-free optimization · quadratic models · error bounds · model accuracy · Hessian approximation · sample set geometry · trust-region methods
0 comments

The pith

Three quadratic models in derivative-free optimization all satisfy fully linear error bounds without assuming bounded model Hessians.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies the minimum norm, minimum Frobenius norm, and quadratic generalized simplex derivative models commonly used to build local approximations when gradients are unavailable. It proves each model delivers fully linear error bounds on the function value and gradient, dropping the uniform bound on the model Hessian that earlier analyses needed for the minimum norm case and supplying the first such bounds for the quadratic generalized simplex model. The work further shows that, when the sample set obeys a mild geometric condition, all three models achieve fully quadratic accuracy along the sample directions through directional error bounds on the Hessian. This establishes relationships among the models and identifies when they coincide.
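Of the three, the minimum Frobenius norm model is the most concrete to sketch. The snippet below illustrates the classical KKT (Powell-style saddle-point) construction; the function `mfn_model` and its notation are illustrative, not the paper's own, and it assumes the sample set is poised enough for the saddle-point system to be nonsingular.

```python
import numpy as np

def mfn_model(Y, fvals):
    """Minimum Frobenius norm quadratic model m(s) = c + g's + s'Hs/2
    interpolating fvals at the sample points Y (one point per row,
    relative to the current iterate). Classical KKT construction:
    H = sum_i lam_i * y_i y_i', with multipliers lam solving a
    saddle-point system. Assumes the system is nonsingular."""
    p, n = Y.shape
    A = 0.5 * (Y @ Y.T) ** 2                 # A_ij = (y_i . y_j)^2 / 2
    X = np.hstack([np.ones((p, 1)), Y])      # constant and linear basis
    K = np.block([[A, X], [X.T, np.zeros((n + 1, n + 1))]])
    rhs = np.concatenate([fvals, np.zeros(n + 1)])
    sol = np.linalg.solve(K, rhs)
    lam, c, g = sol[:p], sol[p], sol[p + 1:]
    H = (Y.T * lam) @ Y                      # H = sum_i lam_i y_i y_i'
    return c, g, H

def model_value(c, g, H, s):
    return c + g @ s + 0.5 * s @ H @ s
```

Interpolation holds by construction whenever the saddle-point matrix is nonsingular; the bounds the paper proves concern how far such a model can drift from f away from the sample points.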

Core claim

We establish fully linear error bounds for the minimum norm, minimum Frobenius norm, and quadratic generalized simplex derivative models, removing the uniformly bounded model Hessian assumption required in existing minimum norm analyses and deriving the first such results for the quadratic generalized simplex model. We further analyze Hessian approximation accuracy via directional error bounds, showing that all three models achieve fully quadratic accuracy along sample directions under a mild condition on the sample set. This reveals a form of directional fully quadratic accuracy not captured by existing theory. Finally, we characterize the relationships among these models, identifying conditions under which they coincide and clarifying their structural connections.

What carries the argument

The minimum norm, minimum Frobenius norm, and quadratic generalized simplex derivative quadratic models together with the fully linear error bounds and directional fully quadratic accuracy they satisfy under standard twice-differentiability and a mild sample-set geometry condition.

If this is right

  • Derivative-free algorithms using any of the three models inherit the convergence guarantees that fully linear models provide, under standard assumptions on the objective.
  • The directional quadratic accuracy result implies that Hessian information is recovered accurately in the span of the samples even when global quadratic accuracy fails.
  • The identified coincidence conditions allow practitioners to replace one model with another without changing the theoretical properties.
  • Removing the bounded-Hessian requirement extends the class of problems for which existing convergence theory applies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The directional accuracy may explain why some models perform well on problems whose effective dimension is lower than the ambient space.
  • Similar directional bounds could be derived for models built from noisy function values.
  • The coincidence conditions suggest a hierarchy among the models that could guide automatic model selection inside an optimizer.

Load-bearing premise

The objective function is twice continuously differentiable and the sample set satisfies the stated mild geometric condition.

What would settle it

A twice continuously differentiable function and a sample set meeting the mild geometric condition for which one model's gradient error decays more slowly than linearly in the trust-region radius, contradicting the fully linear bounds.
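Such a counterexample would show up in a simple scaling probe. The sketch below uses a plain forward-difference simplex gradient as a stand-in for the paper's models (the setup and names are illustrative); a fully linear bound predicts the gradient error decays at least linearly in the sample radius, so halving the radius should roughly halve the error.

```python
import numpy as np

def simplex_gradient(f, x, radius):
    """Forward-difference (coordinate-direction) simplex gradient with
    sample radius `radius` -- a stand-in for the paper's models, used
    only to illustrate the linear decay a fully linear bound predicts."""
    n = len(x)
    return np.array([(f(x + radius * e) - f(x)) / radius
                     for e in np.eye(n)])

def gradient_error(f, grad_f, x, radius):
    """Norm of the gap between the model gradient and the true gradient."""
    return np.linalg.norm(simplex_gradient(f, x, radius) - grad_f(x))
```

For a smooth f, plotting `gradient_error` against the radius on a log-log scale gives slope roughly 1; a model violating the fully linear bound would flatten out or steepen the wrong way as the radius shrinks.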

read the original abstract

We study three quadratic models in model-based derivative-free optimization: the minimum norm (MN), minimum Frobenius norm (MFN), and quadratic generalized simplex derivative (QS) models. Despite their widespread use, their approximation accuracy and relationships have not been systematically explored. We establish fully linear error bounds for all three models, removing the uniformly bounded model Hessian assumption required in existing MN analyses and deriving the first such results for the QS model. We further analyze Hessian approximation accuracy via directional error bounds, showing that all three models achieve fully quadratic accuracy along sample directions under a mild condition on the sample set. This reveals a form of directional fully quadratic accuracy not captured by existing theory. Finally, we characterize the relationships among these models, identifying conditions under which they coincide and clarifying their structural connections.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes three quadratic models in model-based derivative-free optimization: the minimum-norm (MN), minimum-Frobenius-norm (MFN), and quadratic generalized simplex (QS) models. It derives fully linear error bounds for all three without assuming a uniformly bounded model Hessian (removing this assumption from prior MN analyses and providing the first such bounds for QS), establishes directional fully quadratic accuracy along sample directions under a mild sample-set geometry condition, and characterizes the algebraic relationships and coincidence conditions among the three models.

Significance. If the stated bounds hold, the work strengthens the theoretical basis for these widely used models by relaxing a restrictive assumption present in existing MN analyses and supplying the first fully linear results for QS. The directional fully quadratic accuracy result is a useful refinement that highlights accuracy along interpolation directions even when global quadratic accuracy is not achieved. The model-relationship section clarifies structural connections that can inform practical model selection in DFO algorithms.

major comments (2)
  1. [§3.1, Theorem 3.3] MN fully linear bound: the derivation removes the uniform Hessian bound by using only the sample-set geometry constant; however, the resulting error constant appears to grow with the inverse of the smallest singular value of the interpolation matrix, and it is not shown whether this growth remains controlled under the same mild condition used later for directional accuracy.
  2. [§4.2, Theorem 4.1] Directional quadratic accuracy: the mild geometric condition is stated only in terms of the sample directions; a quantitative comparison to the standard well-poisedness constant (e.g., the constant appearing in Eq. (2.7) of the cited DFO literature) is needed to confirm that the condition is indeed strictly weaker than the one required for global quadratic accuracy.
minor comments (2)
  1. [Introduction] The abstract claims 'the first such results for the QS model,' but the introduction should explicitly cite the most recent MN analysis that imposed the bounded-Hessian assumption so readers can verify the novelty claim.
  2. [§3] Notation for the interpolation matrix and its poisedness measure is introduced in §2 but used without re-statement in the proofs of §3; adding a short reminder of the definition before each major theorem would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and positive recommendation for minor revision. We address the major comments below and will incorporate the suggested clarifications in the revised manuscript.

read point-by-point responses
  1. Referee: [§3.1, Theorem 3.3] MN fully linear bound: the derivation removes the uniform Hessian bound by using only the sample-set geometry constant; however, the resulting error constant appears to grow with the inverse of the smallest singular value of the interpolation matrix, and it is not shown whether this growth remains controlled under the same mild condition used later for directional accuracy.

    Authors: The geometry constant used in the bound of Theorem 3.3 is defined to be the reciprocal of the smallest singular value of the (scaled) interpolation matrix, as per the standard definition in the DFO literature. The mild condition in Section 4 ensures that this constant is bounded by a quantity depending only on the dimension and the parameter δ in the condition, independent of the particular sample set as long as it satisfies the directional spanning requirement. Therefore, the error constant remains controlled. We will add a sentence in the discussion following Theorem 3.3 to make this explicit. revision: partial
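For concreteness, the geometry constant the authors describe can be sketched as the reciprocal of the smallest singular value of the radius-scaled direction matrix (linear part only; the function name and exact scaling are illustrative, not the paper's definition):

```python
import numpy as np

def geometry_constant(Y, radius):
    """Reciprocal smallest singular value of the radius-scaled direction
    matrix -- a standard DFO-style poisedness measure of the kind the
    rebuttal refers to. Sketch of the linear part only; the paper's own
    constant may be scaled differently."""
    return 1.0 / np.linalg.svd(Y / radius, compute_uv=False).min()
```

A well-spread set such as the coordinate directions gives a constant near 1; nearly collinear directions blow it up, which is exactly the growth the referee asks to see controlled.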

  2. Referee: [§4.2, Theorem 4.1] Directional quadratic accuracy: the mild geometric condition is stated only in terms of the sample directions; a quantitative comparison to the standard well-poisedness constant (e.g., the constant appearing in Eq. (2.7) of the cited DFO literature) is needed to confirm that the condition is indeed strictly weaker than the one required for global quadratic accuracy.

    Authors: We agree that providing a quantitative comparison would be helpful. The standard well-poisedness constant requires the interpolation set to be poised for quadratic interpolation, which involves bounding the Lebesgue constant or singular values for the full quadratic basis. Our condition only requires that the directions are such that their convex hull contains a ball, which ensures the linear terms are well-conditioned but allows for poorer conditioning in quadratic terms. This is strictly weaker, as one can construct sets satisfying our condition but with arbitrarily large well-poisedness constant. We will add this comparison, including a brief proof sketch, to the revised version of Section 4.2. revision: yes
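The rebuttal's condition, that the convex hull of the sample directions contains a ball about the origin, can be probed numerically: the hull contains the ball of radius r exactly when the support function max_i <u, d_i> is at least r for every unit vector u. A heuristic sampling check (the function name, sample count, and seed are illustrative):

```python
import numpy as np

def hull_contains_ball(D, r, trials=2000, seed=0):
    """Heuristic check that conv(D) (rows of D are the directions)
    contains the ball of radius r about the origin: for each sampled
    unit vector u, the support value max_i <u, d_i> must be >= r.
    Sampling can only refute containment approximately."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((trials, D.shape[1]))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    return bool((U @ D.T).max(axis=1).min() >= r)
```

For the four coordinate directions ±e_1, ±e_2 in the plane, the hull is the cross-polytope with inradius 1/√2 ≈ 0.707, so the check passes for r = 0.5 and fails for r = 0.9.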

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper establishes fully linear error bounds for MN, MFN, and QS quadratic models and directional fully quadratic accuracy using standard twice-differentiability of the objective plus a mild geometric condition on the sample set (controlling the interpolation matrix). These rest on external smoothness assumptions and well-poisedness-style geometry constants from existing DFO theory, without reducing any central claim to a fitted parameter, self-definition, or load-bearing self-citation chain. Model relationships are characterized by direct algebraic comparison under stated conditions. No step equates a prediction to its own input by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The analysis rests on standard twice-continuous differentiability of the objective and geometric regularity of the sample set; no free parameters or new entities are introduced.

axioms (2)
  • domain assumption The objective function is twice continuously differentiable
    Required to define quadratic models and derive linear error bounds.
  • domain assumption The sample set satisfies a mild geometric condition
    Invoked for the directional fully quadratic accuracy result.

pith-pipeline@v0.9.0 · 5431 in / 1265 out tokens · 44826 ms · 2026-05-14T19:38:26.298908+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

24 extracted references · 1 canonical work page · 1 internal anchor
