$Q$-fully Quadratic Modeling and its Application in a Random Subspace Derivative-free Method

Amy Wiebe; Warren Hare; Yiwen Chen

arxiv: 2312.03169 · v1 · pith:7HBFHEKQnew · submitted 2023-12-05 · 🧮 math.OC

Q-fully Quadratic Modeling and its Application in a Random Subspace Derivative-free Method

Yiwen Chen , Warren Hare , Amy Wiebe This is my paper

Pith reviewed 2026-05-24 05:15 UTC · model grok-4.3

classification 🧮 math.OC

keywords derivative-free optimizationtrust-region methodsrandom subspacesquadratic modelsglobal convergencehigh-dimensional problems

0 comments

The pith

A random-subspace trust-region algorithm using quadratic models converges almost surely to small-gradient points for general objective functions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a way to build quadratic approximations that satisfy the same error properties as fully quadratic models but inside randomly chosen low-dimensional subspaces. These approximations are then used inside a trust-region method that never needs the full-dimensional gradient or Hessian. The authors prove that the resulting algorithm reaches a point with arbitrarily small gradient with probability one and that the expected number of steps needed is finite. They also run numerical tests on standard problems to compare quadratic and linear models inside the same subspace framework. A reader should care because the approach removes the usual restriction to linear models when applying random subspaces to high-dimensional derivative-free problems.

Core claim

The paper constructs Q-fully quadratic models in randomly selected subspaces and embeds them in a trust-region algorithm that requires no special structure on the objective. Under standard smoothness assumptions, the geometry of the sample sets and the resulting approximation errors remain controlled with high probability in the subspaces. This yields an almost-sure global convergence guarantee together with an explicit upper bound on the expected number of iterations required to drive the gradient norm below any positive threshold.

What carries the argument

The Q-fully quadratic model, a quadratic approximation whose error bounds hold uniformly in randomly chosen subspaces and thereby support trust-region steps without full-space sampling.

If this is right

The algorithm reaches points with arbitrarily small gradient almost surely.
The expected iteration count to reach a given gradient tolerance is finite and explicitly bounded.
Quadratic models can be built and analyzed in subspaces with the same tools used for linear models.
Numerical comparisons reveal cases where quadratic models improve performance over linear ones and cases where they do not.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same subspace model construction could be reused inside other derivative-free frameworks such as direct-search methods.
The iteration bound may become practically useful once the subspace dimension and sampling parameters are tuned for a given problem class.
Extending the analysis to noisy or stochastic objectives would require only modest changes to the error-bound arguments already developed.

Load-bearing premise

Random subspaces preserve the sample-set geometry and quadratic error bounds needed for the model construction to work on arbitrary smooth functions.

What would settle it

Run the algorithm on a smooth high-dimensional test function such as the Rosenbrock function and observe whether the gradient norm fails to drop below a fixed tolerance after more iterations than the derived expectation bound.

Figures

Figures reproduced from arXiv: 2312.03169 by Amy Wiebe, Warren Hare, Yiwen Chen.

**Figure 2.** Figure 2: Performance profiles with τ = 0.5 and τ = 10−3 comparing the runtime [PITH_FULL_IMAGE:figures/full_fig_p027_2.png] view at source ↗

**Figure 3.** Figure 3: Data profiles with τ = 0.5 and τ = 10−3 comparing the runtime. 4.1.2 Results based on function evaluations Figures 4 and 5 present the performance and data profiles based on function evaluations with accuracy levels τ = 0.5 and τ = 10−3 . We can see that when measured by function evaluations, for both low accuracy level τ = 0.5 and high accuracy level τ = 10−3 , solvers based on linear models perform bette… view at source ↗

**Figure 4.** Figure 4: Performance profiles with τ = 0.5 and τ = 10−3 comparing the function evaluations [PITH_FULL_IMAGE:figures/full_fig_p028_4.png] view at source ↗

**Figure 5.** Figure 5: Data profiles with τ = 0.5 and τ = 10−3 comparing the function evaluations. 4.2 Results on nonlinear least-squares problems For nonlinear least-squares problems, there exists another method to construct quadratic models. As shown in [6], we can first construct a subspace linear interpolation model for g, denoted by me g(sb), which interpolates gb(sb) = g(x 0 + Qsb) on the (p + 1) points {0} ∪ {ri : i = 1, … view at source ↗

**Figure 6.** Figure 6: Legend of figures (continued). 4.2.1 Results based on runtime Figures 7 and 8 present the performance and data profiles based on runtime with accuracy levels τ = 0.5 and τ = 10−3 . Figures 7 and 8 show that, for the low accuracy level τ = 0.5, when measured by the runtime, solvers based on square-of-linear models perform better than linear models, but worse than other quadratic models in general. However, … view at source ↗

**Figure 7.** Figure 7: Performance profiles with τ = 0.5 and τ = 10−3 comparing the runtime [PITH_FULL_IMAGE:figures/full_fig_p029_7.png] view at source ↗

**Figure 8.** Figure 8: Data profiles with τ = 0.5 and τ = 10−3 comparing the runtime. 4.2.2 Results based on function evaluations Figures 9 and 10 present the performance and data profiles based on function evaluations with accuracy levels τ = 0.5 and τ = 10−3 . When measured by function evaluations, as shown in Figures 9 and 10, for both accuracy levels τ = 0.5 and τ = 10−3 , the solvers based on square-oflinear models perform… view at source ↗

**Figure 9.** Figure 9: Performance profiles with τ = 0.5 and τ = 10−3 comparing the function evaluations [PITH_FULL_IMAGE:figures/full_fig_p030_9.png] view at source ↗

**Figure 10.** Figure 10: Data profiles with τ = 0.5 and τ = 10−3 comparing the function evaluations. 5 Conclusion In this paper, we proposed QARSTA, a random subspace derivative-free trust-region algorithm for unconstrained deterministic optimization problems. This algorithm extends the previous random subspace trust-region algorithms to a much broader scope by using quadratic approximation models instead of linear models and ge… view at source ↗

**Figure 11.** Figure 11: Performance profiles with τ = 10−1 and τ = 10−2 comparing the runtime [PITH_FULL_IMAGE:figures/full_fig_p032_11.png] view at source ↗

**Figure 12.** Figure 12: Data profiles with τ = 10−1 and τ = 10−2 comparing the runtime. A.1.2 Results based on function evaluations [PITH_FULL_IMAGE:figures/full_fig_p032_12.png] view at source ↗

**Figure 13.** Figure 13: Performance profiles with τ = 10−1 and τ = 10−2 comparing the function evaluations [PITH_FULL_IMAGE:figures/full_fig_p033_13.png] view at source ↗

**Figure 14.** Figure 14: Data profiles with τ = 10−1 and τ = 10−2 comparing the function evaluations. A.2 Results on nonlinear least-squares problems A.2.1 Results based on runtime [PITH_FULL_IMAGE:figures/full_fig_p033_14.png] view at source ↗

**Figure 15.** Figure 15: Performance profiles with τ = 10−1 and τ = 10−2 comparing the runtime [PITH_FULL_IMAGE:figures/full_fig_p034_15.png] view at source ↗

**Figure 16.** Figure 16: Data profiles with τ = 10−1 and τ = 10−2 comparing the runtime. A.2.2 Results based on function evaluations [PITH_FULL_IMAGE:figures/full_fig_p034_16.png] view at source ↗

**Figure 17.** Figure 17: Performance profiles with τ = 10−1 and τ = 10−2 comparing the function evaluations [PITH_FULL_IMAGE:figures/full_fig_p035_17.png] view at source ↗

**Figure 18.** Figure 18: Data profiles with τ = 10−1 and τ = 10−2 comparing the function evaluations. References 1. S. Alarie, C. Audet, A. E. Gheribi, M. Kokkolaras, and S. Le Digabel, Two decades of blackbox optimization applications, EURO J. Comput. Optim., 9 (2021), p. 100011. 2. M. Alzantot, Y. Sharma, S. Chakraborty, H. Zhang, C. Hsieh, and M. B. Srivastava, GenAttack: practical black-box attacks with gradient-free optimiza… view at source ↗

read the original abstract

Model-based derivative-free optimization (DFO) methods are an important class of DFO methods that are known to struggle with solving high-dimensional optimization problems. Recent research has shown that incorporating random subspaces into model-based DFO methods has the potential to improve their performance on high-dimensional problems. However, most of the current theoretical and practical results are based on linear approximation models due to the complexity of quadratic approximation models. This paper proposes a random subspace trust-region algorithm based on quadratic approximations. Unlike most of its precursors, this algorithm does not require any special form of objective function. We study the geometry of sample sets, the error bounds for approximations, and the quality of subspaces. In particular, we provide a technique to construct $Q$-fully quadratic models, which is easy to analyze and implement. We present an almost-sure global convergence result of our algorithm and give an upper bound on the expected number of iterations to find a sufficiently small gradient. We also develop numerical experiments to compare the performance of our algorithm using both linear and quadratic approximation models. The numerical results demonstrate the strengths and weaknesses of using quadratic approximations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper supplies an explicit construction for Q-fully quadratic models inside random subspaces plus almost-sure convergence for the resulting trust-region method, but the poisedness argument for random projections needs verification.

read the letter

The core advance is a technique to build Q-fully quadratic models that stays inside randomly selected subspaces and does not assume special structure on the objective. They pair it with a trust-region algorithm, prove almost-sure global convergence to a stationary point, and give a bound on the expected number of iterations needed to reach a small gradient. They also run experiments that directly compare the quadratic version against the linear-subspace baseline on the same test problems. That combination of a new model-construction step, the convergence claim, and the head-to-head numerics is what is actually new here. The construction is presented as straightforward to code and analyze, which is useful for people who already work with random-subspace DFO. The numerics are honest enough to flag both strengths and weaknesses of the quadratic choice. The main soft spot is the geometry requirement. Quadratic models need stricter uniform poisedness than linear ones, and the argument must show that a random subspace still produces a sample set whose constants do not blow up with the subspace choice. The abstract states that they study this, but the stress-test concern about the probability of good realizations decaying in high dimension is reasonable and would need to be checked in the proofs. If that step only holds in expectation or with probability that decays, the Borel-Cantelli step for almost-sure convergence could fail. The paper is aimed at the DFO community that already uses random subspaces or wants to move from linear to quadratic models in high dimensions. A reader who cares about model-based methods for black-box problems will get concrete value from the construction and the experiments. It is worth sending to referees because the claims are stated clearly, the numerics are present, and the topic is practically relevant; the referees can then settle whether the poisedness and error-bound arguments close.

Referee Report

2 major / 3 minor

Summary. The paper introduces a random-subspace trust-region algorithm for derivative-free optimization that employs quadratic (Q-fully quadratic) models rather than linear ones. It develops a construction technique for such models, analyzes the geometry of sample sets and approximation error bounds in randomly chosen subspaces, proves an almost-sure global convergence result together with an upper bound on the expected number of iterations needed to reach a sufficiently small gradient norm, and reports numerical comparisons of linear versus quadratic models on test problems. The method is stated to apply to general smooth objectives without special structure.

Significance. If the almost-sure convergence and iteration bound hold, the work would extend model-based DFO theory to quadratic approximations inside random subspaces for arbitrary objectives, addressing a recognized limitation of existing random-subspace methods that rely on linear models. The explicit construction technique for Q-fully quadratic models and the accompanying geometry/error analysis constitute concrete technical contributions that could be reused.

major comments (2)

[§4] §4 (convergence analysis) and the subspace-quality section: the almost-sure convergence argument invokes Borel–Cantelli on the event that each random subspace yields a sufficiently poised sample set for a Q-fully quadratic model. No explicit lower bound on this probability that is independent of ambient dimension n and of the iteration index is supplied; without such a uniform positive lower bound the Borel–Cantelli step does not go through for general smooth f.
[§3] Definition of Q-fully quadratic model and the error-bound statements (around Eq. (3.4)–(3.7)): the claimed O(Δ²) gradient and O(Δ) Hessian error bounds are asserted to hold with constants independent of the random subspace, yet the poisedness constants for the projected point set are allowed to depend on the particular realization of the subspace; this dependence must be controlled uniformly to justify the trust-region step analysis that follows.

minor comments (3)

Notation for the random subspace projection operator is introduced without a dedicated symbol table or consistent use across sections.
[Numerical experiments] Numerical section reports performance on a modest collection of CUTEst problems but does not include high-dimensional instances (n ≫ 100) that would directly test the claimed advantage of the random-subspace quadratic approach.
A few typographical inconsistencies appear in the statement of the trust-region subproblem (e.g., the radius update rule).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address the two major comments point by point below. Both identify gaps in the current presentation of the probability bound and uniformity of constants; we will revise the manuscript to close these gaps.

read point-by-point responses

Referee: [§4] §4 (convergence analysis) and the subspace-quality section: the almost-sure convergence argument invokes Borel–Cantelli on the event that each random subspace yields a sufficiently poised sample set for a Q-fully quadratic model. No explicit lower bound on this probability that is independent of ambient dimension n and of the iteration index is supplied; without such a uniform positive lower bound the Borel–Cantelli step does not go through for general smooth f.

Authors: We agree that an explicit lower bound must be stated. Since the problem dimension n is fixed for any given instance, a positive lower bound that depends on n but is independent of the iteration index and of f is sufficient for the Borel–Cantelli argument. In the revision we will add a lemma deriving such a bound from the random-subspace sampling procedure and the geometry conditions of Section 3, thereby completing the almost-sure convergence proof for general smooth objectives. revision: yes
Referee: [§3] Definition of Q-fully quadratic model and the error-bound statements (around Eq. (3.4)–(3.7)): the claimed O(Δ²) gradient and O(Δ) Hessian error bounds are asserted to hold with constants independent of the random subspace, yet the poisedness constants for the projected point set are allowed to depend on the particular realization of the subspace; this dependence must be controlled uniformly to justify the trust-region step analysis that follows.

Authors: The current statements permit the poisedness constant to depend on the realized subspace, which could make the error constants non-uniform. We will revise Section 3 to establish a uniform upper bound on the poisedness constant that holds for all subspaces satisfying the quality condition used later in the analysis. With this uniform control the O(Δ²) gradient and O(Δ) Hessian bounds will hold with constants independent of the particular subspace realization, supporting the trust-region arguments in Section 4. revision: yes

Circularity Check

0 steps flagged

No circularity: convergence rests on independent analysis of random-subspace geometry and model error bounds

full rationale

The paper claims to provide an explicit technique for constructing Q-fully quadratic models in random subspaces, then derives almost-sure global convergence and an expected-iteration bound from the resulting geometry and approximation-error controls. No quoted step equates a derived quantity to a fitted parameter by construction, nor does any load-bearing premise reduce to a self-citation whose content is itself unverified. The derivation chain is therefore self-contained against external benchmarks (poisedness constants, error bounds) rather than tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed from abstract only; no explicit free parameters, invented entities, or non-standard axioms are stated. Standard smoothness assumptions for quadratic approximation are implied by the DFO context.

axioms (1)

domain assumption Objective function is sufficiently smooth for local quadratic approximations to exist with controllable error.
Required for any model-based DFO method; implied by the paper's focus on error bounds and quadratic models.

pith-pipeline@v0.9.0 · 5728 in / 1181 out tokens · 45709 ms · 2026-05-24T05:15:18.754693+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We provide a technique to construct Q-fully quadratic models... almost-sure global convergence result... upper bound on the expected number of iterations
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

σ_min(Dk) maximized... αD-well aligned... probability at least 1−δS

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages · 1 internal anchor

[1]

Alarie, C

S. Alarie, C. Audet, A. E. Gheribi, M. Kokkolaras, and S. Le Digabel , Two decades of blackbox optimization applications, EURO J. Comput. Optim., 9 (2021), p. 100011

work page 2021
[2]

Alzantot, Y

M. Alzantot, Y. Sharma, S. Chakraborty, H. Zhang, C. Hsieh, and M. B. Srivastava , GenAttack: practical black-box attacks with gradient-free optimization , in Proceedings of the Genetic and Evolutionary Computation Conference, 2019, pp. 1111–1119

work page 2019
[3]

Audet, J

C. Audet, J. E. Dennis Jr, and S. Le Digabel , Parallel space decomposition of the mesh adaptive direct search algorithm, SIAM J. Optim., 19 (2008), pp. 1150–1170

work page 2008
[4]

Audet and W

C. Audet and W. Hare , Derivative-free and blackbox optimization , Springer, Cham, 2017

work page 2017
[5]

, Model-based methods in derivative-free nonsmooth optimization , Springer, Cham, 2020, pp. 655–691

work page 2020
[6]

Cartis and L

C. Cartis and L. Roberts , Scalable subspace methods for derivative-free nonlinear least-squares optimiza- tion, Math. Program., 199 (2023), pp. 461–524

work page 2023
[7]

P. Chen, H. Zhang, Y. Sharma, J. Yi, and C. Hsieh, ZOO: zeroth order optimization based black-box attacks to deep neural networks without training substitute models , in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 2017, pp. 15–26

work page 2017
[8]

Chen and W

Y. Chen and W. Hare , Adapting the centred simplex gradient to compensate for misaligned sample points , IMA J. Numer. Anal., (2023). 36 Yiwen Chen et al

work page 2023
[9]

Y. Chen, W. Hare, and G. Jarry-Bolduc , Error analysis of surrogate models constructed through opera- tions on submodels , Math. Oper. Res., (2022)

work page 2022
[10]

A. Conn, K. Scheinberg, and L. N. Vicente, Geometry of interpolation sets in derivative free optimization , Math. Program., 111 (2008), pp. 141–172

work page 2008
[11]

, Geometry of sample sets in derivative-free optimization: polynomial regression and underdetermined interpolation, IMA J. Numer. Anal., 28 (2008), pp. 721–748

work page 2008
[12]

, Introduction to derivative-free optimization , SIAM, 2009

work page 2009
[13]

A. Conn, P. L. Toint, A. Sartenaer, and N. Gould , On iterated-subspace minimization methods for nonlinear optimization, tech. rep., Rutherford Appleton Laboratory, 1994

work page 1994
[14]

J. E. Dennis Jr. and R. B. Schnabel , Numerical methods for unconstrained optimization and nonlinear equations, SIAM, 1996

work page 1996
[15]

K. J. Dzahini and S. M. Wild, Stochastic trust-region algorithm in random subspaces with convergence and expected complexity analyses , (2022). arXiv:2207.06452

work page arXiv 2022
[16]

Feurer and F

M. Feurer and F. Hutter , Hyperparameter optimization, Springer, Cham, 2019, pp. 3–33

work page 2019
[17]

Fukushima, Parallel variable transformation in unconstrained optimization , SIAM J

M. Fukushima, Parallel variable transformation in unconstrained optimization , SIAM J. Optim., 8 (1998), pp. 658–672

work page 1998
[18]

Black-Box Optimization in Machine Learning with Trust Region Based Derivative Free Algorithm

H. Ghanbari and K. Scheinberg, Black-box optimization in machine learning with trust region based deriva- tive free algorithm , (2017). arXiv:1703.06925

work page internal anchor Pith review Pith/arXiv arXiv 2017
[19]

Gould, D

N. Gould, D. Orban, and P. L. Toint, CUTEst: a constrained and unconstrained testing environment with safe threads for mathematical optimization , Comput. Optim. Appl., 60 (2015), pp. 545–557

work page 2015
[20]

G. N. Grapiglia, J. Yuan, and Y. Yuan , A subspace version of the Powell–Yuan trust-region algorithm for equality constrained optimization , J. Oper. Res. Soc. China, 1 (2013), pp. 425–451

work page 2013
[21]

Gratton, C

S. Gratton, C. W. Royer, L. N. Vicente, and Z. Zhang , Direct search based on probabilistic descent , SIAM J. Optim., 25 (2015), pp. 1515–1541

work page 2015
[22]

Hare and G

W. Hare and G. Jarry-Bolduc, Calculus identities for generalized simplex gradients: rules and applications , SIAM J. Optim., 30 (2020), pp. 853–884

work page 2020
[23]

W. Hare, G. Jarry-Bolduc, and C. Planiden , A matrix algebra approach to approximate hessians , IMA J. Numer. Anal., (2023)

work page 2023
[24]

W. Hare, L. Roberts, and C. W. Royer , Expected decrease for derivative-free algorithms using random subspaces, (2023). arXiv:2308.04734

work page arXiv 2023
[25]

R. A. Horn and C. R. Johnson , Topics in matrix analysis , Cambridge University Press, 1991

work page 1991
[26]

Larson, M

J. Larson, M. Menickelly, and S. M. Wild, Derivative-free optimization methods, Acta Numer., 28 (2019), pp. 287–404

work page 2019
[27]

Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou , A survey of convolutional neural networks: analysis, applications, and prospects , IEEE Trans. Neural Netw. Learn. Syst., (2021)

work page 2021
[28]

Liuzzi, S

G. Liuzzi, S. Lucidi, F. Rinaldi, and L. N. Vicente, Trust-region methods for the derivative-free optimiza- tion of nonsmooth black-box functions , SIAM J. Optim., 29 (2019), pp. 3012–3035

work page 2019
[29]

J. J. Mor ´e and S. M. Wild , Benchmarking derivative-free optimization algorithms , SIAM J. Optim., 20 (2009), pp. 172–191

work page 2009
[30]

Penrose, A generalized inverse for matrices , in Mathematical Proceedings of the Cambridge Philosophical Society, vol

R. Penrose, A generalized inverse for matrices , in Mathematical Proceedings of the Cambridge Philosophical Society, vol. 51, Cambridge University Press, 1955, pp. 406–413

work page 1955
[31]

M. J. D. Powell, The NEWUOA software for unconstrained optimization without derivatives , Springer US, Boston, MA, 2006, pp. 255–297

work page 2006
[32]

, Developments of NEWUOA for minimization without derivatives , IMA J. Numer. Anal., 28 (2008), pp. 649–664

work page 2008
[33]

rep., University of Cambridge, 2009

, The BOBYQA algorithm for bound constrained optimization without derivatives , tech. rep., University of Cambridge, 2009

work page 2009
[34]

Roberts and C

L. Roberts and C. W. Royer , Direct search based on probabilistic descent in reduced spaces , SIAM J. Optim., 33 (2023), pp. 3057–3082

work page 2023
[35]

Zhang, On derivative-free optimization methods (in Chinese) , PhD thesis, Chinese Academy of Sciences,

Z. Zhang, On derivative-free optimization methods (in Chinese) , PhD thesis, Chinese Academy of Sciences,

work page
[36]

https://www.zhangzk.net/docs/publications/thesis.pdf

work page

[1] [1]

Alarie, C

S. Alarie, C. Audet, A. E. Gheribi, M. Kokkolaras, and S. Le Digabel , Two decades of blackbox optimization applications, EURO J. Comput. Optim., 9 (2021), p. 100011

work page 2021

[2] [2]

Alzantot, Y

M. Alzantot, Y. Sharma, S. Chakraborty, H. Zhang, C. Hsieh, and M. B. Srivastava , GenAttack: practical black-box attacks with gradient-free optimization , in Proceedings of the Genetic and Evolutionary Computation Conference, 2019, pp. 1111–1119

work page 2019

[3] [3]

Audet, J

C. Audet, J. E. Dennis Jr, and S. Le Digabel , Parallel space decomposition of the mesh adaptive direct search algorithm, SIAM J. Optim., 19 (2008), pp. 1150–1170

work page 2008

[4] [4]

Audet and W

C. Audet and W. Hare , Derivative-free and blackbox optimization , Springer, Cham, 2017

work page 2017

[5] [5]

, Model-based methods in derivative-free nonsmooth optimization , Springer, Cham, 2020, pp. 655–691

work page 2020

[6] [6]

Cartis and L

C. Cartis and L. Roberts , Scalable subspace methods for derivative-free nonlinear least-squares optimiza- tion, Math. Program., 199 (2023), pp. 461–524

work page 2023

[7] [7]

P. Chen, H. Zhang, Y. Sharma, J. Yi, and C. Hsieh, ZOO: zeroth order optimization based black-box attacks to deep neural networks without training substitute models , in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 2017, pp. 15–26

work page 2017

[8] [8]

Chen and W

Y. Chen and W. Hare , Adapting the centred simplex gradient to compensate for misaligned sample points , IMA J. Numer. Anal., (2023). 36 Yiwen Chen et al

work page 2023

[9] [9]

Y. Chen, W. Hare, and G. Jarry-Bolduc , Error analysis of surrogate models constructed through opera- tions on submodels , Math. Oper. Res., (2022)

work page 2022

[10] [10]

A. Conn, K. Scheinberg, and L. N. Vicente, Geometry of interpolation sets in derivative free optimization , Math. Program., 111 (2008), pp. 141–172

work page 2008

[11] [11]

, Geometry of sample sets in derivative-free optimization: polynomial regression and underdetermined interpolation, IMA J. Numer. Anal., 28 (2008), pp. 721–748

work page 2008

[12] [12]

, Introduction to derivative-free optimization , SIAM, 2009

work page 2009

[13] [13]

A. Conn, P. L. Toint, A. Sartenaer, and N. Gould , On iterated-subspace minimization methods for nonlinear optimization, tech. rep., Rutherford Appleton Laboratory, 1994

work page 1994

[14] [14]

J. E. Dennis Jr. and R. B. Schnabel , Numerical methods for unconstrained optimization and nonlinear equations, SIAM, 1996

work page 1996

[15] [15]

K. J. Dzahini and S. M. Wild, Stochastic trust-region algorithm in random subspaces with convergence and expected complexity analyses , (2022). arXiv:2207.06452

work page arXiv 2022

[16] [16]

Feurer and F

M. Feurer and F. Hutter , Hyperparameter optimization, Springer, Cham, 2019, pp. 3–33

work page 2019

[17] [17]

Fukushima, Parallel variable transformation in unconstrained optimization , SIAM J

M. Fukushima, Parallel variable transformation in unconstrained optimization , SIAM J. Optim., 8 (1998), pp. 658–672

work page 1998

[18] [18]

Black-Box Optimization in Machine Learning with Trust Region Based Derivative Free Algorithm

H. Ghanbari and K. Scheinberg, Black-box optimization in machine learning with trust region based deriva- tive free algorithm , (2017). arXiv:1703.06925

work page internal anchor Pith review Pith/arXiv arXiv 2017

[19] [19]

Gould, D

N. Gould, D. Orban, and P. L. Toint, CUTEst: a constrained and unconstrained testing environment with safe threads for mathematical optimization , Comput. Optim. Appl., 60 (2015), pp. 545–557

work page 2015

[20] [20]

G. N. Grapiglia, J. Yuan, and Y. Yuan , A subspace version of the Powell–Yuan trust-region algorithm for equality constrained optimization , J. Oper. Res. Soc. China, 1 (2013), pp. 425–451

work page 2013

[21] [21]

Gratton, C

S. Gratton, C. W. Royer, L. N. Vicente, and Z. Zhang , Direct search based on probabilistic descent , SIAM J. Optim., 25 (2015), pp. 1515–1541

work page 2015

[22] [22]

Hare and G

W. Hare and G. Jarry-Bolduc, Calculus identities for generalized simplex gradients: rules and applications , SIAM J. Optim., 30 (2020), pp. 853–884

work page 2020

[23] [23]

W. Hare, G. Jarry-Bolduc, and C. Planiden , A matrix algebra approach to approximate hessians , IMA J. Numer. Anal., (2023)

work page 2023

[24] [24]

W. Hare, L. Roberts, and C. W. Royer , Expected decrease for derivative-free algorithms using random subspaces, (2023). arXiv:2308.04734

work page arXiv 2023

[25] [25]

R. A. Horn and C. R. Johnson , Topics in matrix analysis , Cambridge University Press, 1991

work page 1991

[26] [26]

Larson, M

J. Larson, M. Menickelly, and S. M. Wild, Derivative-free optimization methods, Acta Numer., 28 (2019), pp. 287–404

work page 2019

[27] [27]

Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou , A survey of convolutional neural networks: analysis, applications, and prospects , IEEE Trans. Neural Netw. Learn. Syst., (2021)

work page 2021

[28] [28]

Liuzzi, S

G. Liuzzi, S. Lucidi, F. Rinaldi, and L. N. Vicente, Trust-region methods for the derivative-free optimiza- tion of nonsmooth black-box functions , SIAM J. Optim., 29 (2019), pp. 3012–3035

work page 2019

[29] [29]

J. J. Mor ´e and S. M. Wild , Benchmarking derivative-free optimization algorithms , SIAM J. Optim., 20 (2009), pp. 172–191

work page 2009

[30] [30]

Penrose, A generalized inverse for matrices , in Mathematical Proceedings of the Cambridge Philosophical Society, vol

R. Penrose, A generalized inverse for matrices , in Mathematical Proceedings of the Cambridge Philosophical Society, vol. 51, Cambridge University Press, 1955, pp. 406–413

work page 1955

[31] [31]

M. J. D. Powell, The NEWUOA software for unconstrained optimization without derivatives , Springer US, Boston, MA, 2006, pp. 255–297

work page 2006

[32] [32]

, Developments of NEWUOA for minimization without derivatives , IMA J. Numer. Anal., 28 (2008), pp. 649–664

work page 2008

[33] [33]

rep., University of Cambridge, 2009

, The BOBYQA algorithm for bound constrained optimization without derivatives , tech. rep., University of Cambridge, 2009

work page 2009

[34] [34]

Roberts and C

L. Roberts and C. W. Royer , Direct search based on probabilistic descent in reduced spaces , SIAM J. Optim., 33 (2023), pp. 3057–3082

work page 2023

[35] [35]

Zhang, On derivative-free optimization methods (in Chinese) , PhD thesis, Chinese Academy of Sciences,

Z. Zhang, On derivative-free optimization methods (in Chinese) , PhD thesis, Chinese Academy of Sciences,

work page

[36] [36]

https://www.zhangzk.net/docs/publications/thesis.pdf

work page