ShapeBench: A Scalable Benchmark and Diagnostic Suite for Standardized Evaluation in Aerodynamic Shape Optimization

Jack Guo; Krissh Chawla; Madeleine Udell; Matthias Ihme; Shaghayegh Fazliani; Yiren Shen

arxiv: 2605.20763 · v1 · pith:6XHSE5AQnew · submitted 2026-05-20 · 💻 cs.LG

Shaghayegh Fazliani, Krissh Chawla, Jack Guo, Yiren Shen, Matthias Ihme, Madeleine Udell

Pith reviewed 2026-05-21 06:29 UTC · model grok-4.3

classification 💻 cs.LG

keywords: aerodynamic shape optimization · benchmark · optimizer evaluation · surrogate models · evolutionary algorithms · LLM-driven optimization · shape categories · fidelity gap

The pith

A new benchmark for aerodynamic shape optimization reveals that optimizer performance rankings change dramatically across different shape categories and problem types.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ShapeBench to address the lack of standardized evaluation in aerodynamic shape optimization. It provides a unified set of 103 tasks across eight shape categories with validated surrogates for efficient testing and options for high-fidelity checks. By including consistent baselines and a new LLM-based method, the benchmark shows that optimizer effectiveness does not transfer well between tasks. This matters because relying on results from single problems can lead to misleading conclusions about which methods work best in real applications. Researchers can now compare approaches more reliably and identify where general solutions are still needed.

Core claim

ShapeBench is an open-source benchmark with a unified API for 103 aerodynamic shape optimization tasks spanning eight shape categories and multiple optimization regimes. Each task comes with a validated surrogate model for fast search and, where feasible, a high-fidelity CFD pipeline for final verification. Under a consistent budget metric, comparisons of classical optimizers and LLM-driven methods, including the new ShapeEvolve baseline, show substantial variance in rankings, with a mean pairwise Spearman correlation of only 0.013. Single-task results therefore do not generalize reliably across problem classes, and classical methods are rarely applicable across all shape categories and tasks.
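To make the headline statistic concrete, here is a minimal sketch of a mean pairwise Spearman ρ over per-task optimizer rankings. This page does not say which pairs enter the mean, so the sketch assumes the pairs are tasks, each described by one final objective value per optimizer. The function names and the toy scores are invented for illustration and are not part of ShapeBench.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Rank values ascending (1 = smallest), averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one ranking is constant")
    return cov / (var_x * var_y) ** 0.5


def mean_pairwise_spearman(scores_by_task):
    """Average rho over all unordered task pairs.

    scores_by_task maps a task name to a list of final objective values,
    one per optimizer, in the same optimizer order for every task.
    """
    pairs = combinations(sorted(scores_by_task), 2)
    return mean(spearman_rho(scores_by_task[a], scores_by_task[b]) for a, b in pairs)


# Toy data (invented): 4 optimizers evaluated on 3 tasks.
toy_scores = {
    "task_a": [0.10, 0.20, 0.30, 0.40],
    "task_b": [0.40, 0.30, 0.20, 0.10],  # exact reversal of task_a
    "task_c": [0.10, 0.20, 0.30, 0.40],  # same ordering as task_a
}
toy_mean_rho = mean_pairwise_spearman(toy_scores)
```

A mean near zero, as reported for ShapeBench, can arise either from rankings that are unrelated across tasks or from a mix of strongly agreeing and strongly disagreeing task pairs, as in the toy data; the distribution of the pairwise values distinguishes the two cases.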

What carries the argument

ShapeBench, a scalable benchmark suite that standardizes tasks, provides surrogates for search, and uses a matched-budget protocol to enable fair comparisons across optimizers and shape classes.
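One way to picture a matched-budget protocol is a wrapper that counts objective evaluations and stops every optimizer at the same cap. The sketch below is an assumption about how such a protocol could be enforced, not ShapeBench's API: `BudgetedObjective`, `run_with_budget`, the two toy optimizers, and the quadratic test function are all hypothetical, and the budget is assumed to be a raw count of evaluations.

```python
import random


class BudgetExceeded(Exception):
    """Raised when an optimizer asks for more evaluations than allowed."""


class BudgetedObjective:
    """Wrap an objective so that every optimizer gets the same evaluation cap."""

    def __init__(self, objective, max_evals):
        self.objective = objective
        self.max_evals = max_evals
        self.evals = 0
        self.best = float("inf")

    def __call__(self, x):
        if self.evals >= self.max_evals:
            raise BudgetExceeded
        self.evals += 1
        value = self.objective(x)
        self.best = min(self.best, value)  # best-so-far, assuming minimization
        return value


def random_search(f, dim, rng):
    """Sample uniformly in [-1, 1]^dim until the budget runs out."""
    while True:
        f([rng.uniform(-1.0, 1.0) for _ in range(dim)])


def coordinate_search(f, dim, rng, step=0.5):
    """Greedy coordinate moves from the origin, halving the step when stuck."""
    x = [0.0] * dim
    fx = f(x)
    while True:
        improved = False
        for i in range(dim):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2


def run_with_budget(optimizer, objective, dim, max_evals, seed=0):
    """Run one optimizer under a fixed budget; return (best value, evals used)."""
    f = BudgetedObjective(objective, max_evals)
    try:
        optimizer(f, dim, random.Random(seed))
    except BudgetExceeded:
        pass
    return f.best, f.evals


def shifted_sphere(x):
    """Toy stand-in for a surrogate objective (minimum 0 at x_i = 0.3)."""
    return sum((xi - 0.3) ** 2 for xi in x)


results = {
    name: run_with_budget(opt, shifted_sphere, dim=3, max_evals=200)
    for name, opt in [("random", random_search), ("coordinate", coordinate_search)]
}
```

Because the cap lives in the wrapper rather than in each optimizer, every method is charged for every call it makes, including the initial point and any rejected trial steps.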

Load-bearing premise

The surrogates used in the benchmark are accurate enough representations of the true optimization landscapes to make search results meaningful.

What would settle it

If a broad set of optimizers shows consistently similar ranking orders when evaluated across all eight shape categories using the same budget, that would contradict the observed variance.

Figures

Figures reproduced from arXiv: 2605.20763 by Jack Guo, Krissh Chawla, Madeleine Udell, Matthias Ihme, Shaghayegh Fazliani, Yiren Shen.

**Figure 2.** ShapeBench generates design- and run-level visuals (e.g., geometry/field plots and opti…

**Figure 3.** Overview of the ShapeEvolve pipeline. … and benchmarked against all baselines. Figure 4a shows that evaluation across diverse shape categories and task setups can surface optimizer behaviors that are not visible in single-task studies; that is, performance is strongly task-dependent. For instance, Bayesian optimization and PSO perform best on the CERAS and delta-wing tasks, but rank among the weakest methods on …

**Figure 4.** (a) Final median objective values for each task per optimizer. Colors are column-wise …

**Figure 5.** Figure 5 shows a representative CCA experiment and illustrates ShapeBench's cross-optimizer visualization tools. Panel 5(b) also compares the best designs from each optimizer against the reference geometry produced by nTop [31]. Optimizers behave differently on this task than on many others from Figure 4a: ShapeEvolve converges to the best L/D, significantly outperforming classical methods. The source of this distin…

**Figure 6.** (a) Convergence plot (objective vs. evaluations) for the 2D airfoil multi-point drag …

**Figure 7.** (a) Best designs for the baseline and for each method (2D side views and 3D isometric …

**Figure 8.** Median normalized rank trajectory of each optimizer over relative evaluation budget across …

**Figure 9.** Schematic of the proposed VortexNet graph neural network (GNN) architecture. Colors indicate different network blocks. A black arrow represents direct message passing between blocks, and a blue arrow denotes a skip connection to the receiving block. The figure also presents snapshots of the graph at each computational step, showing nodes, edges, and their associated feature arrays. Figure is from …

**Figure 10.** The HF surface pressure coefficient obtained from CFD for a wing with …

**Figure 11.** Overall architecture of the surrogate model. The network consists of two main components: …

**Figure 12.** Illustration of the blended wing body (BWB) planform parameterization, showing key …

**Figure 13.** Three-view diagram of a typical kink wing. Figure is from [ …

**Figure 14.** Correlation of predicted aerodynamic coefficients CL and CD against VLM data for …

**Figure 15.** CCA: Collaborative Combat Aircraft

| Parameter | Lower Bound | Upper Bound |
| --- | --- | --- |
| Dihedral angle (deg) | 0.25 | 15 |
| Max Wing Blend (mm) | 25 | 1000 |
| Inlet Angle 1 (deg) | 0 | 45 |
| Inlet Angle 2 (deg) | 0 | 10 |
| Wing Position | 0.22 | 0.51 |
| Rear Point (mm) | (4500, 0, 0) | (7500, 0, 0) |
| … | … | … |
| Inlet Location | 0.2 | 0.6 |
| NACA 4-digit code | {1412, 0012, 2408, 4412} | |
| Fore Top Angle (deg) | 0 | 10 |
| Aft Top Angle (deg) | 12 | 32.5 |
| Top Height Aft (mm) | 36 | 220 |
| B… | | |

**Figure 16.** Grid convergence

**Figure 17.** Sampled CCA designs representative of geometries in …

**Figure 18.** COCOANet CCA surrogate data characteristics

**Figure 19.** Correlation of predicted aerodynamic coefficients CL and CD against CFD data

**Figure 20.** Single-point lift-to-drag optimization results

**Figure 21.** Best shape per method for lift-to-drag delta wing task

**Figure 22.** Convergence trajectories for optimizers for two-point VortexNet task

**Figure 23.** Best design overlay of delta wing design for two-point objective

**Figure 24.** Objectives vs. evaluations plot for 3D BWB design multi-point tasks. As given in …

**Figure 25.** Best designs, 2D planform and 3D isometric views, for the min-…

**Figure 26.** Cross-objective scatter plot for the min-…

**Figure 27.** Objectives vs. evaluations plot for max-…

**Figure 28.** Best designs, 2D planform and 3D isometric views, max-…

**Figure 29.** Best designs, 2D planform overlay comparisons across both objectives. Left-hand side …

**Figure 30.** 3D BWB experiment with different initializations; (a) reward vs. evaluations & (b) BWB …

**Figure 31.** Three-stage optimization pipeline for the 2D airfoil single-point maximum lift-to-drag task.

**Figure 32.** Stage 1 convergence plot (objective vs. evaluations) of the four optimization methods for …

**Figure 33.** Overlay of best IPOPT-refined airfoil profiles (with …

**Figure 34.** Convergence plot (objective vs. evaluations) for the 2D airfoil multi-point drag …

**Figure 35.** Airfoil design profiles (y/c vs. x/c) for best-performing designs from all five methods, for the 2D airfoil multi-point drag minimization task. [Plot: legend entries Adjoint (IPOPT), ShapeEvolve, PSO (120p×500i), L-BFGS-B, Bayes. Opt. (exact GP); y-axis "Weighted CD (lower is better)"; panel title "Best-design XFOIL vs NeuralFoil evaluation, multipoint CL targets (CL ∈ {0.8, 1.0, 1.2, …"]

**Figure 36.** NeuralFoil and XFOIL evaluations of best design for each method, for the 2D airfoil multi-point drag minimization task.

• thin airfoil height, relative to that of the 2D airfoil single-point maximum lift-to-drag task seen in Figure 33
• moderate camber (maximum camber value of y/c ≃ 0.12 near x/c ≃ 0.35)

The Bayesian optimization design is over-cambered and thicker overall, with a correspondingly larger C…

**Figure 37.** Single-objective SuperWing case: minimize CD with CL constraint

**Figure 38.** SuperWing best design overlay per method for single-objective drag minimization. Operating points: $M_0 \in \{0.75, 0.80, 0.86, 0.90\}$. Objective: $\min_x f(x) = \min_x \frac{1}{K} \sum_{k \in K} \Bigl( -M_k \, \frac{C_L(x; \alpha_0^{(k)}, M_k)}{C_D(x; \alpha_0^{(k)}, M_k)} \, \lambda \cdots$

**Figure 39.** Optimizer trajectories for multi-point range maximization problem for …

**Figure 40.** Best design per method for SuperWing multi-point range maximization problem. Objective: $\min_x f(x) = \min_x \sum_j w_j \Bigl( -M_0 \, \frac{C_L(x; \alpha_0^{(j)}, M_0)}{C_D(x; \alpha_0^{(j)}, M_0)} \, \lambda \cdots$

**Figure 41.** ShapeBench SuperWing problem 30: optimization methods and results

**Figure 42.** ShapeBench SuperWing problem 30: best design overlay

**Figure 43.** Enter Caption. D.5 3D Collaborative Combat Aircraft (CCA) Design. Design variables: We use the following 3D single-duct drone parametrization from nTop [31].

**Figure 44.** CCA evaluation for optimization of lift-to-drag ratio. (a) optimization method performance …

**Figure 45.** Convergence plot (objective vs. evaluations) for minimized …

**Figure 46.** Best designs for the baseline and for each method (2D side views and 3D isometric views)

**Figure 47.** CERAS fuelmass objective optimization results

**Figure 48.** Best designs per method overlaid for CERAS fuelmass case

Original abstract

Rapid progress in aerodynamic shape optimization (ASO) has outpaced currently-available standardized evaluation frameworks. Fair comparison requires a unified benchmark spanning diverse shape classes, objective formulations, and matched-budget state-of-the-art baselines. We introduce ShapeBench, an open-source ASO benchmark with a unified API spanning 103 tasks across eight shape categories and multiple optimization regimes. Each ShapeBench task includes a validated surrogate for fast search; when feasible, a high-fidelity Computational Fluid Dynamics (CFD) pipeline for final verification is available, enabling systematic fidelity-gap analysis. ShapeBench provides a reproducible protocol with well-configured baselines to compare fairly using a consistent budget metric, allowing for comparison among both classical and LLM-driven methods, including general-purpose optimizers and a new domain-specialized evolutionary LLM baseline, ShapeEvolve. Results on ShapeBench demonstrate substantial variance in optimizer rankings across shape categories and problem formulations, with mean pairwise Spearman $\rho = 0.013$, so single-task conclusions do not reliably generalize across problem classes. The benchmark is also far from saturation; classical methods are rarely applicable across all shape categories and tasks, further highlighting the need for more general-purpose approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript introduces ShapeBench, an open-source benchmark for aerodynamic shape optimization (ASO) with a unified API spanning 103 tasks across eight shape categories and multiple optimization regimes. Each task supplies a validated surrogate for fast search and, when feasible, a high-fidelity CFD pipeline for verification and fidelity-gap analysis. The benchmark supplies reproducible protocols and matched-budget baselines that include both classical optimizers and LLM-driven methods, notably a new domain-specialized evolutionary LLM baseline (ShapeEvolve). The central empirical result is substantial variance in optimizer rankings across shape categories and problem formulations, quantified by a mean pairwise Spearman ρ of 0.013, from which the authors conclude that single-task conclusions do not reliably generalize.

Significance. If the surrogates are demonstrated to preserve relative optimizer rankings that would be obtained under high-fidelity CFD, ShapeBench would provide a much-needed standardized, scalable evaluation framework for ASO that enables fair comparison of classical and emerging LLM-based approaches. The public release, consistent budget metric, and explicit support for fidelity-gap studies are clear strengths. The reported low cross-task correlation usefully challenges the common practice of drawing broad conclusions from single-task experiments. The overall significance remains moderate until quantitative surrogate validation is supplied.

major comments (1)

[Abstract] Abstract (paragraph on task structure and baselines): The central claim that observed ranking variance (mean pairwise Spearman ρ = 0.013) demonstrates failure of single-task conclusions to generalize rests on the assumption that surrogate error does not distort relative optimizer performance. The abstract states that surrogates are “validated” and that a CFD pipeline enables “systematic fidelity-gap analysis,” yet no quantitative validation statistics—RMSE, rank correlation with CFD on optimized designs, or per-category fidelity-gap tables—are reported. Without these, it remains possible that non-uniform approximation error across shape categories inflates the reported variance.

minor comments (1)

[Abstract] Abstract: The phrase “mean pairwise Spearman ρ = 0.013” would benefit from an explicit statement of the exact set of optimizer pairs and tasks over which the mean is taken, and whether the value is computed on final performance or on the entire optimization trajectory.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive feedback. The concern about surrogate validation and its potential effect on the reported ranking variance is a substantive point that merits direct response. We address it below and outline planned revisions.

Point-by-point responses

Referee: The central claim that observed ranking variance (mean pairwise Spearman ρ = 0.013) demonstrates failure of single-task conclusions to generalize rests on the assumption that surrogate error does not distort relative optimizer performance. The abstract states that surrogates are “validated” and that a CFD pipeline enables “systematic fidelity-gap analysis,” yet no quantitative validation statistics—RMSE, rank correlation with CFD on optimized designs, or per-category fidelity-gap tables—are reported. Without these, it remains possible that non-uniform approximation error across shape categories inflates the reported variance.

Authors: We agree that the interpretation of the low mean pairwise Spearman ρ = 0.013 as evidence against generalizing from single tasks implicitly assumes that surrogate approximation error does not systematically alter relative optimizer rankings across categories. Each surrogate is trained on CFD-generated data and the benchmark supplies a high-fidelity CFD pipeline for verification on feasible tasks, but the current manuscript does not report explicit quantitative metrics (RMSE on held-out CFD points, rank correlation of final designs, or per-category fidelity-gap tables) that would directly test preservation of optimizer orderings. In the revised manuscript we will add a dedicated validation subsection containing these statistics on a representative subset of tasks, together with an updated abstract that references the new results. This addition will allow readers to evaluate the possible contribution of non-uniform surrogate error to the observed variance. revision: yes
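The statistics named in this exchange (RMSE against CFD, rank agreement on final designs, and a per-category fidelity-gap table) can be sketched as follows. This is an illustration under assumptions rather than the authors' planned analysis: rank agreement is measured here with Kendall's tau-a, which the page does not specify, and the record layout, category names, and numbers are invented.

```python
from itertools import combinations
from math import sqrt


def rmse(predicted, reference):
    """Root-mean-square error between surrogate and CFD values."""
    total = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return sqrt(total / len(reference))


def kendall_tau(a, b):
    """Kendall tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


def fidelity_gap_table(records):
    """Summarize surrogate-vs-CFD agreement per shape category.

    records maps a category name to a list of (surrogate_value, cfd_value)
    pairs, one pair per optimizer's final design.
    """
    table = {}
    for category, pairs in records.items():
        surrogate = [s for s, _ in pairs]
        cfd = [c for _, c in pairs]
        table[category] = {
            "rmse": rmse(surrogate, cfd),
            "tau": kendall_tau(surrogate, cfd),
        }
    return table


# Invented numbers: three optimizers' final designs in two categories.
toy_records = {
    "category_a": [(0.020, 0.021), (0.025, 0.026), (0.030, 0.031)],  # order kept
    "category_b": [(0.020, 0.030), (0.025, 0.024), (0.030, 0.027)],  # order scrambled
}
toy_table = fidelity_gap_table(toy_records)
```

In the toy data, the first category has a small uniform offset that leaves the optimizer ordering intact, while the second reorders the optimizers under CFD; only the second kind of error would distort a cross-task rank correlation, which is the referee's concern.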

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces ShapeBench as an open benchmark with 103 tasks and reports an empirical finding of low mean pairwise Spearman ρ = 0.013 in optimizer rankings across categories. This statistic is obtained by running the listed baselines on the defined tasks and surrogates; it is a direct measurement rather than a quantity derived from prior inputs by construction, fitted parameter, or self-citation chain. No equations, uniqueness theorems, or ansatzes are invoked that reduce the central claim to the benchmark definition itself. The result is therefore self-contained as an observation on a publicly released resource.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the assumption that the 103 tasks and surrogates form a representative and validated testbed for comparing optimizers across regimes.

axioms (1)

domain assumption: Surrogates are validated and sufficiently accurate for optimization search; high-fidelity CFD is available for verification when feasible. (Stated in the abstract description of each task.)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: Results on ShapeBench demonstrate substantial variance in optimizer rankings across shape categories and problem formulations, with mean pairwise Spearman ρ = 0.013

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · unclear

Paper passage: Each ShapeBench task includes a validated surrogate for fast search; when feasible, a high-fidelity Computational Fluid Dynamics (CFD) pipeline

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

59 extracted references · 59 canonical work pages · 5 internal anchors

[1] Nikhil Abhyankar, Sanchit Kabra, Saaketh Desai, and Chandan K Reddy. Accelerating materials design via LLM-guided evolutionary search. arXiv preprint arXiv:2510.22503, 2025.

[2] George R Anderson, Marian Nemec, and Michael J Aftosmis. Aerodynamic shape optimization benchmarks with error control and automatic parameterization. In 53rd AIAA Aerospace Sciences Meeting, page 1719, 2015.

[3] Joel Andersson, Joris Gillis, Greg Horn, Jim Rawlings, and Moritz Diehl. CasADi: a software framework for nonlinear optimization and optimal control. Mathematical Programming Computation, 11(1):1–36, 2018.

[4] Neil Ashton, Charles Mockett, Marian Fuchs, Louis Fliessbach, Hendrik Hetmann, Thilo Knacke, Norbert Schonwald, Vangelis Skaperdas, Grigoris Fotiadis, Astrid Walle, et al. DrivAerML: High-fidelity computational fluid dynamics dataset for road-car external aerodynamics. arXiv preprint arXiv:2408.11969, 2024.

[5] Maximilian Balandat, Brian Karrer, Daniel Jiang, Samuel Daulton, Ben Letham, Andrew G Wilson, and Eytan Bakshy. BoTorch: A framework for efficient Monte-Carlo Bayesian optimization. Advances in Neural Information Processing Systems, 33:21524–21538, 2020.

[6] Jennifer Dacles-Mariani, Gregory G. Zilliac, Jim S. Chow, and Peter Bradshaw. Numerical/experimental study of a wingtip vortex in the near field. AIAA Journal, 33(9):1561–1568, 1995.

[7] Christophe David, Scott Delbecq, Sebastien Defoort, Peter Schmollgruber, Emmanuel Benard, and Valerie Pommier-Budinger. From FAST to FAST-OAD: An open source framework for rapid overall aircraft design. In IOP Conference Series: Materials Science and Engineering, volume 1024, page 012062. IOP Publishing, 2021.

[8] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.

[9] Mark Drela. Pros & cons of airfoil optimization. In Frontiers of Computational Fluid Dynamics 1998, pages 363–381. World Scientific, 1998.

[10] Mohamed Elrefaie, Florin Morar, Angela Dai, and Faez Ahmed. DrivAerNet++: A large-scale multimodal car dataset with computational fluid dynamics simulations and deep learning benchmarks, 2025.

[11] Alexander Forrester, Andras Sobester, and Andy Keane. Engineering design via surrogate modelling: a practical guide. John Wiley & Sons, 2008.

[12] Peter I Frazier. A tutorial on Bayesian optimization. arXiv preprint arXiv:1807.02811, 2018.

[13] KC Giannakoglou. Design of optimal aerodynamic shapes using stochastic optimization methods and computational intelligence. Progress in Aerospace Sciences, 38(1):43–76, 2002.

[14] Michael B Giles and Niles A Pierce. An introduction to the adjoint approach to design. Flow, Turbulence and Combustion, 65(3):393–415, 2000.

[15] Nikolaus Hansen, Youhei Akimoto, and Petr Baudis. CMA-ES/pycma on GitHub, 2019.

[16] Nikolaus Hansen, Anne Auger, Raymond Ros, Olaf Mersmann, Tea Tušar, and Dimo Brockhoff. COCO: A platform for comparing continuous optimizers in a black-box setting. Optimization Methods and Software, 36(1):114–144, 2021.

[17] Nikolaus Hansen and Andreas Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195, 2001.

[18] Robert Hooke and Terry A Jeeves. "Direct search" solution of numerical and statistical problems. Journal of the ACM (JACM), 8(2):212–229, 1961.

[19] Antony Jameson. Aerodynamic design via control theory. Journal of Scientific Computing, 3(3):233–260, 1988.

[20] James Kennedy and Russell Eberhart. Particle swarm optimization. In Proceedings of ICNN'95 - International Conference on Neural Networks, volume 4, pages 1942–1948. IEEE, 1995.

[21] Gaetan K Kenway and Joaquim RRA Martins. Aerodynamic shape optimization of the CRM configuration including buffet-onset conditions. In 54th AIAA Aerospace Sciences Meeting, page 1294, 2016.

[22] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[23] Brenda M Kulfan. Universal parametric geometry representation method. Journal of Aircraft, 45(1):142–158, 2008.

[24] Robert Tjarko Lange, Yuki Imajuku, and Edoardo Cetin. ShinkaEvolve: Towards open-ended and sample-efficient program evolution, 2025.

[25] Stephen T LeDoux, John C Vassberg, David P Young, Spencer Fugal, Dmitry Kamenetskiy, William P Huffman, Robin G Melvin, and Matthew F Smith. Study based on the AIAA aerodynamic design optimization discussion group test cases. AIAA Journal, 53(7):1910–1935, 2015.

[26] Aitor Lewkowycz, Anders Andreassen, David Dohan, Ethan Dyer, Henryk Michalewski, Vinay Ramasesh, Ambrose Slone, Cem Anil, Imanol Schlag, Theo Gutman-Solo, Yuhuai Wu, Behnam Neyshabur, Guy Gur-Ari, and Vedant Misra. Solving quantitative reasoning problems with language models, 2022.

[27] Jian Liu, Jianyu Wu, Hairun Xie, Guoqing Zhang, Jing Wang, Wei Liu, Wanli Ouyang, Junjun Jiang, Xianming Liu, Shixiang Tang, et al. AFBench: A large-scale benchmark for airfoil design. Advances in Neural Information Processing Systems, 37:82757–82780, 2024.

[28] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.

[29] Joaquim R.R.A. Martins and Andrew B. Lambe. Multidisciplinary design optimization: A survey of architectures. AIAA Journal, 51:2049–2075, 2013.

[30] MeshInspector. MeshLib: Mesh Processing Library. https://github.com/MeshInspector/MeshLib, 2025. Version 3.0.9.196, accessed 2026-04-30.

[31]

ntop (release 4.1), 2025

nTop Inc. ntop (release 4.1), 2025. Computational design software

work page 2025
[32]

Improving reproducibility in machine learning research (a report from the neurips 2019 reproducibility program), 2020

Joelle Pineau, Philippe Vincent-Lamarre, Koustuv Sinha, Vincent Larivière, Alina Beygelzimer, Florence d’Alché Buc, Emily Fox, and Hugo Larochelle. Improving reproducibility in machine learning research (a report from the neurips 2019 reproducibility program), 2020

work page 2019
[33]

Drivaerstar: An industrial-grade cfd dataset for vehicle aerodynamic optimization.arXiv preprint arXiv:2510.16857, 2025

Jiyan Qiu, Lyulin Kuang, Guan Wang, Yichen Xu, Leiyao Cui, Shaotong Fu, Yixin Zhu, and Ruihua Zhang. Drivaerstar: An industrial-grade cfd dataset for vehicle aerodynamic optimization.arXiv preprint arXiv:2510.16857, 2025

work page arXiv 2025
[34]

Surrogate-based analysis and optimization.Progress in aerospace sciences, 41(1):1–28, 2005

Nestor V Queipo, Raphael T Haftka, Wei Shyy, Tushar Goel, Rajkumar Vaidyanathan, and P Kevin Tucker. Surrogate-based analysis and optimization.Progress in aerospace sciences, 41(1):1–28, 2005. 11

work page 2005
[35]

Mathematical discoveries from program search with large language models

Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M Pawan Kumar, Emilien Dupont, Francisco JR Ruiz, Jordan S Ellenberg, Pengming Wang, Omar Fawzi, et al. Mathematical discoveries from program search with large language models. Nature, 625(7995):468–475, 2024

[36] Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864, 2017.

[37] Marco Saporito, Andrea Da Ronch, Nathalie Bartoli, and Sébastien Defoort. Robust multidisciplinary analysis and optimization for conceptual design of flexible aircraft under dynamic aeroelastic constraints. Aerospace Science and Technology, 138:108349, 2023.

[38] Paul Saves, Nathalie Bartoli, Youssef Diouane, Thierry Lefebvre, Joseph Morlier, Christophe David, Eric Nguyen Van, and Sébastien Defoort. Bayesian optimization for mixed variables using an adaptive dimension reduction process: applications to aircraft design. In AIAA SciTech 2022 Forum, page 0082, 2022.

[39] Paul Saves, Nathalie Bartoli, Youssef Diouane, Thierry Lefebvre, Joseph Morlier, Christophe David, Eric Nguyen Van, and Sébastien Defoort. Multidisciplinary design optimization with mixed categorical variables for aircraft design. In AIAA SCITECH 2022 Forum. American Institute of Aeronautics and Astronautics, January 2022.

[40] Asankhaya Sharma. Openevolve: Open-source implementation of alphaevolve, 2025. Accessed: 2026-05-07.

[41] Peter Sharpe and R. John Hansman. Neuralfoil: An airfoil aerodynamics analysis tool using physics-informed machine learning, 2025.

[42] Peter D. Sharpe. Accelerating Practical Engineering Design Optimization with Computational Graph Transformations. PhD thesis, Massachusetts Institute of Technology, 2024.

[43] Peter D Sharpe and R John Hansman. Aerosandbox: A differentiable framework for aircraft design optimization. PhD thesis, 2021.

[44] Yiren Shen and Juan Alonso. Graph neural network-guided aerodynamic shape optimization for conceptual design of supersonic transport wings. In AIAA AVIATION FORUM AND ASCEND 2025, page 3228, 2025.

[45] Yiren Shen, Jacob T Needels, and Juan J Alonso. Vortexnet: A graph neural network-based multi-fidelity surrogate model for field predictions. In AIAA SciTech 2025 Forum, page 0494, 2025.

[46] Nicholas Sung, Steven Spreizer, Mohamed Elrefaie, Matthew C. Jones, and Faez Ahmed. Blendednet++: A large-scale blended wing body aerodynamics dataset and benchmark, 2025.

[47] Nicholas Sung, Steven Spreizer, Mohamed Elrefaie, Kaira Samuel, Matthew C. Jones, and Faez Ahmed. Blendednet: A blended wing body aircraft dataset and surrogate model for aerodynamic predictions. In Volume 3B: 51st Design Automation Conference (DAC), IDETC-CIE2025. American Society of Mechanical Engineers, August 2025.

[48] Tsinghua University Machine Learning Group (THUML). Neural-solver-library: A library for advanced neural pde solvers. https://github.com/thuml/Neural-Solver-Library, 2025. Last accessed: 2025-09-29.

[49] Andreas Wächter and Lorenz T Biegler. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Mathematical programming, 106(1):25–57, 2006.

[50] David H Wolpert and William G Macready. No free lunch theorems for optimization. IEEE transactions on evolutionary computation, 1(1):67–82, 2002.

[51] Haixu Wu, Huakun Luo, Haowen Wang, Jianmin Wang, and Mingsheng Long. Transolver: A fast transformer solver for pdes on general geometries. In International Conference on Machine Learning, 2024.

[52] Yunjia Yang, Weishao Tang, Mengxin Liu, Nils Thuerey, Yufei Zhang, and Haixin Chen. Superwing: a comprehensive transonic wing dataset for data-driven aerodynamic design, 2025.

[53] Xinxin Zhang, Zhuoqun Xu, Guangpu Zhu, Chien Ming Jonathan Tay, Yongdong Cui, Boo Cheong Khoo, and Lailai Zhu. Using large language models for parametric shape optimization, 2024.

[54] MengChu Zhou, Meiji Cui, Dian Xu, Shuwei Zhu, Ziyan Zhao, and Abdullah Abusorrah. Evolutionary optimization methods for high-dimensional expensive problems: A survey. IEEE/CAA Journal of Automatica Sinica, 11(5):1092–1105, 2024.

[55] Xiyuan Zhou, Xinlei Wang, Yirui He, Yang Wu, Ruixi Zou, Yuheng Cheng, Yulu Xie, Wenxuan Liu, Huan Zhao, Yan Xu, Jinjin Gu, and Junhua Zhao. Engibench: A benchmark for evaluating large language models on engineering problem solving, 2025.

[56] Ciyou Zhu, Richard H Byrd, Peihuang Lu, and Jorge Nocedal. Algorithm 778: L-bfgs-b: Fortran subroutines for large-scale bound-constrained optimization. ACM Transactions on mathematical software (TOMS), 23(4):550–560, 1997.

[57] are the inclusion of vehicle features such as engine bays, cooling systems, and internal airflow as well as greater wind tunnel validation accuracy errors of ∼1% compared to the >5% typical values for the previous studies. Transolver [51] is a neural operator architecture and one of the models used within DrivAerStar. In validation tests, the Transolver...

[58] FFD-based geometric morphing in Blender

[59] Yehudi break

Surrogate evaluation with Transolver

The surrogate achieves a total mean absolute percentage error (MAPE) of 2.422%, with per-style MAPEs of 2.633% for E, 2.195% for F, and 2.437% for N.

C.4 NeuralFoil

NeuralFoil [41, 42], when combined with the extension AeroSandbox, is a surrogate tool for rapid analysis of airfoils that can provide the aerodynamics for ...

