Mathematical perspective on genetic algorithms with optimization guided operators

Anna Brandenberger; Elchanan Mossel; Ilan Doron-Arad

arxiv: 2606.12279 · v1 · pith:JTS5M4DDnew · submitted 2026-06-10 · 💻 cs.NE · cs.AI· cs.LG

Mathematical perspective on genetic algorithms with optimization guided operators

Anna Brandenberger , Ilan Doron-Arad , Elchanan Mossel This is my paper

Pith reviewed 2026-06-27 07:32 UTC · model grok-4.3

classification 💻 cs.NE cs.AIcs.LG

keywords genetic algorithmsquery complexityoptimization operatorsmutationrecombinationdiversityreinforcement learning

0 comments

The pith

Some optimization problems require generation, mutation, and recombination to be solved.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a model of genetic algorithms in which both mutation and recombination are performed by an auxiliary optimizer that aims to improve an objective, rather than by random changes. It casts the overall search process as a query-complexity problem expressed in the language of reinforcement learning. The main finding is that certain problems cannot be solved unless the algorithm is allowed to use initial generation, guided mutation, and guided recombination together. The work then supplies matching upper and lower bounds for a concrete family of problems whose solution pools must maintain nontrivial diversity to reach the claimed efficiency.

Core claim

In the general model of genetic algorithms formulated as a reinforcement-learning query-complexity problem, some optimization problems require generation, mutation, and recombination to be solved. Qualitatively tight algorithms are obtained for a family of problems within this framework that captures the nontrivial role of diversity in the solution pool.

What carries the argument

The specialized family of problems that capture the nontrivial role of diversity in the solution pool under optimization-guided mutation and recombination.

If this is right

Some problems are unsolvable without access to all three operators.
For the studied family, the diversity requirement forces the joint use of mutation and recombination.
The guided operators incur higher per-step cost but allow qualitatively tighter overall bounds than random operators would.
The model distinguishes problems whose solution pools must retain multiple distinct candidates from those that do not.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Maintaining explicit diversity metrics during inference-time search may become a design parameter in ML systems that rely on guided genetic operators.
The query-complexity lens could be applied to other iterative methods that interleave expensive improvement steps with population management.
If the diversity requirement generalizes, training procedures that encourage solution variety inside a single run may improve downstream optimization performance.

Load-bearing premise

The specialized models capture the essential features of practical ML genetic algorithms, particularly the role of diversity.

What would settle it

An explicit optimization problem on which an algorithm using only one or two of the three operators achieves the same query complexity that the paper claims requires all three.

Figures

Figures reproduced from arXiv: 2606.12279 by Anna Brandenberger, Elchanan Mossel, Ilan Doron-Arad.

**Figure 1.** Figure 1: Illustration of the two-dimensional construction. Generation provides the (green) points (0, 0) and (1, 0); recombination (blue) is midpoint recombination, allowing one to scan the x-axis; mutation (red) lifts these locations toward height 1/2. The bold red point indicates the unique peak of the hidden objective, while the other faint red points illustrate intermediate dyadic locations. Proof sketch. Gen… view at source ↗

read the original abstract

Recent work in ML applies genetic algorithms at inference time to iteratively improve solutions to optimization problems. The basic mutation and recombination operators involved are qualitatively different from those studied classically. Mutations are no longer random; an ML algorithm mutates a solution with the goal of improving an objective. Similarly, recombination is not based on random collages of parent solutions. Instead, it is an ML optimization-based operator whose goal is to synthesize improved solutions from its inputs. Thus, these mutation and recombination operators are more likely to improve the objective, but their computational cost is much higher. We introduce a general model of genetic algorithms and formulating optimization in this model as a query-complexity problem, using the language of reinforcement learning. We then study specialized models. We show that some optimization problems require generation, mutation, and recombination to be solved. We then obtain qualitatively tight algorithms for a family of problems within this framework that captures the nontrivial role of diversity in the solution pool, a key feature of practical ML genetic algorithms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces a query-complexity model treating guided mutation and recombination as costly queries, proves some problems require all three operators, and gives tight algorithms for a diversity-focused family.

read the letter

This paper sets up a query-complexity model for genetic algorithms where mutation and recombination count as expensive optimization queries rather than cheap random steps. It shows that some problems require generation, mutation, and recombination together, then gives qualitatively tight algorithms for a family of problems that make the role of diversity in the solution pool explicit.

The main contribution is the general model framed in reinforcement learning terms, plus the specialized cases where they can prove both necessity and matching upper bounds. This formalizes the shift from classical random operators to the guided ones used in recent ML inference-time work. The abstract keeps the claims scoped to the models they define, which is the right move.

The results appear internally consistent on the terms given, with no visible circularity or unsupported leaps in the high-level description. The modeling choice to charge for the guided operators is stated up front.

A soft spot is how well the chosen specialized models reflect the actual behavior of ML genetic algorithms in practice, particularly around diversity maintenance. That is the usual limit of this kind of theoretical abstraction, and the paper seems to be targeting exactly that feature rather than claiming broader coverage.

This is for theorists working on evolutionary algorithms, query complexity, or optimization foundations. A reader looking for a clean way to reason about when different operators become necessary will find something usable here. I would send it to peer review; the model is new enough and the results are scoped tightly enough to merit referee time.

Referee Report

2 major / 2 minor

Summary. The paper introduces a query-complexity model of genetic algorithms in which generation, mutation, and recombination are treated as costly optimization queries (formulated via reinforcement learning). It proves that certain optimization problems require all three operators to be solved, then derives qualitatively tight algorithms for a family of problems that capture the role of diversity in the solution pool.

Significance. If the central claims hold, the work supplies a formal query-complexity foundation for optimization-guided genetic operators used at inference time in ML, with explicit lower bounds showing necessity of recombination and diversity-preserving mechanisms. The emphasis on falsifiable query bounds and the explicit modeling of operator cost are strengths that distinguish this from heuristic GA literature.

major comments (2)

[§4.2, Theorem 4.3] §4.2, Theorem 4.3: the lower-bound construction showing that generation+mutation alone are insufficient appears to rely on an adversarial objective whose query cost is defined only after the fact; confirm that the reduction does not inadvertently allow the mutation operator to simulate recombination at the stated cost.
[§5.1, Algorithm 1 and Theorem 5.4] §5.1, Algorithm 1 and Theorem 5.4: the claimed O(·) upper bound for the diversity family is shown to be tight only under the assumption that the diversity measure is exactly the one in Definition 3.2; the paper should state whether the tightness carries over to other common diversity metrics used in practical ML GAs.

minor comments (2)

[§3] Notation for query cost (e.g., the function C(·) in §3) is introduced without an explicit comparison to the classical random-operator model; a short table would clarify the distinction.
[Introduction] The abstract states that the specialized models 'capture the nontrivial role of diversity'; the introduction should cite the precise practical ML GA papers whose diversity mechanisms are being abstracted.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the referee's positive evaluation and constructive comments. We address each major comment below with clarifications and proposed revisions.

read point-by-point responses

Referee: [§4.2, Theorem 4.3] §4.2, Theorem 4.3: the lower-bound construction showing that generation+mutation alone are insufficient appears to rely on an adversarial objective whose query cost is defined only after the fact; confirm that the reduction does not inadvertently allow the mutation operator to simulate recombination at the stated cost.

Authors: The lower-bound construction in Theorem 4.3 defines query costs a priori according to the reinforcement-learning formulation, with costs fixed by the problem structure before any algorithm runs. Recombination is modeled as a distinct query type that jointly optimizes over multiple parent solutions at a combined cost lower than sequential single-solution mutations would require. The proof separates the operators by the information each query type can extract from the solution pool; mutation cannot replicate recombination's effect without additional queries that exceed the stated bound. We will add a clarifying paragraph in §4.2 emphasizing that costs are predefined and independent of algorithm execution. revision: partial
Referee: [§5.1, Algorithm 1 and Theorem 5.4] §5.1, Algorithm 1 and Theorem 5.4: the claimed O(·) upper bound for the diversity family is shown to be tight only under the assumption that the diversity measure is exactly the one in Definition 3.2; the paper should state whether the tightness carries over to other common diversity metrics used in practical ML GAs.

Authors: We agree that the tightness established in Theorem 5.4 holds specifically for the diversity measure of Definition 3.2. For other metrics (e.g., entropy-based or crowding-distance measures common in ML genetic algorithms), the same quantitative tightness need not carry over, although the algorithmic template of Algorithm 1 remains applicable. We will revise the text around Theorem 5.4 and the conclusion to state this dependence explicitly and identify extension to alternative metrics as future work. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper explicitly defines a new query-complexity model treating mutation and recombination as costly optimization queries, then proves lower bounds requiring all three operators and constructs matching algorithms for a specific family of problems that emphasize diversity. These steps are self-contained within the introduced formalism; no equations or claims reduce by construction to fitted parameters, self-citations, or renamed inputs. The modeling assumptions are stated upfront and the results are scoped to the specialized models, making the derivation independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only abstract available; no specific free parameters or invented entities identifiable.

axioms (1)

standard math Standard mathematical assumptions in query complexity and reinforcement learning models.
The paper uses RL language for query complexity.

pith-pipeline@v0.9.1-grok · 5707 in / 862 out tokens · 18271 ms · 2026-06-27T07:32:50.174400+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

59 extracted references · 15 canonical work pages · 11 internal anchors

[1]

Statistics and computing , volume=

Genetic programming as a means for programming computers by natural selection , author=. Statistics and computing , volume=. 1994 , publisher=

1994
[2]

1981 , year=

Schwefel, Hans-Paul , title =. 1981 , year=

1981
[3]

Journal of Global Optimization , volume =

Storn, Rainer and Price, Kenneth , title =. Journal of Global Optimization , volume =
[4]

Evolutionary Algorithms in Theory and Practice , year =

B. Evolutionary Algorithms in Theory and Practice , year =
[5]

and Smith, James E

Eiben, Agosten E. and Smith, James E. , title =. 2015 , publisher =

2015
[6]

Handbook of Heuristics , pages=

Estimation of distribution algorithms , author=. Handbook of Heuristics , pages=. 2025 , publisher=

2025
[7]

and Lingle, Robert , title =

Goldberg, David E. and Lingle, Robert , title =. Proceedings of the First International Conference on Genetic Algorithms , year =
[8]

Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) , year =

Davis, Lawrence , title =. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) , year =
[9]

Proceedings of the Third International Conference on Genetic Algorithms , volume=

Syswerda, Gilbert , title =. Proceedings of the Third International Conference on Genetic Algorithms , volume=
[10]

and Schaffer, J

Eshelman, Larry J. and Schaffer, J. David , title =. Foundations of genetic algorithms , volume=. 1993 , publisher=

1993
[11]

, title =

Thierens, Dirk and Goldberg, David E. , title =. Proceedings of the Fifth International Conference on Genetic Algorithms , pages=
[12]

, title =

Lehman, Joel and Stanley, Kenneth O. , title =. Evolutionary Computation , volume =
[13]

Illuminating search spaces by mapping elites

Mouret, Jean-Baptiste and Clune, Jeff , title =. arXiv preprint arXiv:1504.04909 , year =

work page internal anchor Pith review Pith/arXiv arXiv
[14]

and Soros, Lisa B

Pugh, Justin K. and Soros, Lisa B. and Stanley, Kenneth O. , title =. Frontiers in Robotics and AI , volume =
[15]

Eureka: Human-Level Reward Design via Coding Large Language Models

Eureka: Human-level reward design via coding large language models , author=. arXiv preprint arXiv:2310.12931 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[16]

Proceedings of the Sixth International Congress of Genetics , volume=

The roles of mutation, inbreeding, crossbreeding, and selection in evolution , author=. Proceedings of the Sixth International Congress of Genetics , volume=
[17]

Spin glasses and biology , pages=

The Origins of Order: Self-Organization and Selection in Evolution , author=. Spin glasses and biology , pages=. 1992 , publisher=

1992
[18]

Recent advances in the theory and application of fitness landscapes , pages=

Fitness landscapes and problem difficulty in evolutionary algorithms: from theory to applications , author=. Recent advances in the theory and application of fitness landscapes , pages=. 2014 , publisher=

2014
[19]

Nature , volume=

Mathematical discoveries from program search with large language models , author=. Nature , volume=. 2024 , publisher=

2024
[20]

Reinforced Generation of Combinatorial Structures: Ramsey Numbers

Reinforced generation of combinatorial structures: Ramsey numbers , author=. arXiv preprint arXiv:2603.09172 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[21]

arXiv preprint arXiv:2411.00566 , year=

Patternboost: Constructions in mathematics with a little help from ai , author=. arXiv preprint arXiv:2411.00566 , year=

work page arXiv
[22]

arXiv preprint arXiv:2601.18005 , year=

Flow-based Extremal Mathematical Structure Discovery , author=. arXiv preprint arXiv:2601.18005 , year=

work page arXiv
[23]

arXiv preprint arXiv:2507.04034 , year=

Lyria: A General LLM-Driven Genetic Algorithm Framework for Problem Solving , author=. arXiv preprint arXiv:2507.04034 , year=

work page arXiv
[24]

Handbook of evolutionary machine learning , pages=

Evolution through large models , author=. Handbook of evolutionary machine learning , pages=. 2023 , publisher=

2023
[25]

ACM Transactions on Evolutionary Learning , volume=

Language model crossover: Variation through few-shot prompting , author=. ACM Transactions on Evolutionary Learning , volume=. 2024 , publisher=

2024
[26]

EvoPrompt: Connecting LLMs with Evolutionary Algorithms Yields Powerful Prompt Optimizers

Connecting large language models with evolutionary algorithms yields powerful prompt optimizers , author=. arXiv preprint arXiv:2309.08532 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[27]

Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , year=

Evolutionsstrategie , author=. Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , year=
[28]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

Novikov, Alexander and V. arXiv preprint arXiv:2506.13131 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[29]

, author=

Genetic Prompt Search via Exploiting Language Model Probabilities. , author=. IJCAI , pages=
[30]

1992 , publisher=

Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence , author=. 1992 , publisher=

1992
[31]

1989 , publisher=

Genetic Algorithms in Search, Optimization, and Machine Learning , author=. 1989 , publisher=

1989
[32]

1977 , publisher=

Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie , author=. 1977 , publisher=

1977
[33]

Evolutionary computation , volume=

Completely derandomized self-adaptation in evolution strategies , author=. Evolutionary computation , volume=. 2001 , publisher=

2001
[34]

Evolutionary Computation , volume=

Evolving Neural Networks through Augmenting Topologies , author=. Evolutionary Computation , volume=
[35]

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

Evolution Strategies as a Scalable Alternative to Reinforcement Learning , author=. arXiv preprint arXiv:1703.03864 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[36]

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

Deep Neuroevolution: Genetic Algorithms are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning , author=. arXiv preprint arXiv:1712.06567 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[37]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Regularized Evolution for Image Classifier Architecture Search , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=
[38]

Population Based Training of Neural Networks

Population Based Training of Neural Networks , author=. arXiv preprint arXiv:1711.09846 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[39]

Nature Machine Intelligence , volume=

Evolutionary Optimization of Model Merging Recipes , author=. Nature Machine Intelligence , volume=
[40]

arXiv preprint arXiv:2509.24372 , year=

Evolution strategies at scale: Llm fine-tuning beyond reinforcement learning , author=. arXiv preprint arXiv:2509.24372 , year=

work page arXiv
[41]

1994 , publisher=

Markov Decision Processes: Discrete Stochastic Dynamic Programming , author=. 1994 , publisher=

1994
[42]

1998 , publisher=

Reinforcement Learning: An Introduction , author=. 1998 , publisher=

1998
[43]

Artificial Intelligence , volume=

Planning and Acting in Partially Observable Stochastic Domains , author=. Artificial Intelligence , volume=. 1998 , publisher=

1998
[44]

Foundations and Trends in Machine Learning , volume=

Bayesian Reinforcement Learning: A Survey , author=. Foundations and Trends in Machine Learning , volume=. 2015 , publisher=

2015
[45]

Foundations and Trends in Machine Learning , volume=

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , author=. Foundations and Trends in Machine Learning , volume=
[46]

The Annals of Mathematical Statistics , volume=

A Stochastic Approximation Method , author=. The Annals of Mathematical Statistics , volume=
[47]

Joint European Conference on Machine Learning and Knowledge Discovery in Databases , pages=

State-dependent exploration for policy gradient methods , author=. Joint European Conference on Machine Learning and Knowledge Discovery in Databases , pages=. 2008 , organization=

2008
[48]

Neural Networks , volume=

Parameter-Exploring Policy Gradients , author=. Neural Networks , volume=
[49]

Paladyn , volume=

Exploring Parameter Space in Reinforcement Learning , author=. Paladyn , volume=
[50]

International Conference on Learning Representations , year=

Parameter Space Noise for Exploration , author=. International Conference on Learning Representations , year=
[51]

The 22nd International Conference on Artificial Intelligence and Statistics , pages=

Contrasting exploration in parameter and action space: A zeroth-order optimization perspective , author=. The 22nd International Conference on Artificial Intelligence and Statistics , pages=. 2019 , organization=

2019
[52]

Proximal Policy Optimization Algorithms

Proximal policy optimization algorithms , author=. arXiv preprint arXiv:1707.06347 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[53]

Advances in neural information processing systems , volume=

Training language models to follow instructions with human feedback , author=. Advances in neural information processing systems , volume=
[54]

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

Promptbreeder: Self-referential self-improvement via prompt evolution , author=. arXiv preprint arXiv:2309.16797 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[55]

Evolving deeper

Lee, Kuang-Huei and Fischer, Ian and Wu, Yueh-Hua and Marwood, Dave and Baluja, Shumeet and Schuurmans, Dale and Chen, Xinyun , journal=. Evolving deeper
[56]

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback , author=. arXiv preprint arXiv:2204.05862 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[57]

Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Bi, Xiao and Zhang, Haotian and Zhang, Mingchuan and Li, Y. K. and Wu, Yuxiang and Guo, Daya , journal=
[58]

Guo, Daya and Yang, Dejian and Zhang, Haowei and Song, Junxiao and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Zhang, Ruoyu and Ma, Shirong and Bi, Xiao and others , journal=
[59]

Journal of the ACM (JACM) , volume=

Fast learning requires good memory: A time-space lower bound for parity learning , author=. Journal of the ACM (JACM) , volume=. 2018 , publisher=

2018

[1] [1]

Statistics and computing , volume=

Genetic programming as a means for programming computers by natural selection , author=. Statistics and computing , volume=. 1994 , publisher=

1994

[2] [2]

1981 , year=

Schwefel, Hans-Paul , title =. 1981 , year=

1981

[3] [3]

Journal of Global Optimization , volume =

Storn, Rainer and Price, Kenneth , title =. Journal of Global Optimization , volume =

[4] [4]

Evolutionary Algorithms in Theory and Practice , year =

B. Evolutionary Algorithms in Theory and Practice , year =

[5] [5]

and Smith, James E

Eiben, Agosten E. and Smith, James E. , title =. 2015 , publisher =

2015

[6] [6]

Handbook of Heuristics , pages=

Estimation of distribution algorithms , author=. Handbook of Heuristics , pages=. 2025 , publisher=

2025

[7] [7]

and Lingle, Robert , title =

Goldberg, David E. and Lingle, Robert , title =. Proceedings of the First International Conference on Genetic Algorithms , year =

[8] [8]

Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) , year =

Davis, Lawrence , title =. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) , year =

[9] [9]

Proceedings of the Third International Conference on Genetic Algorithms , volume=

Syswerda, Gilbert , title =. Proceedings of the Third International Conference on Genetic Algorithms , volume=

[10] [10]

and Schaffer, J

Eshelman, Larry J. and Schaffer, J. David , title =. Foundations of genetic algorithms , volume=. 1993 , publisher=

1993

[11] [11]

, title =

Thierens, Dirk and Goldberg, David E. , title =. Proceedings of the Fifth International Conference on Genetic Algorithms , pages=

[12] [12]

, title =

Lehman, Joel and Stanley, Kenneth O. , title =. Evolutionary Computation , volume =

[13] [13]

Illuminating search spaces by mapping elites

Mouret, Jean-Baptiste and Clune, Jeff , title =. arXiv preprint arXiv:1504.04909 , year =

work page internal anchor Pith review Pith/arXiv arXiv

[14] [14]

and Soros, Lisa B

Pugh, Justin K. and Soros, Lisa B. and Stanley, Kenneth O. , title =. Frontiers in Robotics and AI , volume =

[15] [15]

Eureka: Human-Level Reward Design via Coding Large Language Models

Eureka: Human-level reward design via coding large language models , author=. arXiv preprint arXiv:2310.12931 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[16] [16]

Proceedings of the Sixth International Congress of Genetics , volume=

The roles of mutation, inbreeding, crossbreeding, and selection in evolution , author=. Proceedings of the Sixth International Congress of Genetics , volume=

[17] [17]

Spin glasses and biology , pages=

The Origins of Order: Self-Organization and Selection in Evolution , author=. Spin glasses and biology , pages=. 1992 , publisher=

1992

[18] [18]

Recent advances in the theory and application of fitness landscapes , pages=

Fitness landscapes and problem difficulty in evolutionary algorithms: from theory to applications , author=. Recent advances in the theory and application of fitness landscapes , pages=. 2014 , publisher=

2014

[19] [19]

Nature , volume=

Mathematical discoveries from program search with large language models , author=. Nature , volume=. 2024 , publisher=

2024

[20] [20]

Reinforced Generation of Combinatorial Structures: Ramsey Numbers

Reinforced generation of combinatorial structures: Ramsey numbers , author=. arXiv preprint arXiv:2603.09172 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[21] [21]

arXiv preprint arXiv:2411.00566 , year=

Patternboost: Constructions in mathematics with a little help from ai , author=. arXiv preprint arXiv:2411.00566 , year=

work page arXiv

[22] [22]

arXiv preprint arXiv:2601.18005 , year=

Flow-based Extremal Mathematical Structure Discovery , author=. arXiv preprint arXiv:2601.18005 , year=

work page arXiv

[23] [23]

arXiv preprint arXiv:2507.04034 , year=

Lyria: A General LLM-Driven Genetic Algorithm Framework for Problem Solving , author=. arXiv preprint arXiv:2507.04034 , year=

work page arXiv

[24] [24]

Handbook of evolutionary machine learning , pages=

Evolution through large models , author=. Handbook of evolutionary machine learning , pages=. 2023 , publisher=

2023

[25] [25]

ACM Transactions on Evolutionary Learning , volume=

Language model crossover: Variation through few-shot prompting , author=. ACM Transactions on Evolutionary Learning , volume=. 2024 , publisher=

2024

[26] [26]

EvoPrompt: Connecting LLMs with Evolutionary Algorithms Yields Powerful Prompt Optimizers

Connecting large language models with evolutionary algorithms yields powerful prompt optimizers , author=. arXiv preprint arXiv:2309.08532 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[27] [27]

Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , year=

Evolutionsstrategie , author=. Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , year=

[28] [28]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

Novikov, Alexander and V. arXiv preprint arXiv:2506.13131 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[29] [29]

, author=

Genetic Prompt Search via Exploiting Language Model Probabilities. , author=. IJCAI , pages=

[30] [30]

1992 , publisher=

Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence , author=. 1992 , publisher=

1992

[31] [31]

1989 , publisher=

Genetic Algorithms in Search, Optimization, and Machine Learning , author=. 1989 , publisher=

1989

[32] [32]

1977 , publisher=

Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie , author=. 1977 , publisher=

1977

[33] [33]

Evolutionary computation , volume=

Completely derandomized self-adaptation in evolution strategies , author=. Evolutionary computation , volume=. 2001 , publisher=

2001

[34] [34]

Evolutionary Computation , volume=

Evolving Neural Networks through Augmenting Topologies , author=. Evolutionary Computation , volume=

[35] [35]

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

Evolution Strategies as a Scalable Alternative to Reinforcement Learning , author=. arXiv preprint arXiv:1703.03864 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[36] [36]

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

Deep Neuroevolution: Genetic Algorithms are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning , author=. arXiv preprint arXiv:1712.06567 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[37] [37]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Regularized Evolution for Image Classifier Architecture Search , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

[38] [38]

Population Based Training of Neural Networks

Population Based Training of Neural Networks , author=. arXiv preprint arXiv:1711.09846 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[39] [39]

Nature Machine Intelligence , volume=

Evolutionary Optimization of Model Merging Recipes , author=. Nature Machine Intelligence , volume=

[40] [40]

arXiv preprint arXiv:2509.24372 , year=

Evolution strategies at scale: Llm fine-tuning beyond reinforcement learning , author=. arXiv preprint arXiv:2509.24372 , year=

work page arXiv

[41] [41]

1994 , publisher=

Markov Decision Processes: Discrete Stochastic Dynamic Programming , author=. 1994 , publisher=

1994

[42] [42]

1998 , publisher=

Reinforcement Learning: An Introduction , author=. 1998 , publisher=

1998

[43] [43]

Artificial Intelligence , volume=

Planning and Acting in Partially Observable Stochastic Domains , author=. Artificial Intelligence , volume=. 1998 , publisher=

1998

[44] [44]

Foundations and Trends in Machine Learning , volume=

Bayesian Reinforcement Learning: A Survey , author=. Foundations and Trends in Machine Learning , volume=. 2015 , publisher=

2015

[45] [45]

Foundations and Trends in Machine Learning , volume=

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , author=. Foundations and Trends in Machine Learning , volume=

[46] [46]

The Annals of Mathematical Statistics , volume=

A Stochastic Approximation Method , author=. The Annals of Mathematical Statistics , volume=

[47] [47]

Joint European Conference on Machine Learning and Knowledge Discovery in Databases , pages=

State-dependent exploration for policy gradient methods , author=. Joint European Conference on Machine Learning and Knowledge Discovery in Databases , pages=. 2008 , organization=

2008

[48] [48]

Neural Networks , volume=

Parameter-Exploring Policy Gradients , author=. Neural Networks , volume=

[49] [49]

Paladyn , volume=

Exploring Parameter Space in Reinforcement Learning , author=. Paladyn , volume=

[50] [50]

International Conference on Learning Representations , year=

Parameter Space Noise for Exploration , author=. International Conference on Learning Representations , year=

[51] [51]

The 22nd International Conference on Artificial Intelligence and Statistics , pages=

Contrasting exploration in parameter and action space: A zeroth-order optimization perspective , author=. The 22nd International Conference on Artificial Intelligence and Statistics , pages=. 2019 , organization=

2019

[52] [52]

Proximal Policy Optimization Algorithms

Proximal policy optimization algorithms , author=. arXiv preprint arXiv:1707.06347 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[53] [53]

Advances in neural information processing systems , volume=

Training language models to follow instructions with human feedback , author=. Advances in neural information processing systems , volume=

[54] [54]

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

Promptbreeder: Self-referential self-improvement via prompt evolution , author=. arXiv preprint arXiv:2309.16797 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[55] [55]

Evolving deeper

Lee, Kuang-Huei and Fischer, Ian and Wu, Yueh-Hua and Marwood, Dave and Baluja, Shumeet and Schuurmans, Dale and Chen, Xinyun , journal=. Evolving deeper

[56] [56]

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback , author=. arXiv preprint arXiv:2204.05862 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[57] [57]

Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Bi, Xiao and Zhang, Haotian and Zhang, Mingchuan and Li, Y. K. and Wu, Yuxiang and Guo, Daya , journal=

[58] [58]

Guo, Daya and Yang, Dejian and Zhang, Haowei and Song, Junxiao and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Zhang, Ruoyu and Ma, Shirong and Bi, Xiao and others , journal=

[59] [59]

Journal of the ACM (JACM) , volume=

Fast learning requires good memory: A time-space lower bound for parity learning , author=. Journal of the ACM (JACM) , volume=. 2018 , publisher=

2018