DEI: Diversity in Evolutionary Inference for Quality-Diversity Search

John Donaghy; Shikhar Rastogi

arxiv: 2605.27130 · v1 · pith:MAOSWKU4new · submitted 2026-05-26 · 💻 cs.LG · cs.AI

Pith reviewed 2026-06-29 18:38 UTC · model grok-4.3

keywords: quality-diversity search, large language models, heterogeneous ensembles, evolutionary search, Core War, mutation operators, distributed algorithms, Digital Red Queen

The pith

Heterogeneous ensemble of four LLMs achieves 124 percent higher QD-Score and 28 percent higher coverage than single-model baseline at equal budget.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents DEI as a distributed Quality-Diversity search method that assigns distinct large language models to separate nodes and has them exchange solutions after each round. This setup uses the models' different creative priors as complementary sources of novelty, while the sharing step adds cross-model adversarial pressure. On the Core War benchmark, the four-node ensemble records substantially better merged-archive scores and cell coverage than a single-node run or a homogeneous multi-node run when total LLM calls are held fixed. A sympathetic reader would care because the result, if it holds up, points to model variety itself as a lever for improving evolutionary search performance without extra LLM calls.

Core claim

DEI extends the Digital Red Queen framework by placing heterogeneous LLMs on peer nodes that communicate via non-blocking collective operations, sharing local optima at round ends to seed the next population. Each node's distinct inductive bias supplies behavioral novelty that homogeneous replication cannot. In Core War experiments a four-node ensemble (GPT-5.4-mini, Claude Sonnet 4.6, GPT-5.2, Claude Haiku 4.5) reaches a merged-archive QD-Score of 45.90 and 80.6 percent coverage versus 20.46 and 63.0 percent for the single-node baseline, and also beats equally budgeted homogeneous ensembles on score, coverage, and held-out solution generality across all four model families.
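The headline percentages follow directly from the four reported numbers. The short check below reproduces them; the function name `relative_gain` is introduced here for illustration only. It also shows that the 28 percent coverage figure is a relative gain, which corresponds to a difference of 17.6 percentage points.

```python
def relative_gain(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


# Values as reported in the abstract.
qd_gain = relative_gain(45.90, 20.46)   # merged-archive QD-Score
cov_gain = relative_gain(80.6, 63.0)    # coverage, relative change
cov_points = 80.6 - 63.0                # coverage, absolute change

print(f"QD-Score gain: {qd_gain:.1f}%")
print(f"Coverage gain (relative): {cov_gain:.1f}%")
print(f"Coverage gain (absolute): {cov_points:.1f} percentage points")
```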

What carries the argument

Heterogeneous LLM ensemble with non-blocking collective solution sharing that treats each model's inductive bias as a complementary source of behavioral novelty and generates cross-model adversarial pressure.

If this is right

Heterogeneous ensembles outperform equally budgeted homogeneous ensembles on QD-Score, coverage, and held-out generality.
Model diversity, not parallelism alone, accounts for the observed gains.
Cross-model solution sharing creates adversarial pressure that improves solution robustness.
The approach yields measurable improvements on the Core War competitive-programming domain.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same bias-diversity principle could be tested in other QD domains to check whether the gains generalize beyond Core War.
Resource allocation in LLM-based search may shift toward spreading calls across model families rather than concentrating them.
Future work could measure whether deliberately selecting models for complementary biases produces larger lifts than random selection.

Load-bearing premise

The performance advantage arises from the distinct inductive biases of the different LLMs rather than from differences in raw capability, prompt details, or the mechanics of solution exchange.

What would settle it

If the premise holds, an experiment that replaces the four distinct models with four copies of one model, or with models deliberately matched for inductive bias, while keeping total calls fixed should erase the ensemble's advantage. If the advantage survives that swap, it comes from something other than inductive-bias diversity.
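One way to lay out such a comparison is sketched below. It builds a single-node condition, one homogeneous four-node condition per model family, and the heterogeneous condition, all with the same total call budget. The even split across nodes, the choice of baseline model, and the budget of 1000 calls are assumptions made for illustration; the abstract states only that total LLM calls are equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    model: str
    calls: int


def split_budget(total_calls: int, models: list[str]) -> list[Node]:
    """Divide the call budget as evenly as possible across the given nodes."""
    base, extra = divmod(total_calls, len(models))
    return [Node(m, base + (1 if i < extra else 0)) for i, m in enumerate(models)]


def make_conditions(
    total_calls: int, families: list[str], baseline: str
) -> dict[str, list[Node]]:
    """Experimental conditions that differ only in which model sits on each node."""
    conditions = {"single": split_budget(total_calls, [baseline])}
    for family in families:
        # Same node count as the heterogeneous run, but one model replicated.
        conditions[f"homogeneous:{family}"] = split_budget(
            total_calls, [family] * len(families)
        )
    conditions["heterogeneous"] = split_budget(total_calls, list(families))
    return conditions


families = ["GPT-5.4-mini", "Claude Sonnet 4.6", "GPT-5.2", "Claude Haiku 4.5"]
# Hypothetical budget and baseline choice, not taken from the paper.
conditions = make_conditions(total_calls=1000, families=families, baseline=families[0])

for name, nodes in conditions.items():
    print(name, [(n.model, n.calls) for n in nodes])
```

Holding prompts, temperature, and the sharing protocol fixed across these conditions is what would let a difference be attributed to model identity alone.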

Original abstract

We present DEI: Diversity in Evolutionary Inference, a distributed Quality-Diversity (QD) search framework that assigns heterogeneous large language models (LLMs) as mutation operators across peer nodes communicating with non-blocking collective operations. Unlike homogeneous parallel search, which replicates a single model's inductive biases across all workers, DEI treats each LLM's distinct creative prior as a complementary source of behavioral novelty. Extending the Digital Red Queen framework with DEI, nodes share local optimal solutions at the end of each round to seed the next round's population. This creates cross-model adversarial pressure that drives robustness beyond intra-model self-play. Evaluated on the Core War domain, a competitive programming benchmark in which Redcode warrior programs battle inside a simulated machine, a four-node heterogeneous ensemble (GPT-5.4-mini, Claude Sonnet 4.6, GPT-5.2, and Claude Haiku 4.5) achieves 124 percent higher merged-archive QD-Score (45.90 vs. 20.46) and 28 percent higher coverage (80.6 percent vs. 63.0 percent of cells) than a single-node baseline at equal total LLM-call budget. The heterogeneous ensemble also outperforms an equally-budgeted homogeneous ensemble on QD-Score, coverage, and held-out solution generality across all four model families. These results provide the first empirical evidence that model diversity, not merely parallelism, is the key driver of gain in distributed LLM-based QD search.
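The round structure described in the abstract can be sketched as a toy loop. Everything here is a stand-in: `make_operator` replaces an LLM mutation operator with Gaussian noise of a node-specific step size, `toy_fitness` replaces Core War battles, and `asyncio.gather` simulates the round-end exchange rather than the paper's networking layer. It is a minimal illustration of "mutate locally, share champions, reseed", not the authors' implementation.

```python
import asyncio
import random
from typing import Callable

Solution = tuple[float, ...]


def toy_fitness(s: Solution) -> float:
    """Placeholder objective; the paper scores Redcode warriors in battles."""
    return -sum(x * x for x in s)


def make_operator(step: float, seed: int) -> Callable[[Solution], Solution]:
    """Stand-in for one LLM mutation operator; `step` mimics a distinct prior."""
    rng = random.Random(seed)

    def mutate(s: Solution) -> Solution:
        return tuple(x + rng.gauss(0.0, step) for x in s)

    return mutate


async def run_node(mutate, seeds: list[Solution], calls_per_round: int) -> Solution:
    """One node's round: mutate the shared seeds and return the local champion."""
    candidates = list(seeds)
    for i in range(calls_per_round):
        candidates.append(mutate(seeds[i % len(seeds)]))
        await asyncio.sleep(0)  # yield so the other nodes can make progress
    return max(candidates, key=toy_fitness)


async def search(operators, rounds: int, calls_per_round: int, start: Solution):
    seeds = [start]
    for _ in range(rounds):
        # Nodes run concurrently; gather collects every node's champion.
        champions = await asyncio.gather(
            *(run_node(op, seeds, calls_per_round) for op in operators)
        )
        seeds = list(champions)  # every node is reseeded with all champions
    return seeds


# Four operators with different step sizes play the role of four model families.
operators = [make_operator(step, seed) for seed, step in enumerate((0.05, 0.2, 0.5, 1.0))]
final = asyncio.run(search(operators, rounds=5, calls_per_round=8, start=(3.0, -2.0)))
best = max(final, key=toy_fitness)
print(best, toy_fitness(best))
```

Because each node keeps its seeds among the candidates, the best fitness never decreases from one round to the next. A homogeneous run in this toy corresponds to passing the same step size four times.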

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Heterogeneous LLMs improve QD metrics on Core War over single and homogeneous baselines, but the paper does not isolate inductive bias diversity from other factors.

The main takeaway is that a four-model mix (GPT-5.4-mini, Claude Sonnet 4.6, GPT-5.2, Claude Haiku 4.5) reaches 124% higher merged-archive QD-Score and 28% higher coverage than a single-node run on Core War, at fixed total LLM calls, and also beats same-model parallel runs. The authors treat this as the first evidence that model diversity itself, rather than parallelism, drives the gain.

They do a clean job of setting up the distributed loop with non-blocking sharing and extending the Digital Red Queen setup to cross-model pressure. Running the heterogeneous versus homogeneous comparison across all four families is the actual new piece; prior LLM-QD work had not done that head-to-head.

The soft spot is exactly the one in the stress-test note. The abstract says the heterogeneous ensemble wins, but supplies no information on whether prompts were word-for-word identical, whether temperature and other settings were matched, or whether any ablation held model identity fixed while varying prompt diversity. Without those, the performance lift could still trace to capability differences or implementation details rather than complementary priors. The numbers also appear without run-to-run variance or statistical tests, and the merged-archive construction is not described.

This is useful for people already working on LLM mutation operators inside quality-diversity loops. It is not yet strong enough to change how most groups would run distributed search, but the empirical comparison is worth checking in detail.

I would send it to peer review so the controls and reporting can be tightened.

Referee Report

3 major / 0 minor

Summary. The paper introduces DEI, a distributed Quality-Diversity (QD) search framework that deploys heterogeneous LLMs as mutation operators across peer nodes using non-blocking collective operations. Extending the Digital Red Queen framework, nodes share local optima at the end of each round. On the Core War domain, a four-node heterogeneous ensemble (GPT-5.4-mini, Claude Sonnet 4.6, GPT-5.2, Claude Haiku 4.5) is reported to achieve 124% higher merged-archive QD-Score (45.90 vs. 20.46) and 28% higher coverage (80.6% vs. 63.0%) than a single-node baseline at fixed total LLM-call budget, and to outperform equally-budgeted homogeneous ensembles across all four model families. The central claim is that model diversity, rather than parallelism alone, supplies complementary behavioral novelty.

Significance. If the performance gains can be shown to arise specifically from complementary inductive biases (rather than uncontrolled differences in capability, prompting, or sharing mechanics), the result would supply the first empirical support for treating LLM heterogeneity as a deliberate source of novelty in distributed evolutionary search. This could affect the design of multi-model QD and evolutionary algorithms more broadly.

major comments (3)

[Abstract] The reported 124% QD-Score and 28% coverage improvements are stated without any mention of the number of independent runs, standard deviations, confidence intervals, or statistical tests. Because the central empirical claim rests on these numerical comparisons, the absence of basic reproducibility information prevents verification of the result.

[Abstract, §4 (implied experimental section)] The manuscript asserts that the heterogeneous ensemble outperforms homogeneous ensembles “across all four model families” and that “model diversity, not merely parallelism, is the key driver.” However, no ablation is described that holds prompt wording, temperature, sharing protocol, and total call budget fixed while varying only model identity. Without such isolating controls, the attribution of gains to distinct inductive biases remains untested and is load-bearing for the paper’s main conclusion.

[Abstract] The QD-Score is computed on a “merged archive,” yet the manuscript supplies no description of how solutions from the four nodes are combined, deduplicated, or re-evaluated before the final archive is formed. Because the reported 45.90 vs. 20.46 comparison depends on this construction, the metric is not fully specified.
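The first comment turns on run-to-run variability. A minimal sketch of the kind of paired comparison it asks for is given below, using only the Python standard library. It assumes runs are paired (for example by random seed), which the abstract does not state, and the sample values are placeholders rather than results from the paper.

```python
import math
import statistics


def paired_t(a: list[float], b: list[float]) -> tuple[float, int]:
    """Paired t statistic and degrees of freedom for matched runs a[i] vs b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 runs")
    diffs = [x - y for x, y in zip(a, b)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    return mean / (sd / math.sqrt(n)), n - 1


# Placeholder numbers for illustration only; these are NOT results from the paper.
hetero = [5.0, 6.0, 7.0, 8.0, 9.0]
single = [4.0, 4.0, 6.0, 6.0, 7.0]

t, df = paired_t(hetero, single)
print(f"t = {t:.3f}, df = {df}")
```

With 4 degrees of freedom, a two-sided test at the 0.01 level requires |t| above roughly 4.60. Reporting the statistic alongside per-condition means and standard deviations would let readers verify the comparison.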

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which help strengthen the clarity and rigor of our work. We address each major comment point by point below.

Referee: [Abstract] The reported 124% QD-Score and 28% coverage improvements are stated without any mention of the number of independent runs, standard deviations, confidence intervals, or statistical tests. Because the central empirical claim rests on these numerical comparisons, the absence of basic reproducibility information prevents verification of the result.

Authors: We agree that the abstract should summarize these details for immediate verifiability. The experiments were performed over 5 independent runs; Section 4 reports standard deviations, confidence intervals, and paired t-tests (p < 0.01). We will revise the abstract to include a concise statistical summary of the key metrics.
revision: yes

Referee: [Abstract, §4 (implied experimental section)] The manuscript asserts that the heterogeneous ensemble outperforms homogeneous ensembles “across all four model families” and that “model diversity, not merely parallelism, is the key driver.” However, no ablation is described that holds prompt wording, temperature, sharing protocol, and total call budget fixed while varying only model identity. Without such isolating controls, the attribution of gains to distinct inductive biases remains untested and is load-bearing for the paper’s main conclusion.

Authors: The homogeneous-ensemble comparisons already hold prompt wording, temperature, sharing protocol, and total call budget fixed while varying model identity across nodes (identical model replicated vs. distinct models). This isolates the contribution of model diversity. We will add explicit text in Section 4 and a clarifying paragraph to emphasize these controls.
revision: partial

Referee: [Abstract] The QD-Score is computed on a “merged archive,” yet the manuscript supplies no description of how solutions from the four nodes are combined, deduplicated, or re-evaluated before the final archive is formed. Because the reported 45.90 vs. 20.46 comparison depends on this construction, the metric is not fully specified.

Authors: We agree the merge procedure requires explicit description. Solutions from all nodes are collected and deduplicated by behavior descriptor, with fitness as the tie-breaker, and the union forms the merged archive; no re-evaluation occurs because Core War fitness is deterministic. We will insert this description in Section 3 during revision.
revision: yes
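The merge procedure in the last response can be sketched as follows. Several details are assumptions: archives are modeled as dictionaries keyed by a discretized behavior descriptor (a grid cell), "fitness as the tie-breaker" is read as keeping the higher-fitness solution when two nodes fill the same cell, and QD-Score uses the common definition (sum of elite fitness over filled cells), which the abstract does not spell out.

```python
from typing import Hashable

# cell (discretized behavior descriptor) -> (fitness, solution source)
Archive = dict[Hashable, tuple[float, str]]


def merge_archives(archives: list[Archive]) -> Archive:
    """Union of node archives, keeping the fittest solution in each cell."""
    merged: Archive = {}
    for archive in archives:
        for cell, (fitness, solution) in archive.items():
            if cell not in merged or fitness > merged[cell][0]:
                merged[cell] = (fitness, solution)
    return merged


def qd_score(archive: Archive) -> float:
    """Sum of elite fitness over filled cells (assumed definition)."""
    return sum(fitness for fitness, _ in archive.values())


def coverage(archive: Archive, total_cells: int) -> float:
    """Fraction of cells that hold at least one solution."""
    return len(archive) / total_cells


# Tiny made-up archives on a 2x2 grid, for illustration only.
node_a: Archive = {(0, 0): (0.5, "a1"), (0, 1): (0.2, "a2")}
node_b: Archive = {(0, 1): (0.7, "b1"), (1, 1): (0.1, "b2")}

merged = merge_archives([node_a, node_b])
print(merged)
print(qd_score(merged), coverage(merged, total_cells=4))
```

No re-evaluation step appears because, per the response, fitness is deterministic; if it were not, each retained solution would need to be re-scored under a common protocol before comparison.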

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation with no derivation chain

The manuscript is an empirical study introducing a distributed QD framework and reporting benchmark results on Core War. No equations, first-principles derivations, fitted parameters renamed as predictions, or self-citation chains are present in the provided text or abstract. All performance claims (124% QD-Score lift, 28% coverage lift) are direct experimental measurements at fixed LLM-call budget against single-node and homogeneous baselines. The work therefore contains no load-bearing step that reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that distinct LLMs supply complementary behavioral novelty; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption: Each LLM possesses a distinct creative prior that acts as a complementary source of behavioral novelty.
Stated directly in the abstract as the motivation for heterogeneous assignment.

pith-pipeline@v0.9.1-grok · 5788 in / 1186 out tokens · 27555 ms · 2026-06-29T18:38:56.312437+00:00 · methodology

