arxiv: 2505.06617 · v5 · submitted 2025-05-10 · 💻 cs.NE

Adversarial Coevolutionary Illumination with Generational Adversarial MAP-Elites

Pith reviewed 2026-05-22 16:32 UTC · model grok-4.3

classification 💻 cs.NE
keywords quality-diversity · coevolution · adversarial · MAP-Elites · vision embedding · evolutionary algorithms · multi-agent

The pith

Generational Adversarial MAP-Elites alternates which side evolves at each generation and uses video embeddings as the behavior space, so that both sides of an adversarial problem are illuminated.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Generational Adversarial MAP-Elites, or GAME, a coevolutionary quality-diversity method that evolves both sides of a competitive interaction rather than holding one side fixed. It switches which population receives selection and variation at every generation and replaces hand-designed behavior descriptors with a vision embedding model that reads directly from video. Across a multi-agent battle game, a soft-robot wrestling task, and a deck-building game, this produces higher-performing solutions than one-sided baselines while exposing arms-race dynamics, the value of generational extinction for novelty, and the role of neutral mutations as stepping stones. The work matters for any setting where optimization targets move because the opponent also adapts.

Core claim

GAME is a coevolutionary QD algorithm that evolves both adversaries by alternating generational updates and employs a vision embedding model to map raw video into a behavior space for the MAP-Elites archive, removing the requirement for domain-specific descriptors.

What carries the argument

Generational alternation of evolutionary updates between opposing populations, paired with a vision embedding model that supplies behavior coordinates from video input for the quality-diversity archive.
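The alternation described above can be sketched in a few lines. This is a minimal toy illustration, not the paper's implementation: the duel in `evaluate`, the scalar stand-in for the vision embedding, the single grid archive (the paper builds on MTMB-ME), and the choice to seed each generation from the side's previous elites are all assumptions made for the example, and every function name is hypothetical.

```python
import random

def evaluate(sol, opp):
    """Toy duel (hypothetical): returns (fitness of sol, behavior in [0, 1))."""
    fitness = 1.0 - abs(sol[0] - (1.0 - opp[0]))  # best when sol mirrors the opponent
    behavior = (sol[1] + opp[1]) / 2.0            # scalar stand-in for a video embedding
    return fitness, min(behavior, 0.999)

def mutate(sol, rng, sigma=0.1):
    """Gaussian mutation clipped to [0, 1]."""
    return [min(1.0, max(0.0, x + rng.gauss(0.0, sigma))) for x in sol]

def illuminate(seeds, opponents, rng, evaluations=500, cells=10):
    """One MAP-Elites-style generation for one side against fixed opponents."""
    archive = {}  # cell index -> (fitness, solution)
    for i in range(evaluations):
        if archive and i >= len(seeds):
            # After the seeds are evaluated, vary a randomly chosen elite.
            parent = rng.choice(list(archive.values()))[1]
            child = mutate(parent, rng)
        else:
            child = seeds[i % len(seeds)]
        opp = rng.choice(opponents)  # simplification: one random fixed opponent
        fit, beh = evaluate(child, opp)
        cell = int(beh * cells)
        if cell not in archive or fit > archive[cell][0]:
            archive[cell] = (fit, child)
    return archive

def game_sketch(generations=6, seed=0):
    """Alternate which side is evolved; the other side stays fixed."""
    rng = random.Random(seed)
    sides = {
        "red": [[rng.random(), rng.random()] for _ in range(5)],
        "blue": [[rng.random(), rng.random()] for _ in range(5)],
    }
    history = []
    for g in range(generations):
        evolving, fixed = ("red", "blue") if g % 2 == 0 else ("blue", "red")
        archive = illuminate(sides[evolving], sides[fixed], rng)
        # The new elites replace the evolving side's previous population.
        sides[evolving] = [sol for _, sol in archive.values()]
        history.append((evolving, len(archive)))
    return history
```

Calling `game_sketch()` returns one `(side, number_of_filled_cells)` pair per generation, which makes the side switching easy to inspect. In a real setting the scalar behavior would be replaced by an embedding of the rendered duel.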

If this is right

  • Alternating generations produces observable arms-race dynamics between the two evolving sides.
  • Periodic extinction of one side increases novelty in the surviving population.
  • Neutral mutations are retained and later become useful for reaching higher performance levels.
  • All algorithmic components are necessary; removing any one degrades the results.
  • The same method works across game, robotics, and card-game domains without custom behavior engineering.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same alternation-plus-embedding pattern could be tested in domains where search spaces permit greater open-ended novelty than the ones used here.
  • Vision-based behavior spaces may lower the barrier to applying quality-diversity methods to new competitive settings that lack obvious geometric descriptors.
  • Similar coevolutionary illumination could be applied to problems such as automated strategy discovery in security or multi-player economic games.

Load-bearing premise

The vision embedding model supplies a behavior space that remains meaningful and generalizable across adversarial domains without requiring any domain-specific adjustments.

What would settle it

Finding that GAME fails to outperform one-sided QD baselines on any of the three domains, or that the vision embeddings collapse distinct behaviors into indistinguishable points, would falsify the central performance and generality claims.
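One way to probe the second failure condition is to measure how often behaviors that a human labels as different land almost on top of each other in embedding space. The sketch below is an assumption-laden illustration: the labels, the cosine-distance threshold of 0.05, and the function names are hypothetical and do not come from the paper, and embeddings are assumed to be non-zero vectors.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def collapse_fraction(embeddings, labels, threshold=0.05):
    """Fraction of differently-labelled pairs closer than `threshold`."""
    close, total = 0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if labels[i] != labels[j]:
                total += 1
                if cosine_distance(embeddings[i], embeddings[j]) < threshold:
                    close += 1
    return close / total if total else 0.0

# Hypothetical data: two near-identical embeddings carry different labels.
embeddings = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]]
labels = ["rush", "heal", "rush"]
fraction = collapse_fraction(embeddings, labels)
```

A fraction near zero would suggest the embedding separates the labelled behaviors; a high fraction would point toward the collapse described above. The threshold would need to be calibrated per embedding model.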

Figures

Figures reproduced from arXiv: 2505.06617 by Alain Jaquier, Florian Turati, Franck Legendre, Meriem Elhosni, Noah Syrkis, Sebastian Risi, Timothée Anne.

Figure 1. GAME’s core idea is the iterative illumination of an adversarial problem using MTMB-ME [21], switching sides at each generation to promote arms race dynamics. Here, each letter corresponds to a different solution. Evolving both sides follows the artificial life goal of creating an open-ended evolutionary process through adversarial coevolution [15], [16]. For example, the POET algorithm [17], [18] coevolv…
Figure 2. GAME is an adversarial coevolutionary QD algorithm that alternates the illumination of an adversarial problem using MTMB-ME [21], switching sides at each generation to promote arms race dynamics. A key feature is the ability to use a VEM as a domain-agnostic behavior space. These methods only optimize one side of the adversarial problem, significantly limiting illumination. Our second and third case studie…
Figure 3. GAME’s illumination of a multi-agent adversarial game. The point cloud is a 2D PCA projection (22% and 10% explained variance) of the intergenerational tournament between elites found for one run of GAME across 20 generations with 100 000 evaluations per generation. We display timed snapshots of eight duels exhibiting different behaviors. We also indicate the fitness of both sides, represented as the perce…
Figure 4. Variants and ablations comparisons. The solid line is the median, and the shaded area the min and max of 3 replications over 20 generations. (a–b) Larger BTs do not necessarily lead to higher complexity, as GAME-SO has the 2nd-highest behavior complexity but the 4th-highest BT size. (c–d) GAME requires the VEM to reach the highest diversity and QD-Score. (e) Removing bootstrapping leads to a constant disco…
Figure 5. PCA projections of the intergenerational tournaments’ behaviors for each replication of each variant. GAME variants with a VEM show the most homogeneous coverage with less variance between replications. c) Quality: One limitation of computing quality directly from fitness is that it can be high because the opposing side is weak, not because the winning side is strong. To measure a less ambiguous quality of…
Figure 6. Tournament ELO score between each replication’s 10 best solutions. GAME-SO is significantly better than all variants but Quality-only (p-value < 0.001 with a Mann-Whitney U test and Bonferroni correction). d) Open-endedness: Coverage measures the volume of novelty discovered by GAME with the VEM. One limitation is that this behavior is dependent on two solutions. Another limitation is that visual behavior …
Figure 7. Parabellum global trends: Percentage of elites using (a) the Attack atomic with different targets and (b) the Go_to atomic for different thresholds through the generations. The solid line represents the median, and the shaded area between the min and max of GAME-MO’s three replications. 0.963 (p-value < 0.001) for the 998 Red elites and 0.961 (p-value < 0.001) for the 999 Blue elites. This high correlation…
Figure 8. Arms race example: Percentage of Red elites using the Attack atomic on the weakest enemy archer and Blue elites using the Heal atomic on random ally archers through the generations for GAME-SO’s three replications. For replication 2, as more Red elites targeted the Blue archers, Blue elites evolved to heal their archers. i) An example of arms race: We found an arms race in one of GAME-SO’s replications ( …
Figure 9. Wrestling illumination with sampled snapshots. The central point cloud is the 2D PCA of the visual embedding of one illumination from GAME of the Wrestling adversarial problem (capturing 13.4% and 8.4% of the variance). The outer edge snapshots show examples of frames captured at the end of a diverse set of evaluations. The PCA captures slow-moving robots on the right and faster-moving robots on the left, …
Figure 10. Wrestling visual diversity through generations. GAME leads to higher coverage than GAME-CVT and Random, but lower than MTMB-ME. The solid line is the median, and the shaded area shows the range from the minimum to the maximum across three replications.
Figure 11. Wrestling best morphologies’ quality. ELO score from a tournament between the 10 best morphologies of each side of each variant’s replications. GAME and GAME-CVT find better morphologies than MTMB-ME and Random (p-value < 0.001 from a Mann–Whitney U test).
Figure 12. Wrestling 2D PCA of the visual embedding with velocity. A single PCA is used for all replications of all variants. GAME and GAME-CVT discover morphologies with higher velocities, which are present more on the left side, suggesting that the VEM embedding PCA captures velocity.
Figure 14. Morphological species through generations. (a–b) Morphology clustering into species from all generations using k-modes and UMAP. The numbers indicate the selected elite for each cluster in each generation. (c–d) Average ELO score of each species through generations. GAME discovers all species in the first generation and then improves their average fitness. problem implementation does not appear open-ended…
Figure 15. Robustness to the number of frames used by the VEM. The VEM behaviors are consistent when at least 5 frames are used and significantly differ from random selection. The solid line is the median, and the shaded area shows the range from the first and third quantiles across 20 replications.
Figure 16. Hearthbreaker intergenerational tournament. ELO score of each generation’s elites from GAME on the five pairs of classes in Hearthbreaker in an intergenerational tournament. GAME leads to an overall improvement in quality at each generation for both sides and all five pairs of classes.
Figure 17. Hearthbreaker quality comparison between GAME and ME. GAME leads to better top-50 elites than ME for 7 out of 10 comparisons and is not significantly better for the other 3. (Significance for Mann–Whitney U test: ∗ = p < 0.05, ∗∗ = p < 0.01, ∗∗∗ = p < 0.001, and ns = not significant.)
Figure 18. Hearthbreaker archive comparison between GAME and ME against the starting decks. The color scale corresponds to the average fitness of the evolved decks against the opponent starting deck for 50 duels. ME yields denser, larger coverage by focusing on a single task, whereas GAME illuminates 50 tasks simultaneously. Still, GAME finds solutions with similar quality to ME, even though it was not searching for…
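
Several captions above report tournament ELO scores. As a reading aid, the sketch below shows the standard textbook Elo update; it is an assumption that the paper uses something of this family, and the scores in the captions span thousands of points, so the paper's scale or K-factor evidently differs from the defaults chosen here. Function names and the starting rating of 0 are hypothetical.

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_tournament(players, results, k=32.0, start=0.0):
    """Apply Elo updates in order.

    `results` is a list of (a, b, score_a) with score_a in {1.0, 0.5, 0.0}.
    """
    ratings = {p: start for p in players}
    for a, b, score_a in results:
        e_a = expected_score(ratings[a], ratings[b])
        ratings[a] += k * (score_a - e_a)
        # B's update mirrors A's, so the total rating is conserved.
        ratings[b] += k * ((1.0 - score_a) - (1.0 - e_a))
    return ratings

# Hypothetical two-player example: "x" beats "y" once.
ratings = elo_tournament(["x", "y"], [("x", "y", 1.0)])
```

Because each update is zero-sum, a rating is only meaningful relative to the other entrants of the same tournament, which matches how the captions compare variants within one tournament.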
Original abstract

Quality-Diversity (QD) algorithms seek to discover diverse, high-performing solutions across a behavior space, in contrast to conventional optimization methods that target a single optimum. Adversarial problems present unique challenges for QD approaches, as the competing nature of opposing sides creates interdependencies that complicate the evolution process. Existing QD methods applied to such scenarios typically fix one side, constraining the open-endedness. We present Generational Adversarial MAP-Elites (GAME), a coevolutionary QD algorithm that evolves both sides by alternating which side is evolved at each generation. By integrating a vision embedding model (VEM), our approach eliminates the need for domain-specific behavior descriptors and instead operates on video. We validate GAME across three distinct adversarial domains: a multi-agent battle game, a soft-robot wrestling environment, and a deck building game. We validate that all its components are necessary, that the VEM is effective in two different domains, and that GAME finds better solutions than one-sided QD baselines. Our experiments reveal several evolutionary phenomena, including arms race-like dynamics, enhanced novelty through generational extinction, and the preservation of neutral mutations as crucial stepping stones toward the highest performance. While GAME successfully illuminates all three adversarial problems, its capacity for truly open-ended discovery remains constrained by the nature of the search spaces used in this paper. These findings show GAME's broad applicability and highlight opportunities for future research into open-ended adversarial coevolution. Code and videos are available at: https://github.com/Timothee-ANNE/GAME

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Generational Adversarial MAP-Elites (GAME), a coevolutionary quality-diversity algorithm that alternates generations between evolving two opposing populations in adversarial domains. It integrates a pre-trained vision embedding model (VEM) to map raw video observations to a behavior space, thereby avoiding hand-crafted domain-specific descriptors. The approach is evaluated on three adversarial domains—a multi-agent battle game, a soft-robot wrestling environment, and a deck-building game—with claims that all components are necessary, the VEM is effective in two domains, GAME outperforms one-sided QD baselines, and the method reveals evolutionary phenomena including arms-race dynamics, enhanced novelty via generational extinction, and the role of neutral mutations as stepping stones.

Significance. If the empirical claims hold under rigorous scrutiny, this work advances quality-diversity optimization into coevolutionary adversarial settings, where interdependencies between sides have previously limited open-ended illumination. The VEM integration offers a path toward more generalizable behavior descriptors in video-based domains, and the reported phenomena could inform models of open-ended evolution. The paper itself notes that truly open-ended discovery remains constrained by the chosen search spaces, which tempers the broader implications.

major comments (2)
  1. [Experiments / VEM integration section] The central claim that the VEM supplies a meaningful, generalizable behavior space for adversarial illumination (eliminating domain-specific descriptors) lacks a direct quantitative validation. No correlation analysis is presented between VEM distances and task-relevant features such as win rates, strategic metrics, or interaction outcomes; nor is there an ablation replacing VEM with random projections or low-level visual statistics to test whether the archive reflects functional rather than superficial diversity. This is load-bearing for the reported arms-race dynamics and component necessity, as misalignment here would mean the illumination metric does not track the adversarial objective.
  2. [Results / Ablation studies] The assertion that all components are necessary and that GAME finds better solutions than one-sided QD baselines is supported by validation statements, but the results lack detailed ablation tables with quantitative metrics, error bars, and statistical tests. For instance, performance drops when removing generational alternation or the adversarial coevolution loop are not quantified with effect sizes or significance levels across the three domains.
minor comments (2)
  1. [Abstract] The abstract summarizes success and component necessity but omits any specific quantitative results, error bars, or key performance deltas; including one or two headline metrics would improve clarity for readers.
  2. [Figures] Figure captions and axis labels in the experimental results could be expanded to explicitly state what is being compared (e.g., archive coverage vs. performance) and whether error bars represent standard deviation or standard error.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which helps us strengthen the presentation of our results. We address each major comment below and will incorporate the suggested improvements in a revised manuscript.

  1. Referee: [Experiments / VEM integration section] The central claim that the VEM supplies a meaningful, generalizable behavior space for adversarial illumination (eliminating domain-specific descriptors) lacks a direct quantitative validation. No correlation analysis is presented between VEM distances and task-relevant features such as win rates, strategic metrics, or interaction outcomes; nor is there an ablation replacing VEM with random projections or low-level visual statistics to test whether the archive reflects functional rather than superficial diversity. This is load-bearing for the reported arms-race dynamics and component necessity, as misalignment here would mean the illumination metric does not track the adversarial objective.

    Authors: We agree that a more direct quantitative validation of the VEM would strengthen the manuscript. While the current experiments show that GAME with the VEM produces superior performance and interpretable dynamics in two domains (suggesting the embedding captures task-relevant variation), we did not include explicit correlation analyses or ablations against random or low-level baselines. In the revision we will add: (1) correlation coefficients between VEM distances and domain-specific metrics such as win rates and strategic indicators, and (2) ablation experiments replacing the VEM with random projections and basic visual statistics, reporting the resulting archive quality and evolutionary dynamics. These additions will directly test whether the behavior space reflects functional rather than superficial diversity. revision: yes

  2. Referee: [Results / Ablation studies] The assertion that all components are necessary and that GAME finds better solutions than one-sided QD baselines is supported by validation statements, but the results lack detailed ablation tables with quantitative metrics, error bars, and statistical tests. For instance, performance drops when removing generational alternation or the adversarial coevolution loop are not quantified with effect sizes or significance levels across the three domains.

    Authors: We acknowledge that the ablation results would benefit from more granular quantitative reporting. The manuscript already demonstrates performance differences when components are ablated, supporting the necessity claims, yet these are presented without full tables, error bars, or statistical tests. In the revised version we will expand the results section with comprehensive ablation tables for all three domains. Each table will report mean performance and standard deviation across independent runs, include error bars on the corresponding figures, and provide statistical significance tests (e.g., paired t-tests or Wilcoxon rank-sum tests) together with effect sizes to quantify the impact of removing generational alternation or the coevolutionary loop. revision: yes
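
The correlation analysis promised in the first response could take the form of a rank correlation between pairwise embedding distances and differences in a task metric. The sketch below is one possible shape for it, written in plain Python; the choice of Spearman correlation, the example numbers, and the function names are assumptions, since neither the report nor the rebuttal specifies the statistic.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical inputs: embedding distance and win-rate gap for four pairs of duels.
vem_distance = [0.1, 0.4, 0.2, 0.9]
win_rate_gap = [1.0, 8.0, 3.0, 20.0]
rho = spearman(vem_distance, win_rate_gap)
```

A coefficient close to 1 would indicate that duels far apart in the embedding also differ in outcome, which is the alignment the referee asks about; a coefficient near 0 would suggest the archive tracks superficial visual variation.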

Circularity Check

0 steps flagged

No significant circularity detected in derivation or validation chain

The paper describes an algorithmic extension of MAP-Elites for coevolutionary adversarial settings, with the core procedure (alternating generational evolution of opposing sides) and integration of an external VEM for behavior descriptors presented as a direct construction rather than a derived prediction. Validation proceeds via explicit comparisons to one-sided QD baselines across three domains, component ablations, and reported evolutionary phenomena, all of which are externally falsifiable against the stated experimental setups and do not reduce to fitted parameters or self-referential definitions. No load-bearing step equates an output to its input by construction, and the VEM is treated as an off-the-shelf input whose effectiveness is checked empirically rather than assumed via prior self-citation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper builds upon the established MAP-Elites algorithm and coevolutionary concepts but introduces the generational alternation and VEM usage as new elements. No new physical entities or ad-hoc parameters are mentioned in the abstract.

axioms (1)
  • Domain assumption: evolutionary algorithms can discover diverse high-performing solutions through selection and variation operators.
    Core to all QD methods including MAP-Elites.

pith-pipeline@v0.9.0 · 5821 in / 1414 out tokens · 119980 ms · 2026-05-22T16:32:15.416911+00:00 · methodology

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages · 3 internal anchors

  1. [1]

    Quality diversity: A new frontier for evolutionary computation,

    J. K. Pugh, L. B. Soros, and K. O. Stanley, “Quality diversity: A new frontier for evolutionary computation,”Frontiers in Robotics and AI, vol. 3, p. 40, 2016

  2. [2]

    Robots that can adapt like animals,

    A. Cully, J. Clune, D. Tarapore, and J.-B. Mouret, “Robots that can adapt like animals,”Nature, vol. 521, no. 7553, pp. 503–507, 2015

  3. [3]

    Procedural content generation through quality diversity,

    D. Gravina, A. Khalifa, A. Liapis, J. Togelius, and G. N. Yannakakis, “Procedural content generation through quality diversity,” in2019 IEEE Conference on Games (CoG). IEEE, 2019, pp. 1–8

  4. [4]

    An artificial intelligence enabled chemical synthesis robot for explo- ration and optimization of nanomaterials,

    Y . Jiang, D. Salley, A. Sharma, G. Keenan, M. Mullin, and L. Cronin, “An artificial intelligence enabled chemical synthesis robot for explo- ration and optimization of nanomaterials,”Science advances, vol. 8, no. 40, p. eabo2626, 2022

  5. [5]

    Bayesian quality-diversity approaches for constrained optimization problems with mixed continuous, discrete and categorical variables,

    L. Brevault and M. Balesdent, “Bayesian quality-diversity approaches for constrained optimization problems with mixed continuous, discrete and categorical variables,”Engineering Applications of Artificial Intel- ligence, vol. 133, p. 108118, 2024

  6. [6]

    T. C. Schelling,The Strategy of Conflict: with a new Preface by the Author. Harvard university press, 1980

  7. [7]

    Baldwin, M

    R. Baldwin, M. Cave, and M. Lodge,Understanding regulation: theory, strategy, and practice. Oxford university press, 2011

  8. [8]

    Adversarial Attacks and Defences: A Survey

    A. Chakraborty, M. Alam, V . Dey, A. Chattopadhyay, and D. Mukhopad- hyay, “Adversarial attacks and defences: A survey,”arXiv preprint arXiv:1810.00069, 2018

  9. [9]

    Arms race in adversarial malware detection: A survey,

    D. Li, Q. Li, Y . Ye, and S. Xu, “Arms race in adversarial malware detection: A survey,”ACM Computing Surveys (CSUR), vol. 55, no. 1, pp. 1–35, 2021

  10. [10]

    Hearthstone: Heroes of warcraft,

    Blizzard Entertainment, “Hearthstone: Heroes of warcraft,” 2014, digital collectible card game. [Online]. Available: https://playhearthstone.com/

  11. [11]

    Mapping hearthstone deck spaces through map- elites with sliding boundaries,

    M. C. Fontaine, S. Lee, L. B. Soros, F. de Mesentier Silva, J. Togelius, and A. K. Hoover, “Mapping hearthstone deck spaces through map- elites with sliding boundaries,” inProceedings of The Genetic and Evolutionary Computation Conference, 2019, pp. 161–169

  12. [12]

    Covari- ance matrix adaptation for the rapid illumination of behavior space,

    M. C. Fontaine, J. Togelius, S. Nikolaidis, and A. K. Hoover, “Covari- ance matrix adaptation for the rapid illumination of behavior space,” inProceedings of the 2020 genetic and evolutionary computation conference, 2020, pp. 94–102

  13. [13]

    Rainbow teaming: Open-ended generation of diverse adversarial prompts,

    M. Samvelyan, S. Raparthy, A. Lupu, E. Hambro, A. H. Markosyan, M. Bhatt, Y . Mao, M. Jiang, J. Parker-Holder, J. Foerster, T. Rocktaschel, and R. Raileanu, “Rainbow teaming: Open-ended generation of diverse adversarial prompts,”ArXiv, vol. abs/2402.16822, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusId:268031888 14 Preprint. IEEE Tran...

  14. [14]

    Arms races between and within species,

    R. Dawkins and J. R. Krebs, “Arms races between and within species,” Proceedings of the Royal Society of London. Series B. Biological Sciences, vol. 205, no. 1161, pp. 489–511, 1979

  15. [15]

    Open problems in artificial life,

    M. A. Bedau, J. S. McCaskill, N. H. Packard, S. Rasmussen, C. Adami, D. G. Green, T. Ikegami, K. Kaneko, and T. S. Ray, “Open problems in artificial life,”Artificial life, vol. 6, no. 4, pp. 363–376, 2000

  16. [16]

    What is artificial life today, and where should it go?

    A. Dorin and S. Stepney, “What is artificial life today, and where should it go?” pp. 1–15, 2024

  17. [17]

    Poet: open- ended coevolution of environments and their optimized solutions,

    R. Wang, J. Lehman, J. Clune, and K. O. Stanley, “Poet: open- ended coevolution of environments and their optimized solutions,” in Proceedings of the Genetic and Evolutionary Computation Conference, 2019, pp. 142–151

  18. [18]

    Enhanced poet: Open-ended reinforcement learning through unbounded invention of learning challenges and their solutions,

    R. Wang, J. Lehman, A. Rawal, J. Zhi, Y . Li, J. Clune, and K. Stanley, “Enhanced poet: Open-ended reinforcement learning through unbounded invention of learning challenges and their solutions,” inInternational conference on machine learning. PMLR, 2020, pp. 9940–9951

  19. [19]

    Exploring the evolution of gans through quality diversity,

    V . Costa, N. Lourenc ¸o, J. Correia, and P. Machado, “Exploring the evolution of gans through quality diversity,” inProceedings of the 2020 genetic and evolutionary computation conference, 2020, pp. 297–305

  20. [20]

    Quality-diversity self-play: Open-ended strategy innovation via foundation models,

    A. Dharna, C. Lu, and J. Clune, “Quality-diversity self-play: Open-ended strategy innovation via foundation models,” inNeurIPS 2024 Workshop on Open-World Agents, 2024

  21. [21]

    Multi-task multi-behavior map-elites,

    T. Anne and J.-B. Mouret, “Multi-task multi-behavior map-elites,” in Proceedings of the Companion Conference on Genetic and Evolutionary Computation, 2023, pp. 111–114

  22. [22]

    Learning transferable visual models from natural language supervision,

    A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clarket al., “Learning transferable visual models from natural language supervision,” inInternational conference on machine learning. PmLR, 2021, pp. 8748–8763

  23. [23]

    A comparison of illumination algorithms in unbounded spaces,

    V . Vassiliades, K. Chatzilygeroudis, and J.-B. Mouret, “A comparison of illumination algorithms in unbounded spaces,” inProceedings of the Genetic and Evolutionary Computation Conference Companion, 2017, pp. 1578–1581

  24. [24]

    Generational adversarial map-elites for multi-agent game illumination,

    T. Anne, N. Syrkis, M. Elhosni, F. Turati, F. Legendre, A. Jaquier, and S. Risi, “Generational adversarial map-elites for multi-agent game illumination,”Accepted for presentation at ALIFE ’25, Kyoto, Japan, 2025

  25. [25]

    Harnessing language for coordination: A framework and bench- mark for llm-driven multi-agent control,

    ——, “Harnessing language for coordination: A framework and bench- mark for llm-driven multi-agent control,”IEEE Transactions on Games, 2025

  26. [26]

    Evolution gym: A large-scale benchmark for evolving soft robots,

    J. Bhatia, H. Jackson, Y . Tian, J. Xu, and W. Matusik, “Evolution gym: A large-scale benchmark for evolving soft robots,”Advances in Neural Information Processing Systems, vol. 34, pp. 2201–2214, 2021

  27. [27]

    Hearthbreaker: A hearthstone simulator,

    Y . Danielet al., “Hearthbreaker: A hearthstone simulator,” 2014. [Online]. Available: https://github.com/danielyule/hearthbreaker

  28. [28]

    Challenges in coevolutionary learning: Arms-race dynamics, open-endedness, and mediocre stable states,

    S. G. Ficici and J. B. Pollack, “Challenges in coevolutionary learning: Arms-race dynamics, open-endedness, and mediocre stable states,” in Proceedings of the sixth international conference on Artificial life. MIT Press Cambridge, MA, 1998, pp. 238–247

  29. [29]

    Evolving complexity in prediction games,

    N. Moran and J. Pollack, “Evolving complexity in prediction games,” Artificial Life, vol. 25, no. 1, pp. 74–91, 2019

  30. [30]

    Escalation of memory length in finite populations,

    K. Harrington and J. Pollack, “Escalation of memory length in finite populations,”Artificial life, vol. 25, no. 1, pp. 22–32, 2019

  31. [31]

    Minimal criterion coevolution: a new approach to open-ended search,

    J. C. Brant and K. O. Stanley, “Minimal criterion coevolution: a new approach to open-ended search,” inProceedings of the Genetic and Evolutionary Computation Conference, 2017, pp. 67–74

  32. [32]

    Coevolution of neural networks for agents and environments,

    E. Chigot and D. G. Wilson, “Coevolution of neural networks for agents and environments,” inProceedings of the Genetic and Evolutionary Computation Conference Companion, 2022, pp. 2306–2309

  33. [33]

    Omni-epic: Open- endedness via models of human notions of interestingness with environ- ments programmed in code,

    M. Faldor, J. Zhang, A. Cully, and J. Clune, “Omni-epic: Open- endedness via models of human notions of interestingness with environ- ments programmed in code,” inThe Thirteenth International Conference on Learning Representations, 2024

  34. [34]

    Grand- master level in starcraft ii using multi-agent reinforcement learning,

    O. Vinyals, I. Babuschkin, W. M. Czarnecki, M. Mathieu, A. Dudzik, J. Chung, D. H. Choi, R. Powell, T. Ewalds, P. Georgievet al., “Grand- master level in starcraft ii using multi-agent reinforcement learning,” nature, vol. 575, no. 7782, pp. 350–354, 2019

  35. [35]

    Emergent tool use from multi-agent autocurricula,

    B. Baker, I. Kanitscheider, T. Markov, Y . Wu, G. Powell, B. McGrew, and I. Mordatch, “Emergent tool use from multi-agent autocurricula,” in International conference on learning representations, 2019

  36. [36]

    Evolving a diversity of virtual creatures through novelty search and local competition,

    J. Lehman and K. O. Stanley, “Evolving a diversity of virtual creatures through novelty search and local competition,” in Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, 2011, pp. 211–218

  37. [37]

    Illuminating search spaces by mapping elites

    J.-B. Mouret and J. Clune, “Illuminating search spaces by mapping elites,” arXiv preprint arXiv:1504.04909, 2015

  38. [38]

    Quality diversity for multi-task optimization,

    J.-B. Mouret and G. Maguire, “Quality diversity for multi-task optimization,” in Proceedings of the 2020 Genetic and Evolutionary Computation Conference, 2020, pp. 121–129

  39. [39]

    Illuminating the space of beatable lode runner levels produced by various generative adversarial networks,

    K. Steckel and J. Schrum, “Illuminating the space of beatable lode runner levels produced by various generative adversarial networks,” in Proceedings of the genetic and evolutionary computation conference companion, 2021, pp. 111–112

  40. [40]

    Quality diversity imitation learning,

    Z. Wan, X. Yu, D. M. Bossens, Y. Lyu, Q. Guo, F. X. Fan, and I. Tsang, “Quality diversity imitation learning,” arXiv preprint arXiv:2410.06151, 2024

  41. [41]

    Multi-agent diagnostics for robustness via illuminated diversity,

    M. Samvelyan, D. Paglieri, M. Jiang, J. Parker-Holder, and T. Rocktäschel, “Multi-agent diagnostics for robustness via illuminated diversity,” arXiv preprint arXiv:2401.13460, 2024

  42. [42]

    Automating the search for artificial life with foundation models,

    A. Kumar, C. Lu, L. Kirsch, Y. Tang, K. O. Stanley, P. Isola, and D. Ha, “Automating the search for artificial life with foundation models,” arXiv preprint arXiv:2412.17799, 2024

  43. [43]

    Using Centroidal Voronoi Tessellations to Scale Up the Multi-dimensional Archive of Phenotypic Elites Algorithm

    V. Vassiliades, K. Chatzilygeroudis, and J.-B. Mouret, “Scaling up map-elites using centroidal voronoi tessellations,” arXiv preprint arXiv:1610.05729, 2016

  44. [44]

    Autonomous skill discovery with quality-diversity and unsupervised descriptors,

    A. Cully, “Autonomous skill discovery with quality-diversity and unsupervised descriptors,” in Proceedings of the Genetic and Evolutionary Computation Conference, 2019, pp. 81–89

  45. [45]

    Dominated novelty search: Rethinking local competition in quality-diversity,

    R. Bahlous-Boldi, M. Faldor, L. Grillotti, H. Janmohamed, L. Coiffard, L. Spector, and A. Cully, “Dominated novelty search: Rethinking local competition in quality-diversity,” in Proceedings of the Genetic and Evolutionary Computation Conference, 2025, pp. 104–112

  46. [46]

    JAX: composable transformations of Python+NumPy programs,

    J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang, “JAX: composable transformations of Python+NumPy programs,” 2018

  47. [47]

    Behavior trees in robotics and AI: An introduction,

    M. Colledanchise and P. Ögren, Behavior trees in robotics and AI: An introduction. CRC Press, 2018

  48. [48]

    Learning behavior trees with genetic programming in unpredictable environments,

    M. Iovino, J. Styrud, P. Falco, and C. Smith, “Learning behavior trees with genetic programming in unpredictable environments,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 4591–4597

  49. [49]

    A quality-diversity approach to evolving a repertoire of diverse behaviour-trees in robot swarms,

    K. Montague, E. Hart, G. Nitschke, and B. Paechter, “A quality-diversity approach to evolving a repertoire of diverse behaviour-trees in robot swarms,” in International Conference on the Applications of Evolutionary Computation. Springer, 2023, pp. 145–160

  50. [50]

    Evolving neural networks through augmenting topologies,

    K. O. Stanley and R. Miikkulainen, “Evolving neural networks through augmenting topologies,” Evolutionary Computation, vol. 10, no. 2, pp. 99–127, 2002

  51. [51]

    A. E. Elo, The Rating of Chessplayers, Past and Present. Arco Publishing, 1978

  52. [52]

    The neutral theory of molecular evolution,

    M. Kimura, “The neutral theory of molecular evolution,” Scientific American, vol. 241, no. 5, pp. 98–129, 1979

  53. [53]

    Enhancing divergent search through extinction events,

    J. Lehman and R. Miikkulainen, “Enhancing divergent search through extinction events,” in Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, 2015, pp. 951–958

  54. [54]

    C. G. Langton, Artificial life: An overview. MIT Press, 1997

  55. [55]

    Evolving 3d morphology and behavior by competition,

    K. Sims, “Evolving 3d morphology and behavior by competition,” Artificial Life, vol. 1, no. 4, pp. 353–372, 1994

  56. [56]

    Unshackling evolution: evolving soft robots with multiple materials and a powerful generative encoding,

    N. Cheney, R. MacCurdy, J. Clune, and H. Lipson, “Unshackling evolution: evolving soft robots with multiple materials and a powerful generative encoding,” ACM SIGEVOlution, vol. 7, no. 1, pp. 11–23, 2014

  57. [57]

    A minimal developmental model can increase evolvability in soft robots,

    S. Kriegman, N. Cheney, F. Corucci, and J. C. Bongard, “A minimal developmental model can increase evolvability in soft robots,” in Proceedings of the Genetic and Evolutionary Computation Conference, 2017, pp. 131–138

  58. [58]

    Scalable co-optimization of morphology and control in embodied machines,

    N. Cheney, J. Bongard, V. SunSpiral, and H. Lipson, “Scalable co-optimization of morphology and control in embodied machines,” Journal of The Royal Society Interface, vol. 15, no. 143, p. 20170937, 2018

  59. [59]

    Modular controllers facilitate the co-optimization of morphology and control in soft robots,

    A. Mertan and N. Cheney, “Modular controllers facilitate the co-optimization of morphology and control in soft robots,” in Proceedings of the Genetic and Evolutionary Computation Conference, 2023, pp. 174–183

  60. [60]

    Enhancing adaptability in embodied agents: A multi-quality-diversity approach,

    G. Nadizar, E. Medvet, and D. G. Wilson, “Enhancing adaptability in embodied agents: A multi-quality-diversity approach,” IEEE Transactions on Evolutionary Computation, 2025

  61. [61]

    Intelligence without representation,

    R. A. Brooks, “Intelligence without representation,” Artificial Intelligence, vol. 47, no. 1-3, pp. 139–159, 1991

  62. [62]

    How the body shapes the way we think: a new view of intelligence,

    R. Pfeifer and J. Bongard, How the body shapes the way we think: a new view of intelligence. MIT Press, 2006

  63. [63]

    Monte carlo tree search experiments in hearthstone,

    A. Santos, P. A. Santos, and F. S. Melo, “Monte carlo tree search experiments in hearthstone,” in 2017 IEEE Conference on Computational Intelligence and Games (CIG). IEEE, 2017, pp. 272–279