LLM-guided evolutionary search yields the first domain-independent C++ planning heuristics that exceed the strongest hand-engineered baselines on coverage and speed trade-offs across unseen domains.
Illuminating search spaces by mapping elites
Mixed citation behavior. Most common role is background (64%).
abstract
Many fields use search algorithms, which automatically explore a search space to find high-performing solutions: chemists search through the space of molecules to discover new drugs; engineers search for stronger, cheaper, safer designs; scientists search for models that best explain data, etc. The goal of search algorithms has traditionally been to return the single highest-performing solution in a search space. Here we describe a new, fundamentally different type of algorithm that is more useful because it provides a holistic view of how high-performing solutions are distributed throughout a search space. It creates a map of high-performing solutions at each point in a space defined by dimensions of variation that a user gets to choose. This Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) algorithm illuminates search spaces, allowing researchers to understand how interesting attributes of solutions combine to affect performance, either positively or, equally of interest, negatively. For example, a drug company may wish to understand how performance changes as the size of molecules and their cost-to-produce vary. MAP-Elites produces a large diversity of high-performing, yet qualitatively different solutions, which can be more helpful than a single, high-performing solution. Interestingly, because MAP-Elites explores more of the search space, it also tends to find a better overall solution than state-of-the-art search algorithms. We demonstrate the benefits of this new algorithm in three different problem domains ranging from producing modular neural networks to designing simulated and real soft robots. Because MAP-Elites (1) illuminates the relationship between performance and dimensions of interest in solutions, (2) returns a set of high-performing, yet diverse solutions, and (3) improves finding a single, best solution, it will advance science and engineering.
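The abstract describes the map only in words, so a minimal sketch of the commonly described MAP-Elites loop may help: keep one elite per cell of a user-chosen descriptor grid, and replace it only when a new solution landing in the same cell scores higher. Everything concrete here is an assumption for illustration and not taken from the paper: the names `map_elites`, `toy_fitness`, and `toy_descriptor`, the Gaussian mutation, the unit-interval encoding of solutions and descriptors, and all parameter defaults.

```python
import random


def toy_fitness(x):
    # Hypothetical objective: higher is better, peak at the centre of the unit cube.
    return -sum((v - 0.5) ** 2 for v in x)


def toy_descriptor(x):
    # Hypothetical "dimensions of variation": the first two coordinates, each in [0, 1].
    return (x[0], x[1])


def map_elites(fitness, descriptor, bins, dim,
               iterations=5000, init=100, sigma=0.1, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell index tuple -> (fitness, solution)

    def cell_of(x):
        # Discretize each descriptor value in [0, 1] into one of `bins` cells.
        return tuple(min(int(b * bins), bins - 1) for b in descriptor(x))

    for i in range(iterations):
        if i < init or not archive:
            # Bootstrap phase: sample random solutions.
            x = [rng.random() for _ in range(dim)]
        else:
            # Pick a random elite and perturb it, clamping to [0, 1].
            _, parent = rng.choice(list(archive.values()))
            x = [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in parent]
        f = fitness(x)
        c = cell_of(x)
        # Keep the candidate only if its cell is empty or it beats the current elite.
        if c not in archive or f > archive[c][0]:
            archive[c] = (f, x)
    return archive


if __name__ == "__main__":
    elites = map_elites(toy_fitness, toy_descriptor, bins=10, dim=4)
    best_f, _ = max(elites.values(), key=lambda e: e[0])
    print(f"filled cells: {len(elites)} / 100, best fitness: {best_f:.4f}")
```

The returned dictionary is the "map": reading the fitness stored in each cell shows how performance varies across the chosen descriptor dimensions, while the maximum over all cells gives the single best solution found.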
representative citing papers
Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.
No continuous utility-preserving input wrapper can eliminate all prompt injection risks in connected prompt spaces for language models.
Language-model-guided program synthesis can approximate transformer attention heads with over 75% IoU fidelity on held-out data and allow replacing 25% of heads with only 16% average perplexity increase.
Presents a query-complexity framework for genetic algorithms with guided operators and shows necessity of multiple operators and tight bounds for diversity in solution pools.
FML-Bench shows a simple greedy hill-climber nearly matches tree search on dense-opportunity tasks while an adaptive agent that broadens search on stagnation outperforms six baselines across 18 tasks.
DRSR uses Quality-Diversity to produce diverse symbolic regression expressions differing in residual distributions, enabling post-search selection on synthetic and astronomical data.
FrontierSmith automates synthesis of open-ended coding problems from closed-ended seeds and shows measurable gains on two open-ended LLM coding benchmarks.
PPol uses LLM-driven evolutionary program search to create diverse human-like user personas for simulators, yielding 33-62% fitness gains and +17% agent task success on retail and airline domains.
EvoPref applies NSGA-II evolutionary optimization with archive-based diversity to populations of LoRA adapters, yielding 18% higher preference coverage and 47% lower collapse than gradient descent baselines while matching alignment quality.
EvolveSignal applies LLM-driven evolutionary program synthesis to discover heuristic variations of traffic signal control logic that reduce delay and stops compared to Webster's method in simulation.
Meta Agent Search uses a meta-agent to iteratively program novel agentic systems in code, producing agents that outperform state-of-the-art hand-designed ones across coding, science, and math while transferring across domains and models.
EvoPrompt uses LLMs to run evolutionary operators on populations of prompts, outperforming human-engineered prompts by up to 25% on BIG-Bench Hard tasks across 31 datasets.
Phenotypic distance from output differences on fixed inputs enables surrogate models that predict performance of variable-topology neural networks as well as or better than weight-based models on fixed topologies in a robotic navigation task.
External evolution beats internal deliberation in collective-action tasks with statistical significance but neither helps in trading, and deliberation never discovers punishment while evolution does.
Autopoiesis uses LLM-driven program synthesis to evolve serving policies online during deployment, delivering up to 53% and average 34% gains over prior LLM serving systems under runtime dynamics.
LensAgent is a training-free LLM agent framework that reconstructs mass distributions in SLACS strong lensing systems to extract sub-galactic substructures.
AlphaEvolve is an LLM-orchestrated evolutionary coding agent that discovered a 4x4 complex matrix multiplication algorithm using 48 scalar multiplications, the first improvement over Strassen's algorithm in 56 years, plus optimizations for Google data centers and hardware.
Mastermind's dual-loop planner learns transferable strategies via SFT and milestone GRPO, raising GPT-5.5 executor pass rate on 200 held-out CyberGym tasks from 60% to 84.5%.
MFEA-CoD coordinates novelty search tasks with repulsion and adaptive transfer to collaboratively discover diverse novel solutions across synthetic, maze, MuJoCo, and generative problems.
Heuresis evaluates six search strategies for autonomous ML research agents and finds that novel ideas are rare, none rated original, and only one reaches top-10 quality while strategies steer axes but do not expand the quality-novelty frontier.
AIChilles finds 49 distinct hidden weaknesses across 30 AI-evolved programs in five applications by combining workload extraction, agent-based constraint inference, differential oracles, and coverage to expose regressions.
SRC is a fixed-horizon branch review framework for imitation learning in resettable web environments that collects 977 verifier-passing trajectories and 9,183 next-action examples while improving recovery-versus-query tradeoff over step-level review.
SV-QD-RL couples actor structure with branch-specific value learning via structure-conditioned actor-critic branches to generate diverse high-quality policy repertoires in QD-RL.
citing papers explorer
- Uncovering Linguistic Fragility in Vision-Language-Action Models via Diversity-Aware Red Teaming
DAERT generates diverse adversarial instructions via a uniform policy in RL to drop VLA task success rates from 93.33% to 5.85% on benchmarks with models like π0 and OpenVLA.
- Self-Evolving Agents with Anytime-Valid Certificates
SEA architecture gates self-modifications via anytime-valid certificates on a frozen base model plus five verifier mechanisms, yielding +4 to +5 gains on a SWE-bench subset for two strong bases.
- TacEvo: Self-Evolving Architecture Discovery for Robotic Tactile Perception via LLM-Driven Quality-Diversity Search
TacEvo is an LLM-driven self-evolving search method that discovers neural architectures for robotic tactile force regression and grating classification, reporting fitness gains of 56.1% and 96.1% over 20 generations.
- The Red Queen Gödel Machine: Co-Evolving Agents and Their Evaluators
RQGM enables co-evolution of agents and evaluators across epochs with non-stationary utilities, reporting gains in coding pass rates, paper acceptance, and proof grading over prior self-improving agents.
- EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents
EEVEE introduces a router-based multi-dataset test-time prompt learning framework for LLM agents that uses router-prompt co-evolution to improve robustness on heterogeneous data streams.
- Generation of Diverse and Functional Robot Designs using Superquadrics Parametrisation and Quality-Diversity
Superquadrics parametrization combined with MAP-Elites produces the highest QD-score for diverse and functional robot designs across two test environments compared to CPPN and standard EA baselines.
- Quality-Diversity Search in Sound Generation: Investigating Innovation Engines for Audio Exploration
MAP-Elites with CPPNs, DSP graphs, and a deep classifier produces diverse synthetic sounds across durations and musical/non-musical contexts.
- Learning to Solve and Optimize by Evolving Code
CHECKMATE evolves correct high-performing solvers from formal specs and natural language descriptions, outperforming SOTA on configuration and scheduling problems.
- GEAR: Genetic AutoResearch for Agentic Code Evolution
GEAR applies genetic algorithms to maintain and evolve multiple research states in autonomous code agents, outperforming single-path baselines by continuing to discover improvements over extended runs.
- Evolving the Hearthstone Meta
An evolutionary algorithm searches for minimal card attribute changes in Hearthstone to balance deck win rates near 50%.
- CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing
CVEvolve is a zero-code LLM agent harness that autonomously discovers algorithms for scientific image tasks including registration, peak detection, and segmentation, reporting improvements over baselines via iterative search and holdout evaluation.
- Distributional Value Estimation Without Target Networks for Robust Quality-Diversity
QDHUAC is a distributional, target-free QD-RL method that enables stable high-UTD training and competitive performance on Brax locomotion tasks using far fewer environment steps than prior approaches.
- CVT Archives and Chemical Embedding Measures for Multi-Objective Quality Diversity in Molecular Design
CVT archives with learned chemical embeddings improve median global hypervolume and multi-objective quality diversity in NLO molecular design compared to grid-based archives.
- A Compositional Framework for Open-ended Intelligence
Open-ended intelligence is formalized as the compositional closure L(P,C) of primitives P under operators C, with next primitive prediction proposed as an objective to acquire reusable primitives and grammar for lifelong adaptation.
- An Empirical Audit of k-NAF Budget Accounting for Anchored Decoding
Empirical audit of k-NAF in Anchored Decoding finds budgets are not exhausted on tested workloads, with high proxy ratios attributable to small-sample artifacts.
- MadEvolve: Evolutionary Optimization of Trading Systems with Large Language Models
MadEvolve uses LLMs for evolutionary optimization of trading strategies and reports significant backtest improvements on Bitcoin tasks including signal feature evolution and joint strategy optimization.
- Artificial Adaptive Intelligence: The Missing Stage Between Narrow and General Intelligence
Proposes Artificial Adaptive Intelligence as the regime between narrow and general AI, defined by elimination of human-specified hyperparameters, and introduces an adaptivity index plus parametric minimality principle grounded in minimum description length.
- Effective Harness Engineering for Algorithm Discovery with Coding Agents
Under fixed token budget on Circle Packing, deeper per-candidate reasoning beats generating more shallow candidates, and capable models produce evaluation hacks at higher rates.
- Multi-Objective Evolutionary Design of Molecules with Enhanced Nonlinear Optical Properties
Evolutionary algorithms can discover molecules with improved nonlinear optical properties by simultaneously optimizing hyperpolarizability ratio, HOMO-LUMO gap, polarizability, and energy per atom.
- Cultivating Machine Intelligence: The OMEGA Shift from Top-Down Optimization to Autopoietic Cognitive Ecologies
The paper introduces the RECLAIM framework and OMEGA shift as a transition from top-down optimization to autopoietic cognitive ecologies for cultivating machine intelligence.
- The Many AI Challenges of Hearthstone
The paper surveys AI challenges in Hearthstone to illustrate the broader field of AI and games research through in-depth analysis of a single game.
- Spiking Neural Network Architecture Search: A Survey
A survey of Spiking Neural Network architecture search techniques viewed through a hardware/software co-design lens.
- Multi-Task Optimization over Networks of Tasks
- Beyond Single-Model Optimization: Preserving Plasticity in Continual Reinforcement Learning