ECo-MoE: Embodiment-Conditioned Mixture of Experts Increases the Evolvability of Robots
Pith reviewed 2026-06-30 15:24 UTC · model grok-4.3
The pith
A mixture of experts gated by latent robot design vectors allows efficient co-evolution of diverse bodies and controllers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Co-optimizing a distribution of latent design vectors and a mixture of control experts gated by those vectors yields a modular controller in which different phenotypes activate different combinations of experts. This enables targeted updates to individual experts and "evo by demo" (plugging pretrained expert policies into the mixture), which the paper argues increases evolvability.
What carries the argument
The embodiment-conditioned mixture of experts gated by latent design coordinates of the decoded phenotype.
Load-bearing premise
Gating based on latent design coordinates sufficiently separates the influence of different morphologies so that expert updates do not interfere.
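To make the premise concrete, here is a minimal sketch of design-conditioned gating. It is not the authors' implementation: the softmax gate, the linear gate matrix, the optional `top_k` sparsification, and every function name are assumptions chosen for illustration, since the abstract gives no formal definition of the gating function.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gate_weights(z, gate_matrix, top_k=None):
    """Map a latent design vector z to one mixing weight per expert.

    gate_matrix holds one row of (hypothetical) gate parameters per expert.
    If top_k is set, only the top_k largest weights are kept and renormalized.
    """
    logits = [sum(w * zi for w, zi in zip(row, z)) for row in gate_matrix]
    weights = softmax(logits)
    if top_k is not None:
        order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        keep = set(order[:top_k])
        kept_total = sum(weights[i] for i in keep)
        weights = [weights[i] / kept_total if i in keep else 0.0
                   for i in range(len(weights))]
    return weights

def mixture_action(obs, z, gate_matrix, experts, top_k=None):
    """Blend expert outputs using weights that depend only on the design z."""
    weights = gate_weights(z, gate_matrix, top_k)
    action = None
    for w, expert in zip(weights, experts):
        if w == 0.0:
            continue  # an inactive expert contributes nothing for this design
        out = expert(obs)
        if action is None:
            action = [0.0] * len(out)
        for j, value in enumerate(out):
            action[j] += w * value
    return action, weights
```

Under this sketch, the premise holds exactly only when the gate assigns zero weight to an expert for a given design (for example with `top_k`); with a dense softmax every expert receives some weight, so separation is approximate.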
What would settle it
An observation that evolving a new robot design causes a decrease in performance for earlier designs that share overlapping expert activations.
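One way such an observation could be collected is sketched below. The overlap measure (sum of elementwise minima of two gate-weight vectors) and the function names are assumptions made for illustration; the paper does not specify an interference metric.

```python
def gate_overlap(weights_a, weights_b):
    """Shared routing mass between two designs, in [0, 1].

    Assumes both inputs are gate-weight vectors that each sum to 1.
    """
    return sum(min(a, b) for a, b in zip(weights_a, weights_b))

def interference_report(old_gates, new_gate, return_before, return_after, tol=0.0):
    """Compare earlier designs' returns before and after training on a new design.

    old_gates:      {design_name: gate-weight vector} for earlier designs
    new_gate:       gate-weight vector of the newly evolved design
    return_before:  {design_name: return measured before the update}
    return_after:   {design_name: return measured after the update}
    Rows are sorted by overlap so that drops can be read against shared experts.
    """
    rows = []
    for name, gate in old_gates.items():
        drop = return_before[name] - return_after[name]
        rows.append({
            "design": name,
            "overlap": gate_overlap(gate, new_gate),
            "drop": drop,
            "regressed": drop > tol,
        })
    return sorted(rows, key=lambda r: r["overlap"], reverse=True)
```

If regressions concentrate among the high-overlap rows, that would count against the isolation premise; if returns hold steady even at high overlap, the premise survives this particular check.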
Original abstract
In this paper, we introduce a model of evolution and learning in robots that co-optimizes a distribution of latent design vectors (genotypes) and a mixture of control experts (neural modules), which are gated by the latent coordinates of each decoded design (phenotype). This provides a scalable alternative to co-design algorithms that either train an individual policy for every robot, which is inefficient, or a monolithic universal controller for all robots, which results in overly conservative structures and behaviors. Our approach lies somewhere between these two extremes, preserving ancestral knowledge in a unified yet modular framework in which different body plans activate and deactivate different combinations of learned sensorimotor circuits for goal-directed behavior. This allows one part of the controller to be overhauled to better suit new species of designs as they emerge without disrupting the hard-earned knowledge contained within other expert modules. It also allows pretrained expert policies to be directly plugged into the mixture, which can steer evolution into otherwise unexplored areas of latent space containing desired morphological traits. We refer to this process as "evo by demo" and explore how it may be used to guide freeform evolution toward canonical structures defined by the pretrained model. Videos and code can be found at: https://eco-moe.github.io.
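The outer loop implied by "co-optimizes a distribution of latent design vectors" might look roughly like the sketch below. It uses a simple cross-entropy-style update as a stand-in for whatever evolutionary strategy the authors actually use, and the `fitness` callable is a placeholder for decoding a latent into a body and evaluating the gated controller on it. All names and default values are hypothetical.

```python
import random

def evolve_latent_distribution(fitness, dim, generations=30, pop_size=32,
                               elite_frac=0.25, min_std=0.05, seed=0):
    """Refit a diagonal Gaussian over latent designs to the best samples.

    fitness: callable mapping a latent vector (list of floats) to a score.
             In the paper's setting this would involve decoding the design
             and training or evaluating the controller; here it is opaque.
    Returns the final (mean, std) of the latent distribution.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    n_elite = max(2, int(pop_size * elite_frac))
    for _ in range(generations):
        # Sample a population of genotypes from the current distribution.
        population = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(pop_size)]
        # Keep the highest-scoring designs.
        elites = sorted(population, key=fitness, reverse=True)[:n_elite]
        # Move the distribution toward the elites; floor the spread so
        # exploration never collapses entirely.
        mean = [sum(e[d] for e in elites) / n_elite for d in range(dim)]
        std = [max(min_std,
                   (sum((e[d] - mean[d]) ** 2 for e in elites) / n_elite) ** 0.5)
               for d in range(dim)]
    return mean, std
```

In a full system the controller's experts would be updated inside the fitness evaluation, so that bodies and control improve together; the sketch only shows the design-distribution half of that loop.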
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces ECo-MoE, a co-optimization framework for robotic evolution that jointly evolves a distribution of latent design vectors (genotypes) and a mixture-of-experts controller whose modules are gated by the latent coordinates of each decoded morphology (phenotype). It positions the approach as an intermediate solution between per-robot policies and monolithic universal controllers, claiming that the modular structure preserves ancestral knowledge during updates to individual experts and enables 'evo by demo' by inserting pretrained expert policies to steer evolution toward desired morphological regions.
Significance. If the architecture functions as described, the modular gating could offer a practical route to scalable co-design that avoids both the computational cost of separate policies and the conservatism of single controllers, while the 'evo by demo' mechanism provides a concrete method for injecting prior knowledge into open-ended evolution. The conceptual separation of design-conditioned routing from expert internals is a potentially useful organizing principle for lifelong robot learning. However, the manuscript supplies no empirical results, ablation studies, or quantitative metrics, so any assessment of significance remains speculative.
major comments (3)
- Abstract: The central claim that the method 'increases the evolvability of robots' is stated without any supporting evidence. No simulation results, baseline comparisons, evolvability metrics, or even pseudocode appear in the provided text, leaving the performance assertions unsupported.
- Abstract: The gating mechanism ('latent design coordinates gate a mixture of control experts') and the assertion that 'one part of the controller can be overhauled ... without disrupting ... other expert modules' are described at a conceptual level only. No equations, network diagrams, or formal specification of the gating function or training procedure are supplied, preventing evaluation of whether the claimed isolation holds.
- Abstract: The 'evo by demo' procedure is introduced as a way to 'steer evolution into otherwise unexplored areas,' yet no implementation details, loss formulations, or experimental demonstrations of its effect on latent-space exploration are given.
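For reference, one possible formalization of the kind the second comment asks for is given here. It is an assumption for illustration, not taken from the paper: $z$ is the latent design vector, $o$ the observation, $\pi_k$ the $k$-th expert with parameters $\theta_k$, $W$ and $b$ hypothetical gate parameters, and $\mathcal{L}$ any differentiable training loss on the action $a$.

```latex
g(z) = \operatorname{softmax}(W z + b), \qquad
a = \pi(o, z) = \sum_{k=1}^{K} g_k(z)\, \pi_k(o; \theta_k)

\frac{\partial \mathcal{L}}{\partial \theta_k}
  = g_k(z) \left( \frac{\partial \pi_k(o; \theta_k)}{\partial \theta_k} \right)^{\top}
    \frac{\partial \mathcal{L}}{\partial a}
```

Under this form, the update to expert $k$ is scaled by $g_k(z)$, so it vanishes for any design with $g_k(z) = 0$. The isolation claim would then reduce to whether designs that should not interfere actually receive disjoint (or nearly disjoint) gate support.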
minor comments (2)
- The manuscript would benefit from an explicit definition or citation for 'evolvability' and from a short related-work paragraph situating the mixture-of-experts gating relative to prior modular or conditional controllers in evolutionary robotics.
- The project page (https://eco-moe.github.io) is linked, but no supplementary material (e.g., architecture diagrams or pseudocode) is referenced in the text itself.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for highlighting the gaps between the abstract claims and the supporting material. We agree that the current manuscript version is primarily conceptual and does not yet contain the quantitative evidence, formal specifications, or experimental demonstrations needed to substantiate the stated benefits. We will perform a major revision that adds these elements.
Point-by-point responses
- Referee: Abstract: The central claim that the method 'increases the evolvability of robots' is stated without any supporting evidence. No simulation results, baseline comparisons, evolvability metrics, or even pseudocode appear in the provided text, leaving the performance assertions unsupported.
  Authors: We accept this criticism. The submitted manuscript introduces the ECo-MoE framework at a conceptual level but does not include the requested empirical validation. In the revision we will add simulation results, baseline comparisons (per-robot policies and monolithic controllers), and quantitative evolvability metrics together with pseudocode for the co-optimization loop.
  Revision: yes
- Referee: Abstract: The gating mechanism ('latent design coordinates gate a mixture of control experts') and the assertion that 'one part of the controller can be overhauled ... without disrupting ... other expert modules' are described at a conceptual level only. No equations, network diagrams, or formal specification of the gating function or training procedure are supplied, preventing evaluation of whether the claimed isolation holds.
  Authors: We agree that the abstract alone is insufficient. The revised manuscript will include the mathematical definition of the gating function (conditioned on latent design coordinates), network architecture diagrams, and the precise training objective that isolates updates to individual experts.
  Revision: yes
- Referee: Abstract: The 'evo by demo' procedure is introduced as a way to 'steer evolution into otherwise unexplored areas,' yet no implementation details, loss formulations, or experimental demonstrations of its effect on latent-space exploration are given.
  Authors: This observation is correct. The revision will supply the concrete implementation of expert insertion, the auxiliary loss used to bias the latent distribution, and experimental results quantifying the resulting change in morphological coverage.
  Revision: yes
Circularity Check
No significant circularity; conceptual modeling proposal with no derivations
Full rationale
The paper introduces a conceptual architecture for co-optimizing latent design vectors and a gated mixture of experts, positioned as an alternative to per-robot or monolithic controllers. No equations, derivations, fitted parameters, or self-citations appear in the abstract or description. The central claim is a modeling proposal rather than a completed empirical result or mathematical derivation that reduces to its inputs. No load-bearing steps exist that could exhibit self-definitional, fitted-input, or self-citation circularity.