Runtime Analysis of the (μ + 1)-ES in a Homogenous Progress Model
Pith reviewed 2026-06-27 04:59 UTC · model grok-4.3
The pith
In the homogeneous progress model the (μ+1)-ES grows at rate (log^{1+o(1)} μ / μ) times the single-parent rate for Gaussian fitness shifts when μ ≤ e^δ.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that in the homogeneous progress model defined by an invariant improvement distribution Z, the expected growth rate R_μ of the continuous steady-state (μ+1)-ES can be bounded by constructing simpler modified processes that sandwich the original process. When Z equals the normal distribution N(-δ, 1) and μ is at most e^δ, this technique yields the exact asymptotic relation R_μ = (log^{1 + o(1)} μ / μ) R_1.
What carries the argument
Modified processes that sandwich the growth rate of the original (μ+1)-ES, allowing the true rate to be bounded between two more tractable quantities.
If this is right
- The growth rate of the (μ+1)-ES is asymptotically smaller than the (1+1)-ES rate by the factor log^{1+o(1)} μ divided by μ.
- The bound holds only while μ stays at most e^δ for a Gaussian shift of size δ.
- The model is intended for regimes far from the global optimum where exact fitness functions are intractable.
- The sandwiching technique handles the dependencies created by overlapping generations in plus-selection.
Where Pith is reading between the lines
- Similar sandwiching arguments might yield growth-rate bounds for other selection rules or non-Gaussian Z distributions.
- The scaling suggests that moderate population sizes could be preferable in practice once the logarithmic factor is taken into account.
- The homogeneous model supplies a simple baseline for predicting behavior in black-box settings such as hyperparameter search.
- The result could be tested by simulating the (μ+1)-ES on artificial landscapes whose improvement statistics are forced to remain stationary.
Load-bearing premise
The relative fitness improvement produced by any mutation is always drawn from the same fixed distribution Z no matter where the search currently stands.
What would settle it
Running an actual optimization problem and measuring whether the empirical distribution of fitness gains stays statistically constant across widely different fitness levels.
Figures
read the original abstract
We introduce a new simple model to study the fitness progress of Evolution Strategies (ES) in generic problems. In this model, we bypass the underlying fitness landscape and assume that the mutation of any individual produces an offspring whose fitness relative to the parent is given by an invariant distribution $Z$, such as a mean-shifted Gaussian. This serves as a prototypical model for the optimisation landscape when an evolution algorithm operates far from the global optimum. This simple model can be used to approximate the optimisation process for problems where it is intractable to model the exact fitness function, including tasks such as hyperparameter tuning in machine learning models. We rigorously analyse the expected growth rate $\mathcal{R}_{\mu}$ of the continuous steady-state $(\mu+1)$-ES in this model. Unlike comma-selection strategies, the steady-state $(\mu+1)$-ES maintains overlapping generations, introducing complex mathematical dependencies among surviving parents that make it harder to analyse. We give a general technique to analyse the the $(\mu + 1)$-ES by constructing modified processes whose growth rates provably sandwich that of the original process. These modified processes are then easier to analyse but still close enough to the true process to give a tight bound on the expected growth rate. When $Z = \mathcal{N}(-\delta, 1)$ and $\mu \le e^{\delta}$, we show that $\mathcal{R}_{\mu} = \frac{\log^{1 + o(1)} \mu}{\mu} \mathcal{R}_1$.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the homogeneous progress model for Evolution Strategies, in which the relative fitness improvement of any mutation is drawn from an invariant distribution Z (e.g., a mean-shifted Gaussian) independent of current fitness or position. It analyzes the expected growth rate R_μ of the continuous steady-state (μ+1)-ES by constructing modified processes whose rates provably sandwich the original overlapping-generation process. For Z = N(-δ,1) and μ ≤ e^δ the paper claims the asymptotic R_μ = (log^{1+o(1)} μ / μ) R_1.
Significance. If the sandwiching argument and resulting asymptotic hold, the work supplies a rigorous, parameter-free scaling law for progress rates in a simplified model that approximates generic landscapes far from the optimum. The sandwiching technique itself is a methodological contribution for handling dependencies induced by overlapping generations. The model is positioned as a tool for intractable problems such as hyperparameter tuning.
major comments (2)
- [Abstract] Abstract: the central claim rests on a 'rigorous sandwiching argument' that yields the stated asymptotic together with explicit error terms and a treatment of the continuous steady-state; none of these derivations appear in sufficient detail to allow verification of the result beyond the high-level description.
- [Abstract] The modeling premise that Z is strictly invariant (independent of current fitness value or position) is definitional for the homogeneous progress model, yet the manuscript does not discuss the regime in which this approximation remains accurate or how deviations would affect the derived scaling.
minor comments (1)
- [Abstract] Notation: the symbol R_μ is introduced without an explicit equation defining it in terms of the steady-state distribution or expected improvement.
Simulated Author's Rebuttal
We thank the referee for the careful review and constructive comments. We address each major point below and outline the revisions we intend to make to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claim rests on a 'rigorous sandwiching argument' that yields the stated asymptotic together with explicit error terms and a treatment of the continuous steady-state; none of these derivations appear in sufficient detail to allow verification of the result beyond the high-level description.
Authors: The full derivations of the sandwiching construction, the proof that the modified processes bound the original overlapping-generation process, the analysis of the continuous steady-state, and the explicit error terms supporting the asymptotic appear in Sections 3 and 4. The abstract follows the conventional high-level style. To improve verifiability we will add a concise proof outline to the introduction and ensure every error term is stated explicitly with its derivation reference. revision: partial
-
Referee: [Abstract] The modeling premise that Z is strictly invariant (independent of current fitness value or position) is definitional for the homogeneous progress model, yet the manuscript does not discuss the regime in which this approximation remains accurate or how deviations would affect the derived scaling.
Authors: We agree that an explicit discussion of the model's validity regime is useful. In the revised version we will insert a dedicated paragraph (or short subsection) after the model definition that identifies the distance-from-optimum regime in which invariance of Z is a reasonable approximation on typical smooth landscapes and that briefly indicates how moderate deviations from invariance would be expected to affect the leading scaling term. revision: yes
Circularity Check
No significant circularity; derivation is self-contained within the defined model
full rationale
The paper defines the homogeneous progress model explicitly as an invariant distribution Z for relative fitness improvements, independent of position. It then derives the growth rate bound R_μ via a sandwiching argument on auxiliary processes whose rates are analyzed directly from the model definition. No parameter is fitted to data and then renamed as a prediction; no self-citation chain is invoked as a load-bearing uniqueness theorem; the central claim reduces to standard probabilistic analysis of the explicitly stated process rather than to its own inputs by construction. The modeling premise is definitional, not a hidden assumption that collapses the derivation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Mutation of any individual produces an offspring whose fitness relative to the parent is given by an invariant distribution Z.
Reference graph
Works this paper leans on
-
[1]
In: Proceedings of the Genetic and Evolutionary Computation Conference
Akimoto, Y., Auger, A., Glasmachers, T.: Drift theory in continuous search spaces: Expected hitting time of the(1 + 1)-ES with 1/5 success rule. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 801–808 (2018)
2018
-
[2]
Theoretical Computer Science334(1-3), 35–69 (2005)
Auger, A.: Convergence results for the(1, λ)-SA-ES using the theory ofϕ- irreducible Markov chains. Theoretical Computer Science334(1-3), 35–69 (2005)
2005
-
[3]
Electronic Journal of Probability19(22), 1–17 (2014)
Bérard, J., Maillard, P.: The limiting process of N-particle branching random walk with polynomial tails. Electronic Journal of Probability19(22), 1–17 (2014)
2014
-
[4]
Springer Berlin Heidelberg (2001)
Beyer, H.G.: The theory of Evolution Strategies. Springer Berlin Heidelberg (2001)
2001
-
[5]
Physical Review E 73(5), 056126 (2006)
Brunet, E., Derrida, B., Mueller, A.H., Munier, S.: Phenomenological theory giving the full statistics of the position of fluctuating pulled fronts. Physical Review E 73(5), 056126 (2006)
2006
-
[6]
Physical Review E76(4), 041104 (2007)
Brunet, E., Derrida, B., Mueller, A.H., Munier, S.: Effect of selection on ancestry: An exactly soluble case and its phenomenological generalization. Physical Review E76(4), 041104 (2007)
2007
-
[7]
In: Proceedings of the Genetic and Evolutionary Computation Conference
Dang, D.C., Lehre, P.K.: The SLO hierarchy of pseudo-Boolean functions and run- time of evolutionary algorithms. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 1551–1559 (2024)
2024
-
[8]
Springer Nature (2019)
Doerr, B., Neumann, F.: Theory of evolutionary computation: Recent develop- ments in discrete optimization. Springer Nature (2019)
2019
-
[9]
Gissler, A.: Linear convergence of Evolution Strategies with covariance matrix adaptation. Ph.D. thesis, Institut Polytechnique de Paris (2024) 22 J. Lengler, R. Ravi
2024
-
[10]
Evolutionary Computation28(1), 27–53 (2020)
Glasmachers, T.: Global convergence of the(1 + 1)Evolution Strategy to a critical point. Evolutionary Computation28(1), 27–53 (2020)
2020
-
[11]
arXiv preprint arXiv:1604.00772 (2016)
Hansen, N.: The CMA Evolution Strategy: A tutorial. arXiv preprint arXiv:1604.00772 (2016)
Pith/arXiv arXiv 2016
-
[12]
Evolutionary Computation9(2), 159–195 (2001)
Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in Evolution Strategies. Evolutionary Computation9(2), 159–195 (2001)
2001
-
[13]
Theoretical Computer Science361(1), 38–56 (2006)
Jägersküpper, J.: How the(1 + 1)ES using isotropic mutations minimizes positive definite quadratic forms. Theoretical Computer Science361(1), 38–56 (2006)
2006
-
[14]
In: Proceedings of the Genetic and Evolutionary Computation Conference
Jägersküpper, J.: Probabilistic runtime analysis of(1+, λ)ES using isotropic muta- tions. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 461–468 (2006)
2006
-
[15]
In: Proceedings of the Genetic and Evolutionary Computation Conference
Jägersküpper, J., Witt, C.: Rigorous runtime analysis of a(µ+ 1)ES for the sphere function. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 849–856 (2005)
2005
-
[16]
In: Proceedings of the International Conference on Intelligent Computing
Jiang, W., Qian, C., Tang, K.: Improved running time analysis of the(1 + 1)-ES on the sphere function. In: Proceedings of the International Conference on Intelligent Computing. pp. 729–739. Springer (2018)
2018
-
[17]
In: Theory of evolutionary computation: Recent devel- opments in discrete optimization, pp
Lengler, J.: Drift analysis. In: Theory of evolutionary computation: Recent devel- opments in discrete optimization, pp. 89–131. Springer (2019)
2019
-
[18]
IEEE Transactions on Evolutionary Computation24(6), 995–1009 (2019)
Lengler,J.:Ageneraldichotomyofevolutionaryalgorithmsonmonotonefunctions. IEEE Transactions on Evolutionary Computation24(6), 995–1009 (2019)
2019
-
[19]
In: Proceedings of the Evolutionary Computation in Combinatorial Optimization
Lengler, J., Riedi, S.: Runtime analysis of the(µ+ 1)-EA on the dynamic Bin- Val function. In: Proceedings of the Evolutionary Computation in Combinatorial Optimization. pp. 84–99. Springer (2021)
2021
-
[20]
In: Proceedings of the Foundations of Genetic Algorithms
Lengler, J., Zou, X.: Exponential slowdown for larger populations: The(µ+ 1)-EA on monotone functions. In: Proceedings of the Foundations of Genetic Algorithms. pp. 87–101 (2019)
2019
-
[21]
Tech- nometrics6(1), 101–102 (1964)
Marsaglia, G.: Generating a variable from the tail of the normal distribution. Tech- nometrics6(1), 101–102 (1964)
1964
-
[22]
IEEE Trans- actions on Evolutionary Computation28(2), 501–515 (2023)
Morinaga, D., Fukuchi, K., Sakuma, J., Akimoto, Y.: Convergence rate of the (1 + 1)-ES on locally strongly convex and Lipschitz smooth functions. IEEE Trans- actions on Evolutionary Computation28(2), 501–515 (2023)
2023
-
[23]
In: Proceedings of the Workshop on Simula- tionsmethoden in der Medizin und Biologie
Rechenberg, I.: Evolutionsstrategien. In: Proceedings of the Workshop on Simula- tionsmethoden in der Medizin und Biologie. pp. 83–114. Springer (1978)
1978
-
[24]
Biometrika41(1/2), 177–189 (1954)
Shenton, L.: Inequalities for the normal integral including a new continued fraction. Biometrika41(1/2), 177–189 (1954)
1954
-
[25]
Shi, Z., et al.: Branching random walks, vol. 2151. Springer (2015)
2015
-
[26]
Evolutionary Computation14(1), 65–86 (2006)
Witt, C.: Runtime analysis of the(µ+ 1)EA on simple pseudo-Boolean functions. Evolutionary Computation14(1), 65–86 (2006)
2006
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.