
arxiv: 2606.25761 · v1 · pith:OY2SLGUBnew · submitted 2026-06-24 · 💻 cs.LG · math.OC

Bridging Spherical Black-Box Optimizers

Pith reviewed 2026-06-25 20:30 UTC · model grok-4.3

classification 💻 cs.LG math.OC
keywords: black-box optimization · evolution strategies · consensus-based optimization · optimization via integration · hybrid optimizers · fitness aggregation · consensus scope · continuous control

The pith

The ES, CBO and OVI black-box optimizers differ mainly in fitness aggregation and consensus scope, which enables hybrids that interpolate between their behaviors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper unifies Evolution Strategies, Consensus-Based Optimization and Optimization via Integration inside one theoretical framework for gradient-free optimization. It argues that the methods vary along two axes: fitness aggregation, which sets preference for sharp versus flat solutions, and consensus scope, which sets how many particles or parameters participate in the update. From this view the authors construct hybrid algorithms, including an ES-OVI variant that lets users dial the flat-minima bias and CBO-OVI variants that mix parametric efficiency with particle-based multimodality. Experiments on continuous-control tasks and language-model merging show the hybrids can exceed the performance of the parent methods under fixed evaluation budgets.

Core claim

We unify these approaches within a common theoretical framework, revealing that they differ primarily in two design choices: fitness aggregation (controlling sharpness preference) and consensus scope (controlling modality). Leveraging these insights, we introduce hybrid optimizers that interpolate between existing methods.

What carries the argument

The two design axes of fitness aggregation and consensus scope that parameterize a family of spherical black-box optimizers and allow construction of interpolating hybrids.
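To make the two axes concrete, here is a minimal sketch in textbook form, not the paper's own formulation. It contrasts an ES step, which aggregates fitness linearly around a single search mean, with a CBO step, which pulls a set of particles toward a consensus point weighted by exp(-beta * f). The function names, step sizes, and the toy sphere objective are all assumptions made for illustration.

```python
import math
import random

def sphere(x):
    """Toy objective (assumed for illustration): minimum 0 at the origin."""
    return sum(v * v for v in x)

def es_step(f, mean, sigma=0.1, lr=0.05, pop=64, rng=random):
    """One textbook ES step for minimization with antithetic sampling.

    Fitness enters linearly: each perturbation is weighted by the
    fitness difference of its mirrored pair.
    """
    dim = len(mean)
    pairs = pop // 2
    grad = [0.0] * dim
    for _ in range(pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        f_plus = f([m + sigma * e for m, e in zip(mean, eps)])
        f_minus = f([m - sigma * e for m, e in zip(mean, eps)])
        for d in range(dim):
            grad[d] += (f_plus - f_minus) * eps[d]
    scale = 1.0 / (2.0 * sigma * pairs)
    return [m - lr * scale * g for m, g in zip(mean, grad)]

def consensus_point(f, particles, beta=10.0):
    """CBO-style consensus: Gibbs-weighted average of particle positions."""
    values = [f(p) for p in particles]
    f_min = min(values)  # shift for numerical stability
    weights = [math.exp(-beta * (v - f_min)) for v in values]
    total = sum(weights)
    dim = len(particles[0])
    return [sum(w * p[d] for w, p in zip(weights, particles)) / total
            for d in range(dim)]

def cbo_step(f, particles, beta=10.0, drift=0.5, noise=0.3, rng=random):
    """One discretized CBO step with isotropic noise scaled by distance."""
    c = consensus_point(f, particles, beta)
    moved = []
    for p in particles:
        dist = math.dist(p, c)
        moved.append([pi - drift * (pi - ci) + noise * dist * rng.gauss(0.0, 1.0)
                      for pi, ci in zip(p, c)])
    return moved

rng = random.Random(0)

# Consensus scope "one": a single mean carries the whole search state.
mean = [1.0, -1.0]
for _ in range(200):
    mean = es_step(sphere, mean, rng=rng)

# Consensus scope "many": every particle is part of the search state.
particles = [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(50)]
for _ in range(100):
    particles = cbo_step(sphere, particles, rng=rng)

print("ES mean:", mean)
print("CBO consensus:", consensus_point(sphere, particles))
```

In this sketch the ES state is one vector while the CBO state is the whole particle set. That is one way to read "consensus scope"; the paper's precise definition may differ.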

If this is right

  • ES-OVI hybrids give explicit control over preference for flat minima and thereby trade performance against robustness on continuous control tasks.
  • CBO-OVI hybrids combine the sample efficiency of parametric updates with the multimodal search of particle methods, producing competitive results on language-model merging under tight evaluation limits.
  • On standard BBO benchmarks and higher-dimensional locomotion tasks the hybrids can outperform the original constituent algorithms.
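The link between fitness aggregation and sharpness preference can be illustrated on a toy problem. The sketch below does not reproduce the paper's ES-OVI interpolation or its J_α objective. It compares two generic ways of aggregating an objective over a Gaussian neighbourhood: an arithmetic mean (the Gaussian-smoothed objective that ES-type methods are commonly described as optimizing) and a soft-min (log-mean-exp). The objective, `sigma`, `beta`, and all function names are assumptions chosen for illustration.

```python
import math

def objective(x):
    """Toy 1D objective (assumed): sharp deep well at -1, flat shallower well at +1."""
    sharp = -1.0 * math.exp(-((x + 1.0) ** 2) / (2 * 0.05 ** 2))
    flat = -0.8 * math.exp(-((x - 1.0) ** 2) / (2 * 0.5 ** 2))
    return sharp + flat

def gaussian_nodes(n=1601, span=4.0):
    """Uniform grid on [-span, span] with normalized standard-normal weights."""
    step = 2 * span / (n - 1)
    nodes = [-span + i * step for i in range(n)]
    weights = [math.exp(-0.5 * e * e) for e in nodes]
    total = sum(weights)
    return nodes, [w / total for w in weights]

def aggregate(x, sigma, beta, nodes, weights):
    """Return (mean, soft-min) of the objective over x + sigma * eps."""
    values = [objective(x + sigma * e) for e in nodes]
    mean = sum(w * v for w, v in zip(weights, values))
    v_min = min(values)  # shift keeps every exponential at or below 1
    log_avg = math.log(sum(w * math.exp(-beta * (v - v_min))
                           for w, v in zip(weights, values)))
    soft_min = v_min - log_avg / beta
    return mean, soft_min

nodes, weights = gaussian_nodes()
xs = [-2.0 + 0.02 * i for i in range(201)]
scores = [aggregate(x, 0.3, 50.0, nodes, weights) for x in xs]

# Minimizer of each aggregated objective over the candidate grid.
x_mean = min(zip(xs, scores), key=lambda t: t[1][0])[0]
x_soft = min(zip(xs, scores), key=lambda t: t[1][1])[0]
print(f"mean aggregation prefers x = {x_mean:.2f}")
print(f"soft-min aggregation prefers x = {x_soft:.2f}")
```

With these assumed settings, the mean should settle near the flat well and the soft-min near the sharp one. Averaging penalizes a narrow well because most perturbed points fall outside it, whereas the soft-min is dominated by the best points in the neighbourhood.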

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the two-axis view is complete, the same parameterization could be used to generate additional hybrids that target other trade-offs not yet explored in the paper.
  • The framework suggests that tuning aggregation and scope separately may offer a more interpretable alternative to hand-crafted optimizer variants for new problem classes.
  • Extending the unification to methods outside the spherical family could reveal whether similar axes govern their design choices.

Load-bearing premise

The main distinctions among ES, CBO, OVI and related methods are captured exactly by the two axes of fitness aggregation and consensus scope.

What would settle it

A controlled test in which an ES-OVI or CBO-OVI hybrid loses a core property of one parent method that cannot be restored by any setting of the two axes.

Figures

Figures reproduced from arXiv: 2606.25761 by Johannes Ackermann, Stefano Peluchetti.

Figure 1
Figure 1: We investigate connections between the parametric ES, OVI, the nonparametric CBO, and further related optimizers. By utilizing these connections, we can derive hybrid methods, indicated by green arrows, that combine the strengths of existing optimizers: ES-OVI allows us to control convergence characteristics, SchedPol and AdaPol combine CBO and OVI updates, allowing us to obtain multiple optima in higher…
Figure 3
Figure 3: ES-OVI lets us control the flatness of the optimum. Markers indicate the minimum of J_α on the Rosenbrock function for different values of α, yellow (α = 0, ES) to red (α = 1, OVI).
Figure 4
Figure 4: AdaPol improves upon CBO in higher dimensional tasks. OVI is competitive with CMA-ES. Evaluation on Brax tasks with 10 seeds each. The shaded areas show 95% CIs across 10 seeds, hyper-parameters were optimized for each method per each task.
… thus combine both approaches to obtain a method capable of finding multiple optima in higher-dimensional problems. AdaPol & SchedPol: Since CH is equivalent to OVI, we c…
Figure 5
Figure 5: Hybrid optimizers can outperform base methods. Evaluation on a selection of 2D BBO tasks. Note that in multiple problems, ES-OVI performs better than either ES or OVI. The shaded areas show 95% CIs across 10 random seeds, hyper-parameters were optimized for each method in each problem.
5.1. Benchmarks. To our knowledge, neither CBO nor OVI has been evaluated on the popular BBO Benchmark (BBOB) (Hansen et a…
Figure 6
Figure 6: ES-OVI allows us to tune the optimization behavior for each environment. Performance on Brax of ES-OVI with different interpolation-coefficients α. α = 1.0 corresponds to OVI, α = 0.0 corresponds to ES. 20 trials, 90% CIs.
Black-Box Optimization Benchmark: For low-dimensional problems, we evaluate each method on 23 of the BBOB benchmark problems (Hansen et al., 2009). The BBOB benchmark consists of a series…
Figure 7
Figure 7: ES-OVI allows us to trade performance vs robustness. Robustness on the Acrobot task, trained with ES-OVI with different α values. Mean across 20 different random seeds with bootstrapped 95% CIs.
… robustness than either OVI or ES under strong action or observation disturbances. This experiment also gives us some guidance on how we can approach picking the hyper-parameter α: When we have less confidence in t…
Figure 9
Figure 9: Results on BBOB tasks in 2D.
Figure 10
Figure 10: Results on BBOB tasks in 5D.
Figure 11
Figure 11: Results on BBOB tasks in 7D.
Figure 12
Figure 12: Results on BBOB tasks in 10D.
Figure 13
Figure 13: Results on BBOB tasks in 15D.
Figure 14
Original abstract

When gradient information is unavailable, black-box optimization (BBO) methods provide a practical alternative. While Evolution Strategies (ES), Consensus-Based Optimization (CBO), Optimization via Integration (OVI), and related methods have each been studied independently, their connections remain underexplored. We unify these approaches within a common theoretical framework, revealing that they differ primarily in two design choices: fitness aggregation (controlling sharpness preference) and consensus scope (controlling modality). Leveraging these insights, we introduce hybrid optimizers that interpolate between existing methods. Our ES-OVI hybrid allows explicit control over the preference for flat minima, enabling a trade-off between performance and robustness in continuous control tasks. Our CBO-OVI hybrids combine the higher-dimensional efficiency of parametric methods with the multimodal capabilities of particle-based approaches, achieving competitive results on language model merging under limited evaluation budgets. We validate our methods on standard BBO benchmarks and higher-dimensional locomotion tasks, demonstrating that the hybrid methods can outperform their constituent algorithms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to unify Evolution Strategies (ES), Consensus-Based Optimization (CBO), Optimization via Integration (OVI) and related spherical black-box methods in a common theoretical framework. The unification identifies two primary design axes—fitness aggregation (governing sharpness preference) and consensus scope (governing modality)—as the main distinctions among the methods. It then constructs hybrid optimizers (ES-OVI and CBO-OVI) that interpolate along these axes, with the ES-OVI hybrid providing explicit control over flat-minima preference and the CBO-OVI hybrids combining parametric efficiency with particle-based multimodality. Empirical results on standard BBO benchmarks and higher-dimensional locomotion tasks are reported to show that the hybrids can outperform their constituent algorithms.

Significance. If the two-axis unification is shown to be faithful and the hybrids preserve the essential properties of the source methods while delivering the claimed trade-offs, the work would offer a principled route to designing new black-box optimizers. This could be particularly useful for continuous-control and high-dimensional tasks where robustness to modality and preference for flat minima matter, and where evaluation budgets are limited.

major comments (2)
  1. [Abstract / §3 (theoretical framework)] The central unification claim rests on the assertion that ES, CBO and OVI differ primarily along the two axes of fitness aggregation and consensus scope. Without explicit mappings (e.g., how the update rules or objective functionals of each method are recovered as special cases of the proposed framework), it is impossible to verify whether the hybrids truly interpolate without losing essential algorithmic properties.
  2. [§4 (hybrid construction) and experimental section] The ES-OVI hybrid is said to enable explicit control over flat-minima preference. The manuscript should demonstrate that this control is achieved without introducing additional free parameters beyond those already present in the constituent methods, and that the resulting performance-robustness trade-off is not an artifact of hyper-parameter tuning.
minor comments (2)
  1. [Abstract] The abstract refers to “spherical” black-box optimizers; the manuscript should clarify whether this refers to a specific geometric constraint on the search space or is simply descriptive of the methods considered.
  2. [Experimental results] When reporting outperformance on locomotion tasks, the number of independent runs, statistical significance tests, and exact evaluation budgets should be stated explicitly so that the claimed superiority of the hybrids can be assessed.
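
One conventional way to report the uncertainty this comment asks for is a percentile bootstrap over per-seed final scores. The sketch below is generic: the scores are made-up placeholders, not results from the paper, and `bootstrap_ci` is a hypothetical helper name.

```python
import random
import statistics

def bootstrap_ci(scores, level=0.95, n_resamples=5000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of per-seed scores."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample seeds with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1.0 - level) / 2.0 * n_resamples)
    hi_idx = int((1.0 + level) / 2.0 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Placeholder returns for 10 seeds (hypothetical numbers, for illustration only).
scores = [812.0, 790.5, 845.2, 801.1, 776.9, 830.4, 798.3, 819.7, 808.8, 786.0]
low, high = bootstrap_ci(scores)
print(f"mean = {statistics.fmean(scores):.1f}, 95% CI = [{low:.1f}, {high:.1f}]")
```

Stating the number of seeds, the resampling scheme, and the confidence level alongside such an interval lets readers judge whether two methods' intervals really separate.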

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback. We address the two major comments below and will revise the manuscript accordingly to strengthen the presentation of the unification and the hybrid constructions.

Point-by-point responses
  1. Referee: [Abstract / §3 (theoretical framework)] The central unification claim rests on the assertion that ES, CBO and OVI differ primarily along the two axes of fitness aggregation and consensus scope. Without explicit mappings (e.g., how the update rules or objective functionals of each method are recovered as special cases of the proposed framework), it is impossible to verify whether the hybrids truly interpolate without losing essential algorithmic properties.

    Authors: We agree that explicit recovery of the base methods would make the unification more verifiable. Section 3 already frames the two axes and derives the general update, but we will add a dedicated subsection with the precise algebraic mappings showing how the standard ES, CBO, and OVI update rules (and their objective functionals) arise as special cases. This will also confirm that the proposed hybrids remain within the same family and preserve the core algorithmic properties.

    Revision: yes

  2. Referee: [§4 (hybrid construction) and experimental section] The ES-OVI hybrid is said to enable explicit control over flat-minima preference. The manuscript should demonstrate that this control is achieved without introducing additional free parameters beyond those already present in the constituent methods, and that the resulting performance-robustness trade-off is not an artifact of hyper-parameter tuning.

    Authors: The flat-minima preference in ES-OVI is governed by the fitness-aggregation parameter already present in both base methods; no new free parameters are introduced. We will revise §4 to include an explicit parameter-correspondence table and add targeted ablation experiments in which only the aggregation parameter is varied while all other hyperparameters remain fixed at the values used for the constituent algorithms. These results will be reported to show that the observed trade-off is attributable to the design axis rather than additional tuning.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; unification framework is self-contained

full rationale

The paper's central claim is a unification of ES, CBO, OVI and related methods via two axes (fitness aggregation and consensus scope), followed by construction of hybrid optimizers. The provided abstract and text contain no equations, no fitted parameters renamed as predictions, no self-citations invoked as load-bearing uniqueness theorems, and no derivations that reduce outputs to inputs by construction. The framework is presented as an organizing lens that enables interpolation, without any step where the claimed differences or hybrids are defined circularly in terms of themselves. This is the normal case of an independent conceptual contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only input provides no identifiable free parameters, axioms, or invented entities; full text would be required to audit these.



Reference graph

Works this paper leans on

47 extracted references · 3 linked inside Pith

  1. [1]

    Evolutionary optimization of model merging recipes

    Akiba, T., Shing, M., Tang, Y., Sun, Q., and Ha, D. Evolutionary optimization of model merging recipes. Nature Machine Intelligence, 7:195--204, 2025

  2. [2]

    Gradient-free optimization via integration

    Andrieu, C., Chopin, N., Fincato, E., and Gerber, M. Gradient-free optimization via integration, 2024. URL https://arxiv.org/abs/2408.00888

  3. [3]

    shisa-gamma-7b-v1

    augmxnt. shisa-gamma-7b-v1, 2023. URL https://huggingface.co/augmxnt/shisa-gamma-7b-v1

  4. [4]

    JAX: composable transformations of Python+NumPy programs

    Bradbury, J., Frostig, R., Hawkins, P., Johnson, M. J., Leary, C., Maclaurin, D., Necula, G., Paszke, A., VanderPlas, J., Wanderman-Milne, S., and Zhang, Q. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/jax-ml/jax

  5. [5]

    Stein Variational Evolution Strategies

    Braun, C. V., Lange, R. T., and Toussaint, M. Stein Variational Evolution Strategies. In UAI. arXiv, 2025. URL https://proceedings.mlr.press/v286/braun25a.html

  6. [6]

    Polarized consensus-based dynamics for optimization and sampling

    Bungert, L., Roith, T., and Wacker, P. Polarized consensus-based dynamics for optimization and sampling. Mathematical Programming, 211:125--155, 2025

  7. [7]

    A consensus-based global optimization method for high dimensional machine learning problems

    Carrillo, J. A., Jin, S., Li, L., and Zhu, Y. A consensus-based global optimization method for high dimensional machine learning problems. ESAIM: Control, Optimisation and Calculus of Variations, 27:S5, 2021

  8. [8]

    Generative ai for math: Abel

    Chern, E., Zou, H., Li, X., Hu, J., Feng, K., Li, J., and Liu, P. Generative ai for math: Abel. https://github.com/GAIR-NLP/abel, 2023

  9. [9]

    Sharpness-Aware Minimization for Efficiently Improving Generalization

    Foret, P., Kleiner, A., Mobahi, H., and Neyshabur, B. Sharpness-Aware Minimization for Efficiently Improving Generalization. In ICLR, 2021

  10. [10]

    Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation

    Freeman, C. D., Frey, E., Raichuk, A., Girgin, S., Mordatch, I., and Bachem, O. Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation. In NeurIPS Datasets and Benchmarks Track, 2021

  11. [11]

    The language model evaluation harness

    Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac'h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., and Zou, A. The language model evaluation harness, 07 2024. URL https://zenodo.or...

  12. [12]

    Monte Carlo Methods in Financial Engineering

    Glasserman, P. Monte Carlo Methods in Financial Engineering, volume 53 of Stochastic Modelling and Applied Probability. Springer, New York, 2004

  13. [13]

    Glynn, P. W. Likelihood ratio gradient estimation for stochastic systems. Commun. ACM, 33(10):75--84, 1990

  14. [14]

    The CMA Evolution Strategy: A Tutorial

    Hansen, N. The CMA Evolution Strategy: A Tutorial, 2016. URL https://arxiv.org/abs/1604.00772

  15. [15]

    Real-Parameter Black-Box Optimization Benchmarking 2009: Noiseless Functions Definitions

    Hansen, N., Finck, S., Ros, R., and Auger, A. Real-Parameter Black-Box Optimization Benchmarking 2009: Noiseless Functions Definitions. Research Report RR-6829, INRIA, 2009. URL https://inria.hal.science/inria-00362633

  16. [16]

    Flat Minima

    Hochreiter, S. and Schmidhuber, J. Flat Minima. Neural Computation, 9(1):1--42, 1997

  17. [17]

    Editing Models with Task Arithmetic

    Ilharco, G., Ribeiro, M. T., Wortsman, M., Gururangan, S., Schmidt, L., Hajishirzi, H., and Farhadi, A. Editing Models with Task Arithmetic. In ICLR, 2023

  18. [18]

    Mistral 7B

    Jiang, A. Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D. S., Casas, D. d. l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., Lavaud, L. R., Lachaux, M.-A., Stock, P., Scao, T. L., Lavril, T., Wang, T., Lacroix, T., and Sayed, W. E. Mistral 7B, 2023. URL https://arxiv.org/abs/2310.06825

  19. [19]

    Katkovnik, V. Ya. and Kulchitsky, O. Yu. Convergence of a Class of Random Search Algorithms. Automation and Remote Control, 33:1321--1326, 1972

  20. [20]

    Lange, R. T. evosax: Jax-based evolution strategies, 2022. URL https://arxiv.org/abs/2212.04180

  21. [21]

    Discovering Evolution Strategies via Meta-Black-Box Optimization

    Lange, R. T., Schaul, T., Chen, Y., Zahavy, T., Dallibard, V., Lu, C., Singh, S., and Flennerhag, S. Discovering Evolution Strategies via Meta-Black-Box Optimization. In ICLR, 2023. URL https://openreview.net/forum?id=mFDU0fP3EQH

  22. [22]

    Lee, H. K. and Yoon, S. W. Flat Reward in Policy Parameter Space Implies Robust Reinforcement Learning. In ICLR, 2025. URL https://openreview.net/forum?id=4OaO3GjP7k

  23. [23]

    Wizardmath: Empowering mathematical reasoning for large language models via reinforced evol-instruct

    Luo, H., Sun, Q., Xu, C., Zhao, P., Lou, J., Tao, C., Geng, X., Lin, Q., Chen, S., and Zhang, D. Wizardmath: Empowering mathematical reasoning for large language models via reinforced evol-instruct. In ICLR, 2025. URL https://openreview.net/forum?id=mMPMHWOdOy

  24. [24]

    MERGE³: Efficient Evolutionary Merging on Consumer-grade GPUs

    Mencattini, T., Minut, A. R., Crisostomi, D., Santilli, A., and Rodolà, E. MERGE³: Efficient Evolutionary Merging on Consumer-grade GPUs. In ICML, 2025. URL https://proceedings.mlr.press/v267/mencattini25a.html

  25. [25]

    Gradients are Not All You Need

    Metz, L., Freeman, C. D., Schoenholz, S. S., and Kachman, T. Gradients are Not All You Need, 2022. URL http://arxiv.org/abs/2111.05803

  26. [26]

    Random gradient-free minimization of convex functions

    Nesterov, Y. and Spokoiny, V. Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, 17(2):527--566, 2017

  27. [27]

    Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles

    Ollivier, Y., Arnold, L., Auger, A., and Hansen, N. Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles. JMLR, 2017. URL https://jmlr.org/papers/v18/14-467.html

  28. [28]

    A consensus-based model for global optimization and its mean-field limit

    Pinnau, R., Totzeck, C., Tse, O., and Martin, S. A consensus-based model for global optimization and its mean-field limit. Mathematical Models and Methods in Applied Sciences, 27(1):183--204, 2017

  29. [29]

    Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution

    Rechenberg, I. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Problemata, 15. Frommann-Holzboog, Stuttgart-Bad Cannstatt, 1973. ISBN 978-3-7728-0373-4

  30. [30]

    Gradient is All You Need?

    Riedl, K., Klock, T., Geldhauser, C., and Fornasier, M. Gradient is All You Need?, 2023. URL https://arxiv.org/abs/2306.09778

  31. [31]

    How Consensus-Based Optimization can be Interpreted as a Stochastic Relaxation of Gradient Descent

    Riedl, K., Klock, T., Geldhauser, C., and Fornasier, M. How Consensus-Based Optimization can be Interpreted as a Stochastic Relaxation of Gradient Descent. In Differentiable Almost Everything Workshop, ICML 2024, 2024

  32. [32]

    An automatic method for finding the greatest or least value of a function

    Rosenbrock, H. An automatic method for finding the greatest or least value of a function. The Computer Journal, 3(3):175--184, 1960

  33. [33]

    Rubinstein, R. Y. The score function approach for sensitivity analysis of computer simulation models. Mathematics and Computers in Simulation, 28(5):351--379, 1986

  34. [34]

    Evolution Strategies as a Scalable Alternative to Reinforcement Learning

    Salimans, T., Ho, J., Chen, X., Sidor, S., and Sutskever, I. Evolution Strategies as a Scalable Alternative to Reinforcement Learning, 2017. URL https://arxiv.org/abs/1703.03864

  35. [35]

    Language Models are Multilingual Chain-of-Thought Reasoners

    Shi, F., Suzgun, M., Freitag, M., Wang, X., Srivats, S., Vosoughi, S., Chung, H. W., Tay, Y., Ruder, S., Zhou, D., Das, D., and Wei, J. Language Models are Multilingual Chain-of-Thought Reasoners. In ICLR, 2023

  36. [36]

    Deep Unsupervised Learning using Nonequilibrium Thermodynamics

    Sohl-Dickstein, J., Weiss, E. A., Maheswaranathan, N., and Ganguli, S. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. In ICML, 2015

  37. [37]

    Denoising diffusion implicit models

    Song, J., Meng, C., and Ermon, S. Denoising diffusion implicit models. In ICLR, 2021a

  38. [38]

    Score-Based Generative Modeling through Stochastic Differential Equations

    Song, Y., Sohl-Dickstein, J., Kingma, D. P., Kumar, A., Ermon, S., and Poole, B. Score-Based Generative Modeling through Stochastic Differential Equations. In ICLR, 2021b

  39. [39]

    Spall, J. C. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control. Wiley, Hoboken, NJ, 2003

  40. [40]

    Natural Evolution Strategies

    Wierstra, D., Schaul, T., Peters, J., and Schmidhuber, J. Natural Evolution Strategies. In 2008 IEEE Congress on Evolutionary Computation, pp. 3381--3387, 2008

  41. [41]

    Natural Evolution Strategies

    Wierstra, D., Schaul, T., Glasmachers, T., Sun, Y., Peters, J., and Schmidhuber, J. Natural Evolution Strategies. Journal of Machine Learning Research, 15(27):949--980, 2014

  42. [42]

    Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

    Wortsman, M., Ilharco, G., Gadre, S. Y., Roelofs, R., Gontijo-Lopes, R., Morcos, A. S., Namkoong, H., Farhadi, A., Carmon, Y., Kornblith, S., and Schmidt, L. Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time. In ICML, 2022

  43. [43]

    Hybrid of PSO and CMA-ES for Global Optimization

    Xu, P., Luo, W., Lin, X., Qiao, Y., and Zhu, T. Hybrid of PSO and CMA-ES for Global Optimization. In IEEE Congress on Evolutionary Computation, 2019

  44. [44]

    TIES-Merging: Resolving Interference When Merging Models

    Yadav, P., Tam, D., Choshen, L., Raffel, C., and Bansal, M. TIES-Merging: Resolving Interference When Merging Models. In NeurIPS, 2023

  45. [45]

    Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

    Yu, L., Yu, B., Yu, H., Huang, F., and Li, Y. Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch. In ICML, 2024

  46. [46]

    JADE: Self-adaptive differential evolution with fast and reliable convergence performance

    Zhang, J. and Sanderson, A. C. JADE: Self-adaptive differential evolution with fast and reliable convergence performance. In 2007 IEEE Congress on Evolutionary Computation, 2007

  47. [47]

    Diffusion Models are Evolutionary Algorithms

    Zhang, Y., Hartl, B., Hazan, H., and Levin, M. Diffusion Models are Evolutionary Algorithms. In ICLR, 2025. URL https://openreview.net/forum?id=xVefsBbG2O