arXiv: 2501.02200 · v2 · submitted 2025-01-04 · 💻 cs.NE · cs.AI · cs.CV · cs.LG

Learning Evolution via Optimization Knowledge Adaptation

Pith reviewed 2026-05-23 06:42 UTC · model grok-4.3

classification 💻 cs.NE · cs.AI · cs.CV · cs.LG
keywords evolutionary algorithms · knowledge transfer · adaptive optimization · attention mechanisms · learnable evolutionary algorithms · sequential transfer optimization · prompt tuning

The pith

OKAEM parameterizes evolutionary operators with attention to support both pre-training for knowledge transfer and real-time self-tuning without priors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces OKAEM to combine two goals that evolutionary algorithm research usually treats separately: transferring knowledge from past optimization runs and adapting operators on the fly using current search data. It does this by turning evolutionary operators into learnable functions through attention mechanisms that read historical populations and fitness values. Pre-training lets the model absorb knowledge from many prior tasks, while the adaptive phase updates parameters in real time when no earlier data exists. Experiments report that the same model beats specialized transfer methods across 12 scenarios and also exceeds other learnable evolutionary algorithms in settings with no prior knowledge. Readers would care because this offers one framework that reuses search experience more completely than current piecemeal approaches.

Core claim

The paper claims that by parameterizing evolutionary operators via attention mechanisms, OKAEM enables learnable update rules that facilitate the utilization of optimization knowledge via two phases: pre-training to integrate extensive prior knowledge for efficient transfer, and adaptive optimization to dynamically update parameters based on real-time knowledge. Experimental results confirm that OKAEM significantly outperforms state-of-the-art sequential transfer methods across 12 transfer scenarios via pre-training, and surpasses advanced learnable EAs solely through its self-tuning mechanism in prior-free settings.

What carries the argument

OKAEM, a unified learnable evolutionary framework that parameterizes evolutionary operators via attention mechanisms to adaptively update parameters from optimization knowledge in historical populations and fitness evaluations.
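
To make the idea concrete, here is a minimal sketch of what an attention-parameterized recombination step could look like. It is an illustration written for this summary, not the authors' architecture: the function name `attention_recombine`, the projection matrices `w_q` and `w_k`, the fitness bias term `fitness_scale`, and the sphere test function are all assumptions.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention_recombine(population, fitness, w_q, w_k, fitness_scale=1.0):
    """Hypothetical learnable recombination (assumption, not the paper's operator).

    population: (n, d) candidate solutions
    fitness:    (n,) objective values, lower is better
    w_q, w_k:   (d, h) learnable projection matrices
    Returns the offspring (n, d) and the attention weights (n, n).
    """
    # Standardize fitness so the bias does not depend on the objective's scale.
    z = (fitness - fitness.mean()) / (fitness.std() + 1e-12)
    queries = population @ w_q
    keys = population @ w_k
    scores = queries @ keys.T / np.sqrt(w_q.shape[1])
    # Bias every row toward fitter parents: a lower z raises that parent's score.
    scores = scores - fitness_scale * z[None, :]
    weights = softmax(scores, axis=1)  # each row sums to 1
    offspring = weights @ population   # convex combination of parents
    return offspring, weights


rng = np.random.default_rng(0)
pop = rng.normal(size=(8, 4))
fit = (pop ** 2).sum(axis=1)  # sphere function as a stand-in objective
w_q = rng.normal(scale=0.1, size=(4, 3))
w_k = rng.normal(scale=0.1, size=(4, 3))
children, weights = attention_recombine(pop, fit, w_q, w_k)
print(children.shape, weights.sum(axis=1))
```

In this sketch each offspring is a convex combination of the parents, so changing `w_q` and `w_k` changes which parents the operator favors. That is the sense in which the operator is learnable, whereas a fixed hand-designed crossover has no such parameters to train.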

If this is right

  • Pre-training OKAEM allows efficient transfer of optimization knowledge to new tasks.
  • The self-tuning mechanism lets OKAEM exceed other learnable evolutionary algorithms when no prior data is supplied.
  • The framework demonstrates utility when applied to prompt tuning for vision-language models.
  • Ablation studies show the learnable components are required for the reported gains.
  • Visualization analyses indicate the model can autonomously discover interpretable evolutionary principles.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The attention parameterization could be tested on constrained or multi-objective problems to check whether the same transfer and adaptation benefits appear outside the current benchmarks.
  • If the model continues to surface interpretable principles, researchers might use it to generate new hypotheses about which operator designs succeed on particular problem classes.
  • Similar attention-based operator learning could be tried in other search paradigms such as local search or swarm methods to see whether the unification of transfer and adaptation generalizes.
  • One could measure wall-clock overhead of the attention updates during real-time adaptation to determine whether the performance gains remain practical under tight compute budgets.
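
The overhead question in the last bullet can be probed with a small timing harness. The sketch below is only a starting point under stated assumptions: `attention_step` is a generic self-attention pass over the population and `uniform_crossover_step` is a plain uniform crossover, both written for this note rather than taken from the paper, and the sizes `n` and `d` are arbitrary.

```python
import time

import numpy as np


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time of fn() over several repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


rng = np.random.default_rng(0)
n, d = 256, 32  # population size and dimension (arbitrary choices)
pop = rng.normal(size=(n, d))


def attention_step():
    """Generic self-attention over the population: O(n^2 * d) work."""
    scores = pop @ pop.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)
    return w @ pop


def uniform_crossover_step():
    """Classical uniform crossover with a random partner: O(n * d) work."""
    partners = pop[rng.permutation(n)]
    mask = rng.random((n, d)) < 0.5
    return np.where(mask, pop, partners)


t_attn = best_time(attention_step)
t_plain = best_time(uniform_crossover_step)
print(f"attention: {t_attn:.6f}s, uniform crossover: {t_plain:.6f}s")
```

The comparison is only meaningful relative to the cost of one fitness evaluation. When evaluations are expensive, as in black-box prompt tuning, a quadratic-in-population attention pass may be negligible, and when they are cheap it may dominate.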

Load-bearing premise

Parameterizing evolutionary operators via attention mechanisms will enable effective utilization of optimization knowledge in both the pre-training transfer phase and the real-time adaptive phase.

What would settle it

A head-to-head test in which OKAEM fails to outperform the state-of-the-art sequential transfer baselines on the 12 reported transfer scenarios, or fails to exceed advanced learnable EAs in prior-free runs, would falsify the performance claims.

Original abstract

The iterative search process of evolutionary algorithms (EAs) encapsulates optimization knowledge within historical populations and fitness evaluations. Effective utilization of this knowledge is crucial for facilitating knowledge transfer and online adaptation. However, current research typically addresses these goals in isolation and faces distinct limitations: evolutionary sequential transfer optimization often suffers from incomplete utilization of prior knowledge, while adaptive strategies, utilizing real-time knowledge, are limited to tailoring specific evolutionary operators. To simultaneously achieve these two capabilities, we introduce the Optimization Knowledge Adaptation Evolutionary Model (OKAEM), a unified learnable evolutionary framework capable of adaptively updating parameters based on available optimization knowledge. By parameterizing evolutionary operators via attention mechanisms, OKAEM enables learnable update rules that facilitate the utilization of optimization knowledge via two phases: pre-training to integrate extensive prior knowledge for efficient transfer, and adaptive optimization to dynamically update parameters based on real-time knowledge. Experimental results confirm that OKAEM significantly outperforms state-of-the-art sequential transfer methods across 12 transfer scenarios via pre-training, and surpasses advanced learnable EAs solely through its self-tuning mechanism in prior-free settings. Beyond demonstrating practical utility in prompt tuning for vision-language models, ablation studies validate the necessity of the learnable components, while visualization analyses reveal the model's capacity to autonomously discover interpretable evolutionary principles. The code can be accessed at https://gitee.com/Anonymity_Paper/code-of-okaem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes OKAEM, a unified learnable evolutionary framework that parameterizes evolutionary operators via attention mechanisms. This enables two phases of optimization knowledge utilization: pre-training on prior knowledge for sequential transfer, and real-time self-tuning based on current populations and fitness values. The central empirical claims are statistically significant outperformance versus state-of-the-art sequential transfer methods on 12 scenarios and versus other learnable EAs in prior-free settings, plus a demonstration on vision-language prompt tuning; ablations and visualizations are said to confirm component necessity and interpretable principles.

Significance. If the attention-based parameterization is shown to encode transferable optimization knowledge rather than merely adding capacity, the work would offer a single architecture bridging the historically separate literatures on evolutionary transfer optimization and online adaptive EAs. The public code release supports reproducibility and would allow the community to test generalization beyond the reported scenarios.

major comments (2)
  1. [Method section (architecture description)] The load-bearing claim that attention-based parameterization produces learnable update rules capable of integrating historical population/fitness data (both in pre-training and real-time phases) is not supported by a concrete architectural description or equations showing tokenization of populations, conditioning of operator parameters, or parameter sharing between phases. Without these details it is impossible to evaluate whether the reported gains are attributable to the claimed mechanism or to increased model capacity.
  2. [Experimental results section] The experimental claims of outperformance on 12 transfer scenarios and in prior-free settings rest on comparisons whose design (baseline implementations, statistical tests, number of runs, error bars) is not described in sufficient detail to assess robustness; the abstract asserts superiority but provides no quantitative tables or significance results that would allow verification of the headline numbers.
minor comments (2)
  1. [Abstract] The abstract states that visualizations reveal 'interpretable evolutionary principles' yet does not indicate which figures or analyses support this; a pointer to the relevant figure or subsection would improve clarity.
  2. [Method section] Notation for the attention-conditioned operators and the two-phase training procedure should be introduced with explicit equations rather than prose descriptions alone.
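
For orientation, the kind of explicit notation this comment asks for could look like the block below. It is a generic sketch built on standard scaled dot-product attention, not the paper's formulation. Every symbol is an assumption introduced here: a population matrix X with N rows and d columns, a standardized fitness vector z, projections W_Q and W_K of width h, and a bias strength λ.

```latex
\begin{aligned}
Q &= X W_Q, \qquad K = X W_K,
  \qquad X \in \mathbb{R}^{N \times d},\;
  W_Q, W_K \in \mathbb{R}^{d \times h}, \\
A &= \operatorname{softmax}_{\text{row}}\!\left(
       \frac{Q K^{\top}}{\sqrt{h}} - \lambda\, \mathbf{1} z^{\top}
     \right) \in \mathbb{R}^{N \times N}, \\
X' &= A X .
\end{aligned}
```

Here each row of A sums to one, so every row of X' is a convex combination of the current population, and the term with λ shifts weight toward individuals with lower (better) standardized fitness.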

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight opportunities to strengthen the clarity of the architectural description and experimental reporting. We address each major comment below and will incorporate the requested details in the revised manuscript.

  1. Referee: [Method section (architecture description)] The load-bearing claim that attention-based parameterization produces learnable update rules capable of integrating historical population/fitness data (both in pre-training and real-time phases) is not supported by a concrete architectural description or equations showing tokenization of populations, conditioning of operator parameters, or parameter sharing between phases. Without these details it is impossible to evaluate whether the reported gains are attributable to the claimed mechanism or to increased model capacity.

    Authors: We agree that the current method section would benefit from greater explicitness. The manuscript describes the high-level use of attention to parameterize operators and the two-phase knowledge utilization, but does not include the requested tokenization equations, conditioning details, or explicit parameter-sharing diagram. In the revision we will add (i) a formal tokenization procedure for population vectors and fitness values, (ii) the precise attention equations that produce operator-parameter updates, (iii) a diagram showing shared weights across pre-training and real-time phases, and (iv) an explicit statement that the architecture is designed to encode transferable optimization knowledge rather than merely increase capacity. These additions will allow readers to verify the mechanism.

    Revision: yes

  2. Referee: [Experimental results section] The experimental claims of outperformance on 12 transfer scenarios and in prior-free settings rest on comparisons whose design (baseline implementations, statistical tests, number of runs, error bars) is not described in sufficient detail to assess robustness; the abstract asserts superiority but provides no quantitative tables or significance results that would allow verification of the headline numbers.

    Authors: We acknowledge that the experimental section currently lacks sufficient implementation and statistical detail. The manuscript states that OKAEM outperforms baselines on 12 scenarios and in prior-free settings and that results are statistically significant, yet it does not enumerate the exact baseline code versions, the precise statistical test (e.g., Wilcoxon), the number of independent runs, or include the corresponding tables with means, standard deviations, and p-values. In the revision we will (i) expand the experimental protocol subsection with these specifics, (ii) add a results table (or reference to supplementary material) containing the quantitative values and significance tests, and (iii) update the abstract to point readers to these tables. This will make the robustness claims verifiable.

    Revision: yes
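
As a sketch of the reporting the referee asks for, the snippet below computes means, standard deviations, and an exact two-sided p-value for a paired Wilcoxon signed-rank test. The rebuttal names Wilcoxon only as an example, so the choice of the paired variant is an assumption made here. The function names are hypothetical, and the two result lists are made-up placeholder numbers, not values from the paper.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(a, b):
    """Exact two-sided paired Wilcoxon signed-rank test.

    Returns (w_plus, p_value). Zero differences are dropped.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    if not diffs:
        return 0.0, 1.0
    ranks = average_ranks([abs(v) for v in diffs])
    w_plus = sum(r for r, v in zip(ranks, diffs) if v > 0)
    # Doubled ranks are integers even with ties, which allows an exact
    # subset-sum count over all 2^n sign assignments.
    doubled = [int(round(2 * r)) for r in ranks]
    total = sum(doubled)
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in doubled:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    observed = int(round(2 * w_plus))
    extreme = min(observed, total - observed)
    tail = sum(counts[: extreme + 1])
    p_value = min(1.0, 2 * tail / 2 ** len(diffs))
    return w_plus, p_value


# Placeholder final errors for two hypothetical methods over 8 paired seeds.
method_a = [12, 10, 15, 11, 9, 13, 14, 10]
method_b = [20, 18, 17, 19, 21, 16, 22, 15]
w_plus, p_value = wilcoxon_signed_rank(method_a, method_b)
print(f"A: {mean(method_a):.2f} +/- {stdev(method_a):.2f}")
print(f"B: {mean(method_b):.2f} +/- {stdev(method_b):.2f}")
print(f"W+ = {w_plus}, p = {p_value}")
```

The exact enumeration is practical only for a modest number of pairs. With many runs a normal approximation or a library routine would be the usual choice, and an unpaired rank-sum test would be needed if runs are not matched by seed or scenario.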

Circularity Check

0 steps flagged

No circularity: empirical claims rest on experiments, not self-referential derivations


The paper proposes OKAEM as an attention-parameterized evolutionary model with pre-training and adaptive phases. All load-bearing claims (outperformance on 12 transfer scenarios, superiority in prior-free settings) are presented as results of empirical comparisons, ablations, and visualizations. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the abstract or described structure that reduce the central assertions to inputs by construction. The framework is self-contained against external benchmarks via reported experiments.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no explicit free parameters, axioms, or invented entities; the framework rests on standard assumptions of evolutionary computation and attention-based learning.


