Swarm Skills: A Portable, Self-Evolving Multi-Agent System Specification for Coordination Engineering

Xinyu Zhang; Zhicheng Dou; Deyang Li; Jianjun Tao; Shuo Cheng; Ruifeng Shi; Fangchao Liu; Enrui Hu; Yangkai Ding; Hongbo Wang; Qi Ye; Xuefeng Jin; Zhangchun Zhao

arxiv: 2605.10052 · v2 · pith:LYBM63DInew · submitted 2026-05-11 · 💻 cs.CL · cs.AI

Pith reviewed 2026-05-19 17:46 UTC · model grok-4.3

classification 💻 cs.CL cs.AI

keywords multi-agent coordination · self-evolution · portable specification · coordination engineering · agent workflows · progressive disclosure · swarm skills · collaboration assets


The pith

Swarm Skills turns multi-agent coordination into portable, self-evolving assets without framework lock-in

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Swarm Skills as a specification that packages multi-agent workflows into distributable assets consisting of roles, workflows, execution bounds, and a structure for ongoing improvement. It introduces an algorithm that automatically extracts successful execution patterns into new assets and updates existing ones using scores for effectiveness, utilization, and freshness. This targets the current situation where coordination methods stay trapped inside particular software systems, blocking easy sharing or autonomous refinement. If the approach holds, teams of agents could refine how they work together across different platforms with less manual effort over time.

Core claim

Swarm Skills extends skill standards with multi-agent semantics to create first-class distributable assets that include roles, workflows, execution bounds, and a semantic structure for self-evolution. A companion algorithm distills successful trajectories into new assets and patches existing ones through multi-dimensional scoring on effectiveness, utilization, and freshness, removing the need for human oversight during refinement. Architectural analysis and case studies demonstrate that this setup achieves zero-adapter cross-agent portability via progressive disclosure, so agent teams can evolve coordination strategies independently of any single framework.
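The distill-or-patch loop described above can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the paper's algorithm: the `Trajectory` and `SkillRecord` shapes, the rule that matches a trajectory to a stored skill by task type, and the choice to patch whenever a successful run used a different workflow. The paper's Effectiveness, Utilization, and Freshness scores are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """One finished multi-agent run (hypothetical shape)."""
    task_type: str
    steps: list[str]
    succeeded: bool


@dataclass
class SkillRecord:
    """A stored coordination asset (hypothetical shape)."""
    name: str
    workflow: list[str]
    uses: int = 0
    successes: int = 0
    revisions: int = 0


def evolve(library: dict[str, SkillRecord], traj: Trajectory) -> str:
    """Route one trajectory: distill a new skill, patch an old one, or do nothing."""
    skill = library.get(traj.task_type)
    if skill is None:
        if not traj.succeeded:
            return "skip"  # assumption: only successful runs are distilled
        library[traj.task_type] = SkillRecord(
            traj.task_type, list(traj.steps), uses=1, successes=1
        )
        return "distill"
    skill.uses += 1
    if traj.succeeded:
        skill.successes += 1
        if traj.steps != skill.workflow:
            # assumption: a successful run with a different workflow replaces the old one
            skill.workflow = list(traj.steps)
            skill.revisions += 1
            return "patch"
    return "keep"
```

The usage and success counters are the kind of raw signal a scoring step could consume; how the paper actually turns them into scores is not stated in this summary.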

What carries the argument

The Swarm Skills specification, which carries multi-agent semantics and a built-in semantic structure that supports automatic distillation and patching of coordination assets

If this is right

Multi-agent workflows become first-class, shareable assets that transfer between systems.
Coordination strategies improve autonomously through repeated distillation of execution data.
No framework-specific code or adapters are required for portability across agent teams.
Continuous patching occurs based on scores for effectiveness, utilization, and freshness.
Human intervention is no longer needed for refining collaboration protocols over time.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This specification could support shared collections of coordination assets that communities refine collectively over time.
If the scoring method holds up, the same pattern might apply to evolving strategies in other multi-component systems such as distributed software or robotic teams.
Progressive disclosure as the portability mechanism suggests similar techniques could reduce lock-in in adjacent areas like workflow automation tools.

Load-bearing premise

The self-evolution algorithm can reliably distill and patch coordination strategies without human oversight or performance degradation over repeated cycles.

What would settle it

A test that runs the self-evolution algorithm through many cycles on the same set of multi-agent tasks while tracking whether coordination performance steadily improves, stays stable, or declines without any external corrections.
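The outcome of such a test comes down to classifying a series of per-cycle performance scores. The sketch below is one hedged way to do that in Python: it fits a least-squares slope to the scores and labels the run as improving, stable, or declining. The function name `trend` and the `tolerance` threshold are assumptions made for illustration; the paper defines no such procedure.

```python
from statistics import mean


def trend(scores: list[float], tolerance: float = 0.01) -> str:
    """Classify per-cycle scores as improving, stable, or declining.

    Uses the least-squares slope of score against cycle index. The
    tolerance (score units per cycle) is an arbitrary illustrative choice.
    """
    if len(scores) < 2:
        raise ValueError("need at least two cycles")
    xs = range(len(scores))
    x_bar, y_bar = mean(xs), mean(scores)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    if slope > tolerance:
        return "improving"
    if slope < -tolerance:
        return "declining"
    return "stable"
```

A single slope hides late-onset drift, so a fuller harness would probably also compare the last few cycles against the first few.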

Figures

Figures reproduced from arXiv: 2605.10052 by Deyang Li, Enrui Hu, Fangchao Liu, Hongbo Wang, Jianjun Tao, Qi Ye, Ruifeng Shi, Shuo Cheng, Xinyu Zhang, Xuefeng Jin, Yangkai Ding, Zhangchun Zhao, Zhicheng Dou.

**Figure 2.** The Anatomy of a Swarm Skill. The specification delineates the asset into three …

**Figure 3.** The Self-Evolution Lifecycle. The algorithm orchestrates a continuous loop start…

**Figure 4.** The overall collaboration workflow of the …

Original abstract

As artificial intelligence engineering paradigms shift from single-agent Prompt and Context Engineering toward multi-agent **Coordination Engineering**, the ability to codify and systematically improve how multiple agents collaborate has emerged as a critical bottleneck. While single-agent skills can now be distributed as portable assets, multi-agent coordination protocols remain locked within framework-internal code or static configurations, preventing them from being shared across systems or autonomously improved over time. We propose **Swarm Skills**, a portable specification that extends the Anthropic Skills standard with multi-agent semantics. Swarm Skills turns multi-agent workflows into first-class, distributable assets that consist of roles, workflows, execution bounds, and a built-in semantic structure for self-evolution. To operationalize the specification's evolving nature, we present a companion self-evolution algorithm that automatically distills successful execution trajectories into new Swarm Skills and continuously patches existing ones based on multi-dimensional scoring (Effectiveness, Utilization, and Freshness), eliminating the need for human-in-the-loop oversight during the refinement process. Through an architectural compatibility analysis and a comprehensive qualitative case study using the open-source JiuwenSwarm reference implementation, we demonstrate how Swarm Skills achieves zero-adapter cross-agent portability via progressive disclosure, enabling agent teams to self-evolve their coordination strategies without framework lock-in.
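Progressive disclosure, as the abstract uses the term, can be read as staged loading: an agent first sees only a short description of a skill and pulls in the full body when it decides to use it. The Python sketch below illustrates that idea under stated assumptions. The front-matter layout with `name` and `description` fields follows the general shape of single-agent skill files; the `Roles` and `Workflow` sections, the skill content, and the `LazySkill` class are hypothetical and not taken from the Swarm Skills specification.

```python
SKILL_TEXT = """---
name: code-review-swarm
description: Planner, coder, and reviewer roles for a review loop.
---
## Roles
planner, coder, reviewer

## Workflow
plan -> code -> review
"""


def split_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (front-matter fields, body)."""
    _, header, body = text.split("---\n", 2)
    fields = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields, body


class LazySkill:
    """Expose metadata up front; hand out the body only when asked."""

    def __init__(self, text: str) -> None:
        # The body is held in memory here for brevity; a real loader
        # would read it from storage only on demand.
        self.meta, self._body = split_skill(text)
        self.body_loads = 0

    def summary(self) -> str:
        """Cheap first-stage view: name and description only."""
        return f"{self.meta['name']}: {self.meta['description']}"

    def body(self) -> str:
        """Second-stage view: the full instructions."""
        self.body_loads += 1
        return self._body
```

Because the skill is plain text plus a small header, any agent that can read a file can consume it, which is one plausible reading of the "zero-adapter" claim.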

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Swarm Skills proposes a portable multi-agent spec with self-evolution but rests its main claims on one qualitative case study without metrics on improvement or drift.


The main takeaway is that this paper puts forward a specification for packaging multi-agent coordination as shareable, self-updating assets that extend the single-agent skills format. It adds roles, workflows, execution bounds, and a built-in loop for distilling trajectories into new skills or patching old ones using scores on Effectiveness, Utilization, and Freshness. The reference implementation in JiuwenSwarm is meant to show zero-adapter portability through progressive disclosure. That combination is new enough to stand out from prior single-agent work.

The architectural compatibility section is clear and the case study gives a practical walkthrough of how the pieces fit together in one system. If you are building agent teams and want to avoid framework lock-in, the spec details and the outline of the evolution algorithm are worth a look.

The soft spot is the evidence for the self-evolution part. The paper supports the reliability claim only through architectural description and a single qualitative example. There are no repeated-cycle numbers, no baseline comparisons, and no checks for whether the scoring and patching loop actually improves performance or starts to degrade after a few rounds. The assumption that the three scoring dimensions will guide stable, unsupervised improvement is stated but not measured. This is a real gap for a proposal whose central selling point is autonomous refinement.

The work is aimed at engineers and researchers who design or deploy multi-agent systems and care about modularity and long-term maintainability. A reader who wants concrete ideas for coordination standards will get usable material from the format definition and the reference code. It deserves a serious referee because the problem it targets is current and the spec itself is a substantive extension rather than a minor tweak. I would send it out for review so the community can test the evolution loop and suggest concrete evaluation steps.

Referee Report

2 major / 2 minor

Summary. The paper proposes Swarm Skills, a portable specification extending the Anthropic Skills standard with multi-agent semantics including roles, workflows, execution bounds, and built-in support for self-evolution. It introduces a companion algorithm that distills successful execution trajectories into new skills and patches existing ones using multi-dimensional scoring on Effectiveness, Utilization, and Freshness to enable autonomous refinement without human oversight. The central claims of zero-adapter cross-agent portability via progressive disclosure and reliable self-evolution are supported through an architectural compatibility analysis and a qualitative case study on the JiuwenSwarm reference implementation.

Significance. If the self-evolution loop can be shown to produce stable or improving coordination strategies without degradation, the work would be significant for coordination engineering by turning multi-agent protocols into distributable, framework-independent assets. This directly addresses the bottleneck of locked-in coordination code and could enable broader sharing and iterative improvement of agent teams.

major comments (2)

[Case Study] Case Study section: The qualitative case study on JiuwenSwarm demonstrates initial application of Swarm Skills but provides no repeated-cycle metrics, baseline comparisons, or failure-mode analysis to support the claim that the multi-dimensional scoring and distillation/patching loop produces stable coordination strategies without quality drift or degradation over time; this is load-bearing for the autonomy claim.
[Self-evolution algorithm] Self-evolution algorithm description: The scoring dimensions (Effectiveness, Utilization, Freshness) are introduced as drivers for distillation and patching, yet the manuscript does not specify how these scores are computed from trajectories or include any formal definition, pseudocode, or sensitivity analysis, leaving the reliability of the no-human-oversight claim under-supported.

minor comments (2)

[Abstract] Abstract: The phrase 'zero-adapter cross-agent portability' is used without an immediate definition or concrete example of what progressive disclosure entails in practice.
[Specification] Notation: The manuscript would benefit from a table summarizing the components of a Swarm Skill (roles, workflows, bounds, evolution structure) for clarity.
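The four components named in the last comment (roles, workflows, bounds, evolution structure) can also be summarized as a data structure. The Python sketch below is a hypothetical rendering, not the paper's schema: every class and field name (`Role`, `ExecutionBounds`, `max_rounds`, `timeout_seconds`, `evolution_log`) is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    name: str
    responsibility: str


@dataclass(frozen=True)
class ExecutionBounds:
    max_rounds: int
    timeout_seconds: float


@dataclass
class SwarmSkill:
    name: str
    roles: list[Role]
    workflow: list[tuple[str, str]]  # (role name, action) pairs, in order
    bounds: ExecutionBounds
    evolution_log: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return human-readable problems; an empty list means consistent."""
        known = {role.name for role in self.roles}
        problems = [
            f"step {i} uses undeclared role {who!r}"
            for i, (who, _) in enumerate(self.workflow)
            if who not in known
        ]
        if self.bounds.max_rounds < 1:
            problems.append("max_rounds must be at least 1")
        return problems
```

A check like `validate` shows why keeping roles and workflow in one asset is useful: a workflow step that names an undeclared role can be caught before the skill is shared.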

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive review. The comments highlight important areas where additional detail and evidence can strengthen the manuscript's support for the self-evolution claims. We address each major comment below and commit to revisions that directly respond to the concerns raised.


Referee: [Case Study] Case Study section: The qualitative case study on JiuwenSwarm demonstrates initial application of Swarm Skills but provides no repeated-cycle metrics, baseline comparisons, or failure-mode analysis to support the claim that the multi-dimensional scoring and distillation/patching loop produces stable coordination strategies without quality drift or degradation over time; this is load-bearing for the autonomy claim.

Authors: We agree that the existing qualitative case study is insufficient to fully substantiate the stability and lack of degradation in the self-evolution loop. In the revised manuscript we will expand the Case Study section with quantitative metrics collected over multiple repeated cycles, direct comparisons to non-evolving baseline multi-agent configurations, and explicit failure-mode analysis (including monitoring for quality drift). These additions will provide stronger empirical grounding for the autonomy claim.

revision: yes

Referee: [Self-evolution algorithm] Self-evolution algorithm description: The scoring dimensions (Effectiveness, Utilization, Freshness) are introduced as drivers for distillation and patching, yet the manuscript does not specify how these scores are computed from trajectories or include any formal definition, pseudocode, or sensitivity analysis, leaving the reliability of the no-human-oversight claim under-supported.

Authors: We acknowledge that the current description introduces the three scoring dimensions without sufficient implementation detail. The revised version will include a dedicated subsection that provides formal mathematical definitions for Effectiveness, Utilization, and Freshness, describes their exact computation from execution trajectories, supplies pseudocode for the full distillation and patching procedure, and reports a sensitivity analysis demonstrating robustness across different parameter settings. These additions will directly support the reliability of the no-human-oversight claim.

revision: yes
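A sensitivity analysis of the kind promised here could, in its simplest form, sweep the weights of the composite score and count how often the top-ranked record changes. The Python sketch below assumes a linear weighted sum over Effectiveness, Utilization, and Freshness, with non-negative weights that sum to 1. That functional form, the grid step, and all function names are assumptions; the manuscript's actual definitions are not given in this review.

```python
from itertools import product


def composite(e, u, f, weights):
    """Assumed linear composite of the three scoring dimensions."""
    w_e, w_u, w_f = weights
    return w_e * e + w_u * u + w_f * f


def top_record(records, weights):
    """Name of the record with the highest composite score."""
    return max(records, key=lambda name: composite(*records[name], weights))


def weight_grid(step=0.25):
    """All (w_E, w_U, w_F) on a grid that are non-negative and sum to 1."""
    n = round(1 / step)
    return [
        (a / n, b / n, (n - a - b) / n)
        for a, b in product(range(n + 1), repeat=2)
        if a + b <= n
    ]


def winners(records, step=0.25):
    """Map each record name to the number of weight settings it wins."""
    counts = {name: 0 for name in records}
    for w in weight_grid(step):
        counts[top_record(records, w)] += 1
    return counts
```

If one record wins under nearly every weight setting, the ranking is insensitive to the weights; a winner that flips often would suggest the patching decisions depend heavily on parameter choice.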

Circularity Check

0 steps flagged

No significant circularity in Swarm Skills specification proposal


The paper introduces a new specification extending Anthropic Skills with multi-agent semantics and describes a companion self-evolution algorithm using internally defined scoring dimensions (Effectiveness, Utilization, Freshness). Main claims of zero-adapter portability and autonomous refinement are supported by architectural compatibility analysis plus one qualitative case study on the JiuwenSwarm implementation. No equations, fitted parameters, or load-bearing self-citations appear that would reduce any result to its inputs by construction. The work is a design proposal whose derivation chain remains self-contained without the enumerated circular patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The proposal rests on the assumption that multi-agent coordination can be usefully externalized into a portable format and that automatic scoring of trajectories will produce net-positive refinements without external validation.

axioms (1)

Domain assumption: Multi-agent collaboration benefits from explicit, shareable coordination protocols beyond single-agent prompting.
Invoked throughout the motivation and design sections as the core premise for moving to Coordination Engineering.

invented entities (1)

Swarm Skills (no independent evidence)
Purpose: Portable specification containing roles, workflows, execution bounds, and self-evolution semantics for multi-agent systems.
Newly defined construct introduced in this work; no independent evidence outside the paper is provided.

pith-pipeline@v0.9.0 · 5800 in / 1364 out tokens · 56683 ms · 2026-05-19T17:46:00.904828+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

"Swarm Skills turns multi-agent workflows into first-class, distributable assets that consist of roles, workflows, execution bounds, and a built-in semantic structure for self-evolution... multi-dimensional scoring (Effectiveness, Utilization, and Freshness)"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

"The total score S_i for an Evolution Record i is defined as a weighted composite of three metrics: Effectiveness (E), Utilization (U), and Freshness (F)"
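The quoted passage says only that the score is a "weighted composite" and does not give its form. The simplest reading, offered here as an assumption and not as the paper's definition, is a convex linear combination. The weights $w_E$, $w_U$, $w_F$ and the per-record subscripts are notation introduced for this sketch.

```latex
S_i = w_E \, E_i + w_U \, U_i + w_F \, F_i,
\qquad w_E,\, w_U,\, w_F \ge 0,
\qquad w_E + w_U + w_F = 1
```

With purely illustrative values $w_E = 0.5$, $w_U = 0.3$, $w_F = 0.2$ and a record scoring $E_i = 0.8$, $U_i = 0.5$, $F_i = 1.0$, this gives $S_i = 0.40 + 0.15 + 0.20 = 0.75$.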

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

