Recognition: no theorem link
A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications
Pith reviewed 2026-05-11 01:44 UTC · model grok-4.3
The pith
Agent skills, defined as reusable procedures, are key to scalable and maintainable LLM agents.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Agent skills are reusable procedural artifacts that coordinate tools, memory, and runtime context under task-specific constraints. They complement agents, which handle high-level reasoning and planning, by forming the operational layer for reliable, reusable, and composable execution. The survey structures the field around a four-stage lifecycle (representation, acquisition, retrieval, and evolution) and reviews methods, resources, and applications at each stage, noting open problems in quality control, safe updating, and capability management.
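Read concretely, the definition suggests a shape like the following minimal Python sketch. This is an illustration by this review, not an interface from any surveyed system: the `Skill` class, its fields, and the toy tools are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Skill:
    """A reusable procedural artifact coordinating tools under constraints.

    Hypothetical sketch: field names are illustrative, not from the survey.
    """
    name: str
    description: str           # natural-language summary, usable for retrieval
    required_tools: list[str]  # tools the procedure coordinates
    procedure: Callable[[dict, dict], Any]  # (context, tools) -> result

    def run(self, context: dict, tools: dict) -> Any:
        # Enforce the skill's task-specific constraint: its tools must exist.
        missing = [t for t in self.required_tools if t not in tools]
        if missing:
            raise ValueError(f"missing tools: {missing}")
        return self.procedure(context, tools)

# A toy skill that hides two low-level tool calls behind one stable procedure.
fetch_summary = Skill(
    name="fetch_summary",
    description="fetch a document and summarize it",
    required_tools=["fetch", "summarize"],
    procedure=lambda ctx, tools: tools["summarize"](tools["fetch"](ctx["url"])),
)

# Stub tools standing in for a real runtime's tool registry.
tools = {"fetch": lambda url: f"contents of {url}", "summarize": lambda t: t.upper()}
result = fetch_summary.run({"url": "example.com"}, tools)
```

The point of the sketch is the division of labor the claim describes: the agent decides *that* `fetch_summary` applies; the skill carries *how* the tool calls are sequenced.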
What carries the argument
The agent skill: a reusable procedural artifact coordinating tools, memory, and context, serving as the operational layer that complements agent reasoning.
If this is right
- Skills enable more efficient task execution by avoiding repeated low-level tool calls.
- The lifecycle stages provide a structured way to develop and improve agent capabilities over time.
- Composability of skills supports building complex workflows from simpler components.
- Community resources like collected repositories accelerate progress across the field.
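The composability point above can be sketched in a few lines; `compose` and the toy step functions are hypothetical illustrations by this review, not APIs from any surveyed system.

```python
# Illustrative sketch: composing skills into a workflow by chaining each
# step's output into the next step's input (left to right).
def compose(*steps):
    """Return a skill-like callable that runs the given steps in order."""
    def composed(value):
        for step in steps:
            value = step(value)
        return value
    return composed

# Toy "skills" (hypothetical): each is a simple reusable procedure.
clean = lambda s: s.strip().lower()
tokenize = lambda s: s.split()
count = lambda toks: len(toks)

# A complex workflow built from simpler components.
word_count = compose(clean, tokenize, count)
n = word_count("  Reusable Skills Compose  ")
```

Each composed workflow is itself skill-shaped, which is what makes libraries of skills accumulate like software libraries.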
Where Pith is reading between the lines
- Adopting this skill-based approach could lead to standardized skill libraries similar to software libraries.
- Future work might explore how skills evolve in multi-agent systems or over long deployment periods.
- Testing the taxonomy on emerging agent frameworks could reveal needed refinements.
Load-bearing premise
The proposed definition of agent skills as reusable procedural artifacts and the division of the literature into four lifecycle stages capture the essential challenges without overlooking significant approaches.
What would settle it
Discovery of a prominent agent technique or system that does not fit into any of the four stages of representation, acquisition, retrieval, or evolution, or that does not benefit from reusable skills, would undermine the framework.
Original abstract
Large language model (LLM)-based agents that reason, plan, and act through tools, memory, and structured interaction are emerging as a promising paradigm for automating complex workflows. Recent systems such as OpenClaw and Claude Code exemplify a broader shift from passive response generation to action-oriented task execution. Yet as agents move toward open-ended, real-world deployment, relying on from-scratch reasoning and low-level tool calls for every task becomes increasingly inefficient, error-prone, and hard to maintain. This survey examines this challenge through the lens of agent skills, which we define as reusable procedural artifacts that coordinate tools, memory, and runtime context under task-specific constraints. Under this view, agents and skills play complementary roles: agents handle high-level reasoning and planning, while skills form the operational layer that enables reliable, reusable, and composable execution. Skills are therefore central to the scalability, robustness, and maintainability of modern agent systems. We organize the literature around four stages of the agent skill lifecycle (representation, acquisition, retrieval, and evolution) and review representative methods, ecosystem resources, and application settings across each stage. We conclude by discussing open challenges in quality control, interoperability, safe updating, and long-term capability management. All related resources, including research papers, open-source data, and projects, are collected for the community at https://github.com/JayLZhou/Awesome-Agent-Skills.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper surveys LLM-based agents and defines agent skills as reusable procedural artifacts that coordinate tools, memory, and runtime context under task-specific constraints. It positions skills as complementary to high-level agent reasoning and planning, arguing they are central to scalability, robustness, and maintainability. The literature is organized around a four-stage skill lifecycle (representation, acquisition, retrieval, evolution), with reviews of methods, ecosystem resources, and applications in each stage, plus discussion of open challenges in quality control, interoperability, safe updating, and long-term capability management. A GitHub repository collects related papers, data, and projects.
Significance. If the taxonomy holds without major omissions, the survey would provide a practical organizing framework for an emerging sub-area of agent systems, helping researchers navigate techniques for reusable execution components. The curated resource collection adds immediate community value beyond the taxonomy itself.
Major comments (1)
- [Abstract] Abstract and introduction: The four-stage lifecycle is presented as the central organizing lens, but the manuscript provides limited justification for why representation, acquisition, retrieval, and evolution are exhaustive or optimal compared to alternatives (e.g., adding explicit execution or evaluation stages). This choice is load-bearing for the survey's utility and should be defended with reference to gaps in prior taxonomies.
Minor comments (3)
- [Abstract] Abstract: The examples 'OpenClaw' and 'Claude Code' are introduced without citations or brief descriptions; add references or short characterizations to ground the shift from passive to action-oriented agents.
- The GitHub link is highlighted in blue text; ensure the repository is complete, versioned, and includes DOIs or stable links for all collected resources in the final manuscript.
- The definition of skills as 'reusable procedural artifacts' is introduced early; a dedicated subsection comparing it to related concepts (tools, workflows, APIs) would reduce potential overlap with existing agent literature.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and recommendation for minor revision. We address the major comment below.
Point-by-point responses
-
Referee: [Abstract] Abstract and introduction: The four-stage lifecycle is presented as the central organizing lens, but the manuscript provides limited justification for why representation, acquisition, retrieval, and evolution are exhaustive or optimal compared to alternatives (e.g., adding explicit execution or evaluation stages). This choice is load-bearing for the survey's utility and should be defended with reference to gaps in prior taxonomies.
Authors: We agree that additional explicit justification would improve the manuscript. The four stages were chosen to capture the end-to-end lifecycle of skills viewed as reusable procedural artifacts that complement agent-level reasoning: representation formalizes skill structure for interoperability; acquisition encompasses creation via learning, synthesis, or curation; retrieval covers selection and invocation mechanisms during task execution; and evolution addresses adaptation, refinement, and long-term management. Execution is treated as an agent-runtime concern rather than a skill-lifecycle stage, while evaluation is subsumed under acquisition (initial validation) and evolution (ongoing quality control and safe updating). This framing fills a gap in prior taxonomies, which typically organize around agent architectures, planning algorithms, or tool ecosystems but do not isolate a dedicated, reusable skill layer with its own lifecycle stages. We will revise the introduction to add a dedicated paragraph comparing our taxonomy to alternatives and citing specific omissions in existing LLM-agent surveys. Revision: yes.
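The retrieval stage the authors describe ("selection and invocation mechanisms during task execution") can be sketched with a toy similarity match. The bag-of-words scoring and the example skill library below are this review's illustrative assumptions, not the survey's method; real systems typically use learned embeddings rather than word counts.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for absent words
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(task: str, skills: dict[str, str]) -> str:
    """Select the skill whose description best matches the task text."""
    query = Counter(task.lower().split())
    return max(
        skills,
        key=lambda name: cosine(query, Counter(skills[name].lower().split())),
    )

# Hypothetical skill library: name -> natural-language description.
skills = {
    "web_search": "search the web for pages matching a query",
    "summarize_doc": "summarize a long document into key points",
    "fill_form": "fill out a web form with provided field values",
}
best = retrieve("summarize this document", skills)
```

Under this framing, evaluation naturally attaches to the other stages (e.g., scoring a retrieved skill's success feeds the evolution stage) rather than forming a fifth stage, which is the rebuttal's point.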
Circularity Check
No significant circularity in taxonomy survey
Full rationale
This paper is a literature survey that defines agent skills as reusable procedural artifacts and organizes existing work into a four-stage lifecycle (representation, acquisition, retrieval, evolution) as an organizing lens. No equations, predictions, fitted parameters, or derivations appear anywhere in the manuscript. The central claims follow directly from the proposed definitional framing and are supported by review of external literature rather than any self-referential reduction or self-citation chain. The structure is self-contained as a taxonomy exercise with no load-bearing steps that collapse to inputs by construction.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: LLM-based agents that rely on from-scratch reasoning for every task become inefficient, error-prone, and hard to maintain in real-world deployment.
Invented entities (1)
- Agent skills (no independent evidence)
Reference graph
Works this paper leans on
- [1] T. B. Brown et al., "Language models are few-shot learners," in Advances in Neural Information Processing Systems, vol. 33, 2020, pp. 1877–1901. https://arxiv.org/abs/2005.14165
- [2] J. Achiam et al., "GPT-4 technical report," arXiv preprint arXiv:2303.08774, 2023. https://arxiv.org/abs/2303.08774
- [3] L. Ouyang et al., "Training language models to follow instructions with human feedback," in Advances in Neural Information Processing Systems, vol. 35, 2022, pp. 27730–27744. https://arxiv.org/abs/2203.02155
- [4] S. Yao et al., "ReAct: Synergizing reasoning and acting in language models," in International Conference on Learning Representations (ICLR), 2023. https://arxiv.org/abs/2210.03629
- [5] Y. Shen et al., "HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face," in Advances in Neural Information Processing Systems, vol. 36, 2023. https://arxiv.org/abs/2303.17580
- [6] S. Hong et al., "MetaGPT: Meta programming for a multi-agent collaborative framework," in International Conference on Learning Representations (ICLR), 2024. https://arxiv.org/abs/2308.00352
- [7] OpenClaw, "Openclaw — the open-source personal AI assistant and autonomous agent," https://open-claw.org/, 2026. Official website, accessed April 21, 2026.
- [8] Manus, "Welcome - Manus documentation," https://manus.im/docs, 2026. Official documentation, accessed April 21, 2026.
- [9] Anthropic, "Claude Code overview," https://docs.anthropic.com/en/docs/claude-code/overview, 2026. Official documentation, accessed April 21, 2026.
- [10] Anthropic, "Introducing the Model Context Protocol," https://www.anthropic.com/news/model-context-protocol, Anthropic blog, November 2024.
- [11] OpenAI, "Function calling and other API updates," https://openai.com/blog/function-calling-and-other-api-updates, OpenAI blog, June 2023.
- [12] G. Wang et al., "Voyager: An open-ended embodied agent with large language models," arXiv preprint arXiv:2305.16291, 2023. https://arxiv.org/abs/2305.16291
- [13] T. Cai et al., "Large language models as tool makers," in International Conference on Learning Representations (ICLR), 2024. https://arxiv.org/abs/2305.17126
- [14] C. Qian et al., "CREATOR: Tool creation for disentangling abstract and concrete reasoning of large language models," in Findings of the Association for Computational Linguistics: EMNLP 2023, pp. 6922–6939. https://aclanthology.org/2023.findings-emnlp.462/
- [15] P. Lewis et al., "Retrieval-augmented generation for knowledge-intensive NLP tasks," in Advances in Neural Information Processing Systems, vol. 33, 2020, pp. 9459–9474. https://arxiv.org/abs/2005.11401
- [16] V. Karpukhin et al., "Dense passage retrieval for open-domain question answering," in Proceedings of EMNLP 2020, pp. 6769–6781. https://aclanthology.org/2020.emnlp-main.550/
- [17] Y. Du et al., "AnyTool: Self-reflective, hierarchical agents for large-scale API calls," arXiv preprint arXiv:2402.04253, 2024. https://arxiv.org/abs/2402.04253
- [18] Q. Wu et al., "AutoGen: Enabling next-gen LLM applications via multi-agent conversation," arXiv preprint arXiv:2308.08155, 2023. https://arxiv.org/abs/2308.08155
- [19] N. Shinn et al., "Reflexion: Language agents with verbal reinforcement learning," in Advances in Neural Information Processing Systems, vol. 36, 2023. https://arxiv.org/abs/2303.11366
- [21] T. Schick et al., "Toolformer: Language models can teach themselves to use tools," in Advances in Neural Information Processing Systems, vol. 36, 2023. https://arxiv.org/abs/2302.04761
- [22] Y. Qin et al., "ToolLLM: Facilitating large language models to master 16000+ real-world APIs," arXiv preprint arXiv:2307.16789, 2023. https://arxiv.org/abs/2307.16789
- [24] X. Yang et al., "Buffer of thoughts: Thought-augmented reasoning with large language models," in Advances in Neural Information Processing Systems, 2024. https://arxiv.org/abs/2406.04271
- [28] M. Ahn et al., "Do as I can, not as I say: Grounding language in robotic affordances," in Conference on Robot Learning (CoRL), 2022. https://arxiv.org/abs/2204.01691
- [29] Z. Wang et al., "Describe, explain, plan and select: Interactive planning with large language models enables open-world multi-task agents," 2023. https://arxiv.org/abs/2302.01560
- [30] J. S. Park et al., "Generative agents: Interactive simulacra of human behavior," 2023. https://arxiv.org/abs/2304.03442
- [31] X. Zhu et al., "Ghost in the Minecraft: Generally capable agents for open-world environments via large language models with text-based knowledge and memory," arXiv preprint arXiv:2305.17144, 2023. https://arxiv.org/abs/2305.17144
- [32] S. Hao et al., "Reasoning with language model is planning with world model," in Proceedings of EMNLP 2023, pp. 8154–8173.
- [33] W. Yao et al., "Retroformer: Retrospective large language agents with policy gradient optimization," arXiv preprint arXiv:2308.02151, 2023.
- [34] C. Packer et al., "MemGPT: Towards LLMs as operating systems," arXiv preprint arXiv:2310.08560, 2023. https://arxiv.org/abs/2310.08560
- [36] "Think-in-memory: Recalling and post-thinking enable LLMs with long-term memory." https://arxiv.org/abs/2311.08719
- [37] P. Zhou et al., "Self-discover: Large language models self-compose reasoning structures," in Advances in Neural Information Processing Systems, vol. 37, 2024, pp. 126032–126058.
- [38] M. Yuksekgonul et al., "Optimizing generative AI by backpropagating language model feedback," Nature, vol. 639, no. 8055, pp. 609–616, 2025.
- [39] Y. Yu et al., "FinCon: A synthesized LLM multi-agent system with conceptual verbal reinforcement for enhanced financial decision making," arXiv preprint arXiv:2407.06567, 2024. https://arxiv.org/abs/2407.06567
- [40] Y. Wang et al., "M+: Extending MemoryLLM with scalable long-term memory," 2025. https://arxiv.org/abs/2502.00592
- [41] J. Michelman et al., "Enhancing reasoning with collaboration and memory," arXiv preprint arXiv:2503.05944, 2025.
- [42] "Nemori: Self-organizing agent memory inspired by cognitive science," arXiv preprint arXiv:2502.14828, 2025. https://arxiv.org/abs/2502.14828
- [43] "Intrinsic memory agents: Heterogeneous multi-agent LLM systems through structured contextual memory," arXiv preprint arXiv:2506.19413, 2025. https://arxiv.org/abs/2506.19413
- [44] Q. Mi et al., "ProcMem: Learning reusable procedural memory from experience via non-parametric PPO for LLM agents," 2026. https://arxiv.org/abs/2602.01869
- [45] S. Chen et al., "SkillCraft: Can LLM agents learn to use tools skillfully?" arXiv preprint arXiv:2603.00718, 2026. https://arxiv.org/abs/2603.00718
- [46] "Polyskill: Learning generalizable skills through polymorphic abstraction," International Conference on Learning Representations.
- [47] https://arxiv.org/abs/2510.15863
- [49] T. Chen et al., "CUA-Skill: Develop skills for computer using agent," arXiv preprint arXiv:2601.21123, 2026.
- [50] Y. J. Ma et al., "Eureka: Human-level reward design via coding large language models," 2023. https://arxiv.org/abs/2310.12931
- [51] X. Yue et al., "DS-Agent: Automated data science by empowering large language models with case-based reasoning," 2024. https://arxiv.org/abs/2402.17453
- [52] X. Zhong et al., "Debug like a human: A large language model debugger via verifying runtime execution step-by-step," 2024. https://arxiv.org/abs/2402.16906
- [53] X. Wang et al., "Executable code actions elicit better LLM agents," arXiv preprint arXiv:2402.01030, 2024. https://arxiv.org/abs/2402.01030
- [54] J. Yang et al., "SWE-agent: Agent-computer interfaces enable automated software engineering," 2024. https://arxiv.org/abs/2405.15793
- [55] K. Zhang et al., "ToolCoder: Teach code generation models to use API search tools," 2023. https://arxiv.org/abs/2305.04032
- [56] H. Shi et al., "Evolving programmatic skill networks," 2026. https://arxiv.org/abs/2601.03509
- [57] Z. Wang et al., "JARVIS-1: Open-world multi-task agents with memory-augmented multimodal language models," arXiv preprint arXiv:2311.05997, 2023. https://arxiv.org/abs/2311.05997
- [59]
- [61] H. Li et al., "Organizing, orchestrating, and benchmarking agent skills at ecosystem scale," arXiv preprint arXiv:2603.02176, 2026. https://arxiv.org/abs/2603.02176
- [62] J. Ruan et al., "TPTU: Task planning and tool usage of large language model-based AI agents," arXiv preprint arXiv:2308.03427, 2023.
- [63] K. Christakopoulou et al., "Agents thinking fast and slow: A talker-reasoner architecture," arXiv preprint arXiv:2410.08328, 2024. https://arxiv.org/abs/2410.08328
- [64] "LLM-powered decentralized generative agents with adaptive hierarchical knowledge graph for cooperative planning," arXiv preprint arXiv:2502.05453, 2025. https://arxiv.org/abs/2502.05453
- [65] F. Wang et al., "GraphSkill: Documentation-guided hierarchical retrieval-augmented coding for complex graph reasoning," 2026. https://arxiv.org/abs/2603.06620
- [66] J. Qiu et al., "Alita: Generalist agent enabling scalable agentic reasoning with minimal predefinition and maximal self-evolution," arXiv preprint arXiv:2505.20286, 2025.
- [67] Y. Liang et al., "SkillNet: Create, evaluate, and connect AI skills." https://arxiv.org/abs/2603.04448
- [69] Y. Jiang et al., "SoK: Agentic skills – beyond tool use in LLM agents." https://arxiv.org/abs/2602.20867
- [71] L. Chen et al., "Skills are the new apps – now it's time for skill OS," 2026. Preprints.org manuscript 202602.1096.v1. https://www.preprints.org/manuscript/202602.1096/v1
- [72] J. Li et al., "Agent hospital: A simulacrum of hospital with evolvable medical agents," arXiv preprint arXiv:2405.02957, 2024. https://arxiv.org/abs/2405.02957
- [73] C. Hu et al., "Evermemos: A self-organizing memory operating system for structured long-horizon reasoning," 2026. https://arxiv.org/abs/2601.02163
- [74] L. Yue et al., "HyperMem: Hypergraph memory for long-term conversations," 2026. Accepted to ACL 2026 Main. https://arxiv.org/abs/2604.08256
- [75] G. Zhang et al., "G-Memory: Tracing hierarchical memory for multi-agent systems," 2025. https://arxiv.org/abs/2506.07398
- [76] "AgentEvolver: Towards efficient self-evolving agent system," arXiv preprint arXiv:2511.10395, 2025. https://arxiv.org/abs/2511.10395
- [77] Y. Cai et al., "Building self-evolving agents via experience-driven lifelong learning: A framework and benchmark," 2025. https://arxiv.org/abs/2508.19005
- [78] L. Qiu et al., "AutoRefine: From trajectories to reusable expertise for continual LLM agent refinement," 2026. https://arxiv.org/abs/2601.22758
- [79] W. Tan et al., "Cradle: Empowering foundation agents towards general computer control," arXiv preprint arXiv:2403.03186, 2024. https://arxiv.org/abs/2403.03186
- [80] C. Zhang et al., "AppAgent: Multimodal agents as smartphone users," arXiv preprint arXiv:2312.13771, 2023. https://arxiv.org/abs/2312.13771
- [81] Y. Fu et al., "AutoGuide: Automated generation and selection of context-aware guidelines for large language model agents," arXiv preprint arXiv:2403.08978, 2024. https://arxiv.org/abs/2403.08978
- [82] S. Zhou et al., "WebArena: A realistic web environment for building autonomous agents," in International Conference on Learning Representations (ICLR), 2024. https://arxiv.org/abs/2307.13854
- [83] Y. Sun et al., "Don't retrieve, navigate: Distilling enterprise knowledge into navigable agent skills for QA and RAG," arXiv preprint arXiv:2604.14572, 2026. https://arxiv.org/abs/2604.14572
- [84] J. Qiu et al., "AgentDistill: Training-free agent distillation with generalizable MCP boxes," arXiv preprint arXiv:2506.14728, 2025.
- [85] J. Wang et al., "Reinforcement learning for self-improving agent with skill library," arXiv preprint arXiv:2512.17102, 2025.
- [86] Y. Yang et al., "AutoSkill: Experience-driven lifelong learning via skill self-evolution," 2026. https://arxiv.org/abs/2603.01145
- [87] H. Zhang et al., "MemSkill: Learning and evolving memory skills for self-evolving agents," 2026. https://arxiv.org/abs/2602.02474
- [88]
- [89] B. Zheng et al., "SkillWeaver: Web agents can self-improve by discovering and honing skills," 2025. https://arxiv.org/abs/2504.07079