CoEvoSkills: Self-Evolving Agent Skills via Co-Evolutionary Verification

· 2026 · cs.AI · arXiv 2604.01687

Mixed citation behavior. Most common role is background (50%).

19 Pith papers citing it

abstract

Anthropic proposes the concept of skills for LLM agents to tackle multi-step professional tasks that simple tool invocations cannot address. A tool is a single, self-contained function, whereas a skill is a structured bundle of interdependent multi-file artifacts. Currently, skill generation is not only labor-intensive due to manual authoring, but may also suffer from human-machine cognitive misalignment, which can degrade agent performance, as evidenced by evaluations on SkillsBench. Therefore, we aim to enable agents to autonomously generate skills. However, existing self-evolving methods designed for tools cannot be directly applied to skills because skills are more complex. To address these issues, we propose CoEvoSkills, a self-evolving skills framework that enables agents to autonomously construct complex, multi-file skill packages. Specifically, CoEvoSkills couples a Skill Generator that iteratively refines skills with a Surrogate Verifier that co-evolves to provide informative and actionable feedback without access to ground-truth test content. On SkillsBench, CoEvoSkills achieves the highest pass rate when compared against five baselines on both Claude Code and Codex, and also generalizes well to six additional LLMs.
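
The generator-verifier coupling described in the abstract can be pictured with a toy loop. This is only an illustrative sketch, not the paper's implementation: the `Skill` type, the `SurrogateVerifier` class, the `generate` and `co_evolve` functions, and the file names are all hypothetical, and a real system would call an LLM where this sketch uses hard-coded rules. The one property it tries to mirror is that the verifier produces actionable feedback from checks it wrote itself, never from ground-truth tests.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical representation: a skill is a bundle of files (name -> content).
Skill = Dict[str, str]


@dataclass
class SurrogateVerifier:
    """Toy verifier holding self-written checks, never ground-truth tests."""
    checks: List[Callable[[Skill], str]] = field(default_factory=list)

    def feedback(self, skill: Skill) -> List[str]:
        # Each check returns "" on success or an actionable message.
        return [msg for check in self.checks if (msg := check(skill))]

    def evolve(self, round_idx: int) -> None:
        # Stand-in for co-evolution: add one stricter check per round.
        required = ["SKILL.md", "run.py", "README.md"]  # assumed file names
        if round_idx < len(required):
            name = required[round_idx]
            self.checks.append(
                lambda s, n=name: "" if n in s else f"missing file: {n}"
            )


def generate(skill: Skill, feedback: List[str]) -> Skill:
    """Toy generator: repairs exactly what the verifier reported."""
    prefix = "missing file: "
    updated = dict(skill)
    for msg in feedback:
        if msg.startswith(prefix):
            updated[msg[len(prefix):]] = "# placeholder"
    return updated


def co_evolve(rounds: int = 3) -> Skill:
    """Alternate verifier evolution and skill refinement for a few rounds."""
    skill: Skill = {}
    verifier = SurrogateVerifier()
    for r in range(rounds):
        verifier.evolve(r)
        skill = generate(skill, verifier.feedback(skill))
    return skill
```

Calling `co_evolve()` in this sketch grows the skill by one file per round, since each round's new check produces one piece of feedback that the generator then repairs.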

citation-role summary

background 4 · dataset 1 · method 1

citation-polarity summary

background 3 · unclear 1 · use dataset 1 · use method 1

representative citing papers

Residual Skill Optimization for Text-to-SQL Ensembles

cs.CL · 2026-05-20 · unverdicted · novelty 7.0

Residual skill optimization creates complementary Text-to-SQL agents by training each new skill on prior ensemble failures, yielding accuracy gains on Spider2-Lite and transfer to other dialects and tasks.

SkillMOO: Multi-Objective Optimization of Agent Skills for Software Engineering

cs.SE · 2026-04-10 · unverdicted · novelty 7.0 · 2 refs

SkillMOO applies LLM-proposed edits and NSGA-II Pareto optimization to skill bundles for SE agents, ranking top in pass rate on most SkillsBench tasks while cutting costs up to 31.7%.

Harnessing Agent Skills: Architectural Patterns and a Reference Architecture for Skill-Mediated LLM Agents

cs.AI · 2026-05-29 · unverdicted · novelty 6.0

Catalogs ten patterns and synthesizes a four-layer reference architecture for skill harnessing in LLM agents, evaluated via cross-instantiation on eight systems.

Distributionally Robust Set Representation Learning Under Inference-Time Element Corruption

cs.LG · 2026-05-28 · unverdicted · novelty 6.0

SW-DRSO optimizes a tractable surrogate of worst-case expected loss over plausible inference-time corruptions using a barycentric adversary approximated via simplex weights.

From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

A systematic study across five domains finds model-generated skills yield average gains but non-uniform negative transfer, with a meta-skill improving extraction quality.

AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

cs.AI · 2026-05-19 · unverdicted · novelty 6.0

AgentCo-op retrieves and assembles existing agents and tools into interoperable workflows for open-world scientific tasks, showing effectiveness in genomics case studies and competitive benchmark results with lower costs.

SkillGen: Verified Inference-Time Agent Skill Synthesis

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

SkillGen synthesizes auditable skills from agent trajectories via contrastive induction on successes and failures, then verifies net performance impact by comparing outcomes with and without the skill on identical tasks.

SkillMaster: Toward Autonomous Skill Mastery in LLM Agents

cs.AI · 2026-05-09 · unverdicted · novelty 6.0 · 2 refs

SkillMaster enables LLM agents to autonomously develop skills via trajectory review, counterfactual evaluation, and DualAdv-GRPO training, boosting success rates by 8.8% on ALFWorld and 9.3% on WebShop.

SkillLens: Adaptive Multi-Granularity Skill Reuse for Cost-Efficient LLM Agents

cs.AI · 2026-05-08 · unverdicted · novelty 6.0

SkillLens organizes skills into policies-strategies-procedures-primitives layers, retrieves via degree-corrected random walk, and uses a verifier for local adaptation, yielding up to 6.31 pp gains on MuLocbench and raising ALFWorld success from 45% to 51.31%.

ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation

cs.AI · 2026-04-26 · unverdicted · novelty 6.0

ClawTrace enables cost-aware LLM agent skill distillation by tracing per-step costs and generating preserve, prune, and repair patches, with ablations showing reduced regressions and prune rules transferring to cut costs by 32%.

GAM: Hierarchical Graph-based Agentic Memory for LLM Agents

cs.AI · 2026-04-14 · unverdicted · novelty 6.0

GAM decouples event-level memory encoding from topic-level consolidation in LLM agents using hierarchical graphs to reduce interference and improve long-term coherence and retrieval.

SkillSmith: Co-Evolving Skills and Tools for Self-Improving Agent Systems

cs.AI · 2026-05-31 · unverdicted · novelty 5.0

SkillSmith introduces a synergy-aware skill-tool co-evolution framework with atomic bundles, Lotka-Volterra-inspired interaction modeling, and anti-pattern recording that outperforms baselines on complex tasks.

SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution

cs.CL · 2026-05-18 · unverdicted · novelty 5.0

SkillsVote is a governance system for agent skills that profiles corpora, recommends via search, and gates updates on successful reusable outcomes, yielding benchmark gains without model changes.

Evolutionary Ensemble of Agents

cs.NE · 2026-05-09 · unverdicted · novelty 5.0 · 2 refs

EvE co-evolves code solvers and guidance states via synchronous races and Elo updates, discovering a rescale-then-interpolate mechanism that enables example-count generalization in ICON.

Ace-Skill: Bootstrapping Multimodal Agents with Prioritized and Clustered Evolution

cs.AI · 2026-05-09 · unverdicted · novelty 5.0

Ace-Skill boosts multimodal agent self-evolution via prioritized rollouts with lazy-decay tracking and semantic knowledge clustering, yielding up to 35% relative gains on tool-use benchmarks and zero-shot transfer to smaller models.

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

cs.AI · 2026-05-22

A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications

cs.IR · 2026-05-08

From Context to Skills: Can Language Models Learn from Context Skillfully?

cs.AI · 2026-04-30

EvoAgent: An Evolvable Agent Framework with Skill Learning and Multi-Agent Delegation

cs.AI · 2026-04-22

citing papers explorer

Showing 19 of 19 citing papers.

Residual Skill Optimization for Text-to-SQL Ensembles
cs.CL · 2026-05-20 · unverdicted · none · ref 49 · internal anchor

SkillMOO: Multi-Objective Optimization of Agent Skills for Software Engineering
cs.SE · 2026-04-10 · unverdicted · none · ref 10 · 2 links · internal anchor

Harnessing Agent Skills: Architectural Patterns and a Reference Architecture for Skill-Mediated LLM Agents
cs.AI · 2026-05-29 · unverdicted · none · ref 67 · internal anchor

Distributionally Robust Set Representation Learning Under Inference-Time Element Corruption
cs.LG · 2026-05-28 · unverdicted · none · ref 14 · internal anchor

From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills
cs.AI · 2026-05-22 · unverdicted · none · ref 10 · internal anchor

AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows
cs.AI · 2026-05-19 · unverdicted · none · ref 28 · internal anchor

SkillGen: Verified Inference-Time Agent Skill Synthesis
cs.LG · 2026-05-09 · unverdicted · none · ref 17 · internal anchor

SkillMaster: Toward Autonomous Skill Mastery in LLM Agents
cs.AI · 2026-05-09 · unverdicted · none · ref 17 · 2 links · internal anchor

SkillLens: Adaptive Multi-Granularity Skill Reuse for Cost-Efficient LLM Agents
cs.AI · 2026-05-08 · unverdicted · none · ref 32 · internal anchor

ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation
cs.AI · 2026-04-26 · unverdicted · none · ref 3 · internal anchor

GAM: Hierarchical Graph-based Agentic Memory for LLM Agents
cs.AI · 2026-04-14 · unverdicted · none · ref 31 · internal anchor

SkillSmith: Co-Evolving Skills and Tools for Self-Improving Agent Systems
cs.AI · 2026-05-31 · unverdicted · none · ref 16 · internal anchor

SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution
cs.CL · 2026-05-18 · unverdicted · none · ref 70 · internal anchor

Evolutionary Ensemble of Agents
cs.NE · 2026-05-09 · unverdicted · none · ref 33 · 2 links · internal anchor

Ace-Skill: Bootstrapping Multimodal Agents with Prioritized and Clustered Evolution
cs.AI · 2026-05-09 · unverdicted · none · ref 26 · internal anchor

SkillOpt: Executive Strategy for Self-Evolving Agent Skills
cs.AI · 2026-05-22 · unreviewed · ref 21 · internal anchor

A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications
cs.IR · 2026-05-08 · unreviewed · ref 107 · internal anchor

From Context to Skills: Can Language Models Learn from Context Skillfully?
cs.AI · 2026-04-30 · unreviewed · ref 48 · internal anchor

EvoAgent: An Evolvable Agent Framework with Skill Learning and Multi-Agent Delegation
cs.AI · 2026-04-22 · unreviewed · ref 4 · internal anchor
