pith. machine review for the scientific record.

arxiv: 2602.01869 · v2 · submitted 2026-02-02 · 💻 cs.AI

Recognition: unknown

Skill-Pro: Learning Reusable Skills from Experience via Non-Parametric PPO for LLM Agents

classification 💻 cs.AI
keywords skill-pro · agents · procedural · skills · demonstrate · execution · experience · high-quality

LLM-driven agents demonstrate strong performance in sequential decision-making but often rely on on-the-fly reasoning, re-deriving solutions even in recurring scenarios. This insufficient experience reuse leads to computational redundancy and execution instability. To bridge this gap, we propose Skill-Pro, a framework that enables agents to autonomously learn reusable procedural skills from interaction experiences without parameter updates. By formalizing a Skill-MDP, Skill-Pro transforms passive episodic narratives into executable Skills defined by activation, execution, and termination conditions to ensure executability. To achieve reliable reusability without capability degradation, we introduce Non-Parametric PPO, which leverages semantic gradients for high-quality candidate generation and a PPO Gate for robust Skill verification. Through score-based maintenance, Skill-Pro sustains compact, high-quality procedural memory. Experimental results across in-domain, cross-task, and cross-agent scenarios demonstrate that Skill-Pro achieves superior reuse rates and significant performance gains with extreme memory compression. Visualized evolutionary trajectories and Skill distributions further reveal how Skill-Pro transparently accumulates, refines, and reuses procedural knowledge to facilitate long-term autonomy.
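
The abstract describes a Skill as a unit with activation, execution, and termination conditions, kept in a compact memory through score-based maintenance. The sketch below is one possible reading of that structure, not the authors' implementation: the names `Skill`, `SkillMemory`, `run_skill`, and `toy_step`, the dictionary state, and the "keep the top-scoring skills" eviction rule are all assumptions made for illustration. It does not model the Skill-MDP, the semantic gradients, or the PPO Gate, whose details the abstract does not give.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, object]  # assumption: a flat observation dictionary


@dataclass
class Skill:
    """Hypothetical record for one procedural skill."""
    name: str
    activation: Callable[[State], bool]   # may the skill start in this state?
    execution: Callable[[State], str]     # next action while the skill runs
    termination: Callable[[State], bool]  # should the skill stop here?
    score: float = 0.0                    # quality estimate used for maintenance


class SkillMemory:
    """Holds at most `capacity` skills (assumed eviction rule: lowest score first)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.skills: List[Skill] = []

    def add(self, skill: Skill) -> None:
        self.skills.append(skill)
        # Score-based maintenance: sort by score and drop the overflow.
        self.skills.sort(key=lambda s: s.score, reverse=True)
        del self.skills[self.capacity:]

    def select(self, state: State) -> Optional[Skill]:
        # Return the best-scoring skill whose activation condition holds.
        candidates = [s for s in self.skills if s.activation(state)]
        return max(candidates, key=lambda s: s.score) if candidates else None


def run_skill(
    skill: Skill,
    state: State,
    step: Callable[[State, str], State],
    max_steps: int = 10,
) -> List[str]:
    """Execute a skill until its termination condition holds or the budget ends."""
    actions: List[str] = []
    for _ in range(max_steps):
        if skill.termination(state):
            break
        action = skill.execution(state)
        actions.append(action)
        state = step(state, action)
    return actions


def toy_step(state: State, action: str) -> State:
    """Toy environment transition: opening the door flips one flag."""
    new_state = dict(state)
    if action == "open door":
        new_state["door_open"] = True
    return new_state


open_door = Skill(
    name="open_door",
    activation=lambda s: not s["door_open"],
    execution=lambda s: "open door",
    termination=lambda s: bool(s["door_open"]),
    score=0.9,
)
```

In this toy setup, a memory with `capacity=2` that receives three skills keeps only the two with the highest scores, and `select` returns `open_door` only while the door is still closed. A real system would presumably update `score` from execution outcomes; that update rule is left out here because the abstract does not specify it.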

This paper has not been read by Pith yet.


Forward citations

Cited by 8 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Skill-CMIB: Multimodal Agent Skill for Consistent Action via Conditional Multimodal Information Bottleneck

    cs.LG 2026-05 unverdicted novelty 7.0

    CMIB uses a conditional multimodal information bottleneck to create reusable agent skills that separate verbalizable text content from predictive perceptual residuals, improving execution stability.

  2. MEMOREPAIR: Barrier-First Cascade Repair in Agentic Memory

    cs.AI 2026-05 unverdicted novelty 7.0

    MemoRepair formalizes the cascade update problem in agentic memory and solves it via a min-cut reduction that eliminates invalidated memory exposure to 0% while recovering 91-94% of valid successors at 57-76% of basel...

  3. Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks

    cs.AI 2026-04 unverdicted novelty 7.0

    COSPLAY co-evolves an LLM decision agent with a skill bank agent to improve long-horizon game performance, reporting over 25.1% average reward gains versus frontier LLM baselines on single-player benchmarks.

  4. MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning

    cs.AI 2026-05 unverdicted novelty 6.0

    MAP improves LLM agent reasoning by constructing a structured cognitive map of the environment before task execution, yielding performance gains on benchmarks like ARC-AGI-3 and superior training data via the new MAP-...

  5. Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning

    cs.LG 2026-05 unverdicted novelty 6.0

    SLIM dynamically optimizes active external skills in agentic RL via leave-one-skill-out marginal contribution estimates and three lifecycle operations, outperforming baselines by 7.1% on ALFWorld and SearchQA while sh...

  6. SkillLens: Adaptive Multi-Granularity Skill Reuse for Cost-Efficient LLM Agents

    cs.AI 2026-05 unverdicted novelty 6.0

    SkillLens organizes skills into policies-strategies-procedures-primitives layers, retrieves via degree-corrected random walk, and uses a verifier for local adaptation, yielding up to 6.31 pp gains on MuLocbench and ra...

  7. From Procedural Skills to Strategy Genes: Towards Experience-Driven Test-Time Evolution

    cs.SE 2026-04 unverdicted novelty 5.0

    Compact Gene representations of experience outperform documentation-oriented Skill packages for test-time control and iterative evolution in code-solving tasks, with measured gains on CritPt from 9.1% to 18.57% and 17...

  8. A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications

    cs.IR 2026-05 unverdicted novelty 4.0

    The paper surveys agent skills for LLM agents, organizing the literature into a four-stage lifecycle of representation, acquisition, retrieval, and evolution while highlighting their role in system scalability.