Autorefine: From trajectories to reusable expertise for continual llm agent refinement

· 2026 · arXiv 2601.22758

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

read on arXiv browse 9 citing papers

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

Test-Time Learning with an Evolving Library

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

EvoLib enables LLMs to accumulate, reuse, and evolve knowledge abstractions from inference trajectories at test time, yielding substantial gains on math reasoning, code generation, and agentic benchmarks without parameter updates or supervision.

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

cs.LG · 2026-04-08 · unverdicted · novelty 7.0

This survey introduces the Generate-Filter-Control-Replay (GFCR) taxonomy to structure rollout pipelines for RL-based post-training of reasoning LLMs.

Hierarchical Experimentalist Agents

cs.AI · 2026-06-28 · unverdicted · novelty 6.0

HExA is a training-free agent framework that improves LLM performance on novel physics tasks from 2% to 77% by iteratively designing experiments and composing learned skills.

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

cs.AI · 2026-05-22 · unverdicted · novelty 6.0 · 2 refs

SkillOpt introduces a controllable text-space optimizer that evolves agent skills via add/delete/replace edits accepted only on strict held-out validation improvement, reporting consistent gains across 52 model-benchmark-harness combinations.

From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

A systematic study across five domains finds model-generated skills yield average gains but non-uniform negative transfer, with a meta-skill improving extraction quality.

Evidence Over Plans: Online Trajectory Verification for Skill Distillation

cs.AI · 2026-05-09 · unverdicted · novelty 6.0

SPARK generates environment-verified trajectories to compute PDI, enabling posterior skill distillation that outperforms no-skill baselines and human-written skills across 86 tasks with up to 1000x cheaper inference.

A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications

cs.IR · 2026-05-08 · unverdicted · novelty 3.0 · 3 refs

A survey that defines agent skills as reusable procedural artifacts and reviews methods, resources, and applications across their representation, acquisition, retrieval, and evolution stages.

From Context to Skills: Can Language Models Learn from Context Skillfully?

cs.AI · 2026-04-30

From Procedural Skills to Strategy Genes: Towards Experience-Driven Test-Time Evolution

cs.SE · 2026-04-16

citing papers explorer

Showing 1 of 1 citing paper after filters.

A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications cs.IR · 2026-05-08 · unverdicted · none · ref 76 · 3 links
A survey that defines agent skills as reusable procedural artifacts and reviews methods, resources, and applications across their representation, acquisition, retrieval, and evolution stages.

Autorefine: From trajectories to reusable expertise for continual llm agent refinement

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer