Alas: Autonomous learning agent for self-updating language models
2 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (2)
citing papers explorer
- Autolearn: Learn by Surprise, Commit by Proof
  Autolearn uses high-loss passages and self-generated Q&A training to drive the perturbation gap below baseline, improving novel fact acquisition while suppressing memorization in language models.
- The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
  Survey that defines agentic RL for LLMs via POMDPs, introduces a taxonomy of planning/tool-use/memory/reasoning capabilities and domains, and compiles open environments from over 500 papers.