pith. machine review for the scientific record.

arxiv: 2604.15877 · v1 · submitted 2026-04-17 · 💻 cs.AI · cs.CL · cs.MA

Recognition: unknown

Experience Compression Spectrum: Unifying Memory, Skills, and Rules in LLM Agents

Bing Zhu, Guanghui Wang, Peiyang He, Wei Qiu, Xing Zhang, Yanwei Cui, Ziyuan Li

Pith reviewed 2026-05-10 08:41 UTC · model grok-4.3

classification 💻 cs.AI · cs.CL · cs.MA
keywords compression · memory · agent · experience · rules · skills · spectrum · systems

The pith

The Experience Compression Spectrum unifies memory, skills, and rules in LLM agents along increasing compression levels and identifies the absence of adaptive cross-level compression as the missing diagonal.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

LLM agents accumulate long histories of interactions. Instead of keeping every detail, systems can compress that history in different ways. Episodic memory keeps fairly detailed records and compresses only modestly. Procedural skills turn repeated patterns into reusable procedures that compress more. Declarative rules extract general principles that compress the most. The paper places these three kinds of knowledge on one line ordered by compression ratio and shows that existing systems each sit at one fixed point on that line. No current system can move fluidly between levels when needed. The authors also note that the memory community and the skill-discovery community almost never cite each other even though they solve overlapping problems. Evaluation benchmarks are tied to whichever compression level a system uses, so it is hard to compare across levels. Transfer to new tasks improves as compression increases, but the knowledge becomes less specific. The paper ends by listing open problems for building agents that can manage the full range of compression.

Core claim

Mapping 20+ systems onto this spectrum reveals that every system operates at a fixed, predetermined compression level -- none supports adaptive cross-level compression, a gap we term the missing diagonal.

Load-bearing premise

That the compression ratios (5-20x, 50-500x, 1,000x+) assigned to memory, skills, and rules are comparable across heterogeneous systems, and that the low cross-community citation rate directly implies the two communities solve shared sub-problems independently, without exchanging solutions.
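The citation-rate figure is easy to sanity-check. Taking the reported totals at face value (1,136 references across 22 primary papers, a cross-community rate below 1%), the absolute number of boundary-crossing citations is small:

```python
# Back-of-envelope check on the paper's citation-analysis claim.
total_refs = 1136    # references across the 22 primary papers
rate_ceiling = 0.01  # "below 1%" cross-community citation rate

# At most this many references cross the memory/skill community boundary.
max_cross_citations = int(total_refs * rate_ceiling)
print(max_cross_citations)  # 11
```

Fewer than a dozen crossings over 22 papers is consistent with the independence reading, though a low rate alone does not establish that the communities lack awareness of each other's solutions.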

Figures

Figures reproduced from arXiv: 2604.15877 by Bing Zhu, Guanghui Wang, Peiyang He, Wei Qiu, Xing Zhang, Yanwei Cui, Ziyuan Li.

Figure 1
Figure 1. The Experience Compression Spectrum. Existing agent learning systems map onto a single axis from raw traces to abstract rules. Memory systems cluster at Level 1, skill systems at Level 2, with Level 3 largely empty. A small number of cross-level systems (dashed) bridge Levels 1–2, but none support adaptive level selection. Compression ratios are approximate. view at source ↗
read the original abstract

As LLM agents scale to long-horizon, multi-session deployments, efficiently managing accumulated experience becomes a critical bottleneck. Agent memory systems and agent skill discovery both address this challenge -- extracting reusable knowledge from interaction traces -- yet a citation analysis of 1,136 references across 22 primary papers reveals a cross-community citation rate below 1%. We propose the \emph{Experience Compression Spectrum}, a unifying framework that positions memory, skills, and rules as points along a single axis of increasing compression (5--20$\times$ for episodic memory, 50--500$\times$ for procedural skills, 1,000$\times$+ for declarative rules), directly reducing context consumption, retrieval latency, and compute overhead. Mapping 20+ systems onto this spectrum reveals that every system operates at a fixed, predetermined compression level -- none supports adaptive cross-level compression, a gap we term the \emph{missing diagonal}. We further show that specialization alone is insufficient -- both communities independently solve shared sub-problems without exchanging solutions -- that evaluation methods are tightly coupled to compression levels, that transferability increases with compression at the cost of specificity, and that knowledge lifecycle management remains largely neglected. We articulate open problems and design principles for scalable, full-spectrum agent learning systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The paper rests on the domain assumption that compression can be measured uniformly across memory, skill, and rule systems and on the ad-hoc definition of the three compression bands; no free parameters are fitted and no new physical entities are postulated.

axioms (2)
  • domain assumption Memory, skills, and rules can be ordered along a single axis of increasing compression
    Invoked when the spectrum is proposed and when systems are mapped onto it.
  • domain assumption Low cross-citation rate indicates independent solution of shared sub-problems
    Used to interpret the 1,136-reference analysis.
invented entities (2)
  • Experience Compression Spectrum no independent evidence
    purpose: Unifying axis for memory, skills, and rules
    Newly proposed organizing framework
  • missing diagonal no independent evidence
    purpose: Label for the absence of adaptive cross-level compression
    New term for the identified gap
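The ledger's first axiom only holds if "compression" has a uniform operational definition across system types. One candidate measure, a token-count ratio between raw traces and the distilled artifact, is a sketch of mine rather than anything the paper commits to:

```python
def compression_ratio(raw_trace_tokens: int, artifact_tokens: int) -> float:
    """Token-count ratio between raw experience and its distilled form.

    One possible uniform measure; the paper's bands (5-20x, 50-500x,
    1,000x+) are consistent with it, but the paper does not pin down
    a specific definition.
    """
    if artifact_tokens <= 0:
        raise ValueError("distilled artifact must be non-empty")
    return raw_trace_tokens / artifact_tokens

# Illustrative numbers only: one 40k-token session distilled three ways.
print(compression_ratio(40_000, 4_000))  # 10.0   -> episodic-memory band
print(compression_ratio(40_000, 200))    # 200.0  -> procedural-skill band
print(compression_ratio(40_000, 20))     # 2000.0 -> declarative-rule band
```

Whether this single ratio is comparable across a vector-store memory, a code-valued skill, and a natural-language rule is exactly what the axiom assumes and the review flags.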

pith-pipeline@v0.9.0 · 5541 in / 1594 out tokens · 19632 ms · 2026-05-10T08:41:12.222339+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning

    cs.LG 2026-05 unverdicted novelty 6.0

    SLIM dynamically optimizes active external skills in agentic RL via leave-one-skill-out marginal contribution estimates and three lifecycle operations, outperforming baselines by 7.1% on ALFWorld and SearchQA while sh...

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages · cited by 1 Pith paper · 16 internal anchors

  1. [1]

    EvoSkill: Automated Skill Discovery for Multi-Agent Systems

    Alzubi, S., Provenzano, N., Bingham, J., Chen, W., and Vu, T. EvoSkill: Automated skill discovery for multi-agent systems. arXiv preprint arXiv:2603.02766.

  2. [2]

    Constitutional AI: Harmlessness from AI Feedback

    Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., et al. Constitutional AI: Harmlessness from AI feedback. arXiv preprint arXiv:2212.08073.

  3. [3]

    SEVerA: Verified Synthesis of Self-Evolving Agents

    Banerjee, D., Xu, C., and Singh, G. SEVerA: Verified synthesis of self-evolving agents. arXiv preprint arXiv:2603.25111.

  4. [4]

    Composer 2 Technical Report

    Chan, A., Shalaby, A., Wettig, A., Sanger, A., Zhai, A., Ajay, A., Nair, A., Snell, C., Lu, C., Shen, C., Jia, E., Cassano, F., Liu, H., Chen, H., et al. Composer 2 technical report. arXiv preprint arXiv:2603.24477.

  5. [5]

    Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

    Chhikara, P., Khant, D., Aryan, S., Singh, T., and Yadav, D. Mem0: Building production-ready AI agents with scalable long-term memory. arXiv preprint arXiv:2504.19413.

  6. [6]

    LightMem: Lightweight and Efficient Memory-Augmented Generation

    Fang, J., Deng, X., Xu, H., Jiang, Z., Tang, Y., Xu, Z., Deng, S., Yao, Y., Wang, M., Qiao, S., Chen, H., and Zhang, N. LightMem: Lightweight and efficient memory-augmented generation. arXiv preprint arXiv:2510.18866.

  7. [7]

    Reinforced Self-Training (ReST) for Language Modeling

    Gulcehre, C., Paine, T. L., Srinivasan, S., Konyushkova, K., Weerts, L., Sharma, A., Siddhant, A., Ahern, A., Wang, M., Gu, C., et al. Reinforced self-training (ReST) for language modeling. arXiv preprint arXiv:2308.08998.

  8. [8]

    Memory in the Age of AI Agents

    Hu, Y., Liu, S., Yue, Y., Zhang, G., Liu, B., Zhu, F., Lin, J., Guo, H., Dou, S., Xi, Z., et al. Memory in the age of AI agents. arXiv preprint arXiv:2512.13564.

  9. [9]

    CASCADE: Cumulative Agentic Skill Creation through Autonomous Development and Evolution

    Huang, X., Chen, J., Fei, Y., Li, Z., Schwaller, P., and Ceder, G. CASCADE: Cumulative agentic skill creation through autonomous development and evolution. arXiv preprint arXiv:2512.23880.

  10. [10]

    SoK: Agentic Skills – Beyond Tool Use in LLM Agents

    Jiang, Y., Li, D., Deng, H., Ma, B., Wang, X., Wang, Q., and Yu, G. SoK: Agentic skills – beyond tool use in LLM agents. arXiv preprint arXiv:2602.20867.

  11. [11]

    Memory OS of AI Agent

    Kang, J., Ji, M., Zhao, Z., and Bai, T. Memory OS of AI agent. arXiv preprint arXiv:2506.06326.

  12. [12]

    DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

    Khattab, O., Singhvi, A., Maheshwari, P., Zhang, Z., Santhanam, K., Vardhamanan, S., Haq, S., Sharma, A., Joshi, T. T., Moazam, H., et al. DSPy: Compiling declarative language model calls into self-improving pipelines. arXiv preprint arXiv:2310.03714.

  13. [13]

    Governing Evolving Memory in LLM Agents: Risks, Mechanisms, and the Stability and Safety Governed Memory (SSGM) Framework

    Lam, C., Li, J., Zhang, L., and Zhao, K. Governing evolving memory in LLM agents: Risks, mechanisms, and the stability and safety governed memory (SSGM) framework. arXiv preprint arXiv:2603.11768.

  14. [14]

    MemPO: Self-Memory Policy Optimization for Long-Horizon Agents

    Li, R., Zhang, X., Yu, H., Duan, S., Li, X., Xiang, W., Liao, C., Guo, X., Li, Y., and Suo, J. MemPO: Self-memory policy optimization for long-horizon agents. arXiv preprint arXiv:2603.00680, 2026a.

  15. [15]

    MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution

    Lin, M., Zhang, Z., Lu, H., Liu, H., Tang, X., He, Q., Zhang, X., and Wang, S. MemMA: Coordinating the memory cycle through multi-agent reasoning and in-situ self-evolution. arXiv preprint arXiv:2603.18718.

  16. [16]

    Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

    Ni, J., Liu, Y., Liu, X., Sun, Y., Zhou, M., Cheng, P., Wang, D., Zhao, E., Jiang, X., and Jiang, G. Trace2Skill: Distill trajectory-local lessons into transferable agent skills. arXiv preprint arXiv:2603.25158.

  17. [17]

    Understanding the Challenges in Iterative Generative Optimization with LLMs

    Nie, A., Daull, X., Kuang, Z., Akkiraju, A., Chaudhuri, A., Piasevoli, M., Rong, R., Yuan, Y., Choudhary, P., Xiao, S., Fakoor, R., Swaminathan, A., and Cheng, C.-A. Understanding the challenges in iterative generative optimization with LLMs. arXiv preprint arXiv:2603.23994.

  18. [18]

    DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

    Shao, Z., Wang, P., Zhu, Q., Xu, R., Song, J., Zhang, M., Li, Y., Wu, Y., and Guo, D. DeepSeekMath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300.

  19. [19]

    Voyager: An Open-Ended Embodied Agent with Large Language Models

    Wang, G., Xie, Y., Jiang, Y., Mandlekar, A., Xiao, C., Zhu, Y., Fan, L., and Anandkumar, A. Voyager: An open-ended embodied agent with large language models. arXiv preprint arXiv:2305.16291.

  20. [20]

    AutoAgent: Evolving Cognition and Elastic Memory Orchestration for Adaptive Agents

    Wang, X., Liao, N., Wei, S., Tang, C., and Xiong, F. AutoAgent: Evolving cognition and elastic memory orchestration for adaptive agents. arXiv preprint arXiv:2603.09716.

  21. [21]

    Mem-α: Learning Memory Construction via Reinforcement Learning

    Wang, Y., Takanobu, R., Liang, Z., Mao, Y., Hu, Y., McAuley, J., and Wu, X. Mem-α: Learning memory construction via reinforcement learning. arXiv preprint arXiv:2509.25911.

  22. [22]

    EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle

    Wu, R., Wang, X., Mei, J., Cai, P., Fu, D., Yang, C., Wen, L., Yang, X., Shen, Y., Wang, Y., and Shi, B. EvolveR: Self-evolving LLM agents through an experience-driven lifecycle. arXiv preprint arXiv:2510.16079.

  23. [23]

    SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

    Xia, P., Chen, J., Wang, H., Liu, J., Zeng, K., Wang, Y., Han, S., Zhou, Y., Zhao, X., Chen, H., Zheng, Z., Xie, C., and Yao, H. SkillRL: Evolving agents via recursive skill-augmented reinforcement learning. arXiv preprint arXiv:2602.08234.

  24. [24]

    Learning to Continually Learn via Meta-Learning Agentic Memory Designs

    Xiong, Y., Hu, S., and Clune, J. Learning to continually learn via meta-learning agentic memory designs. arXiv preprint arXiv:2602.07755.

  25. [25]

    Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward

    Xu, R. and Yan, Y. Agent skills for large language models: Architecture, acquisition, security, and the path forward. arXiv preprint arXiv:2602.12430.

  26. [26]

    A-MEM: Agentic Memory for LLM Agents

    Xu, W., Liang, Z., Mei, K., Gao, H., Tan, J., and Zhang, Y. A-MEM: Agentic memory for LLM agents. arXiv preprint arXiv:2502.12110.

  27. [27]

    Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning

    Yan, S., Yang, X., Huang, Z., Nie, E., Ding, Z., Li, Z., Ma, X., Bi, J., Kersting, K., Pan, J. Z., Schütze, H., Tresp, V., and Ma, Y. Memory-R1: Enhancing large language model agents to manage and utilize memories via reinforcement learning. arXiv preprint arXiv:2508.19828.

  28. [28]

    Graph-based Agent Memory: Taxonomy, Techniques, and Applications

    Yang, C., Zhou, C., Xiao, Y., Dong, S., Zhuang, L., Zhang, Y., Wang, Z., Hong, Z., Yuan, Z., Xiang, Z., et al. Graph-based agent memory: Taxonomy, techniques, and applications. arXiv preprint arXiv:2602.05665, 2026a.

  29. [29]

    MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents

    Zhang, H., Long, Q., Bao, J., Feng, T., Zhang, W., Yue, H., and Wang, W. MemSkill: Learning and evolving memory skills for self-evolving agents.

  30. [30]

    SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

    Zheng, B., Fatemi, M. Y., Jin, X., Wang, Z. Z., Gandhi, A., Song, Y., Gu, Y., Srinivasa, J., Liu, G., Neubig, G., and Su, Y. SkillWeaver: Web agents can self-improve by discovering and honing skills. arXiv preprint arXiv:2504.07079.