DELTAMEM: Incremental Experience Memory for LLM Agents via Residual Trees

Haoran Tan; Rui Li; Xu Chen; Zeyu Zhang; Zhicheng Cao

arxiv: 2606.03083 · v1 · pith:ZT54FHSKnew · submitted 2026-06-02 · 💻 cs.AI


Pith reviewed 2026-06-28 10:44 UTC · model grok-4.3

classification 💻 cs.AI

keywords: LLM agents, experience memory, residual trees, incremental learning, memory management, continual interaction, skill reuse


The pith

DeltaMem stores LLM agent experiences in residual trees as incremental deltas from shared roots to cut redundancy and conflicts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that flat storage of agent experiences creates redundancy and contradictory retrieval when similar episodes recur with small variations. It proposes residual experience as the organizing principle: new episodes are treated as deltas added to existing bases rather than independent records. Two separate trees are built, one capturing goal-conditioned skills and the other scene-level environment facts, each with a root holding the generalized base and delta nodes holding successive changes. Retrieval finds the closest match with a failure-penalized scan and rebuilds the full memory by composing the root-to-delta chain. A consolidation step periodically turns high-frequency paths into new roots so the structure evolves toward specialization. If correct, this yields memory that grows more slowly while supplying consistent guidance across repeated interactions.

Core claim

DeltaMem maintains two independent residual trees, one for goal-conditioned task experience and one for scene-level environment knowledge. Each tree stores generalized base experiences at the root and incremental variations as delta nodes; related episodes therefore share a common foundation without duplication. Retrieval locates the best-matching node via failure-penalized similarity and reconstructs the complete experience by composing the chain from root to that node. An autonomous consolidation mechanism distills high-frequency paths into new root nodes, allowing the trees to self-organize from general heuristics toward specialized variants.
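A minimal sketch of the structure described above, assuming each node holds a list of text steps and that composition is plain concatenation from the root to the matched node. The names `Node`, `add_delta`, and `reconstruct` are hypothetical and are not taken from the DeltaMem code release; the paper's actual node schema and composition rule may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """One node of a residual tree (hypothetical schema, for illustration)."""
    content: list[str]                       # base experience (root) or delta (child)
    parent: Optional["Node"] = None
    children: list["Node"] = field(default_factory=list)

    def add_delta(self, content: list[str]) -> "Node":
        """Attach an incremental variation below this node."""
        child = Node(content=content, parent=self)
        self.children.append(child)
        return child


def reconstruct(node: Optional[Node]) -> list[str]:
    """Compose the full experience by walking the root-to-node chain."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    full: list[str] = []
    for n in reversed(chain):                # root first, matched node last
        full.extend(n.content)
    return full


root = Node(["find the object", "take it", "put it in the receptacle"])
delta = root.add_delta(["if the receptacle is closed, open it first"])
print(reconstruct(delta))
```

Only the delta node stores the extra rule; the three base steps live once in the root and are shared by every descendant.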

What carries the argument

Residual trees whose root nodes hold base experiences and whose delta nodes hold incremental variations, with chain composition used to reconstruct full memories on retrieval.

If this is right

Memory size grows sublinearly with the number of episodes because many experiences share root foundations.
Retrieval conflicts decrease because structurally related experiences are retrieved through the same chain rather than as independent conflicting units.
The memory structure improves over time as frequent paths become new roots without external supervision.
Agents maintain consistent guidance across repeated tasks even when surface details vary slightly.
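A toy calculation of the storage saving, under assumed numbers (a 20-item shared base and 2 new items per episode; both are illustrative, not figures from the paper). The function names are hypothetical. In this simple model the tree still grows with every episode, only by the delta rather than by the full record; whether growth is sublinear in practice depends on how many episodes need no new delta at all, which the sketch does not model.

```python
def flat_size(n_episodes: int, base_len: int, delta_len: int) -> int:
    """Items stored when every episode is kept as an independent record."""
    return n_episodes * (base_len + delta_len)


def residual_size(n_episodes: int, base_len: int, delta_len: int) -> int:
    """Items stored when the base is kept once and each episode adds a delta."""
    return base_len + n_episodes * delta_len


for n in (1, 10, 100):
    print(n, flat_size(n, 20, 2), residual_size(n, 20, 2))
```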

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same residual-tree approach could be tested in non-LLM sequential learners such as reinforcement-learning agents that also accumulate experience over long horizons.
If the consolidation step reliably identifies reusable sub-structures, it might be applied to compress other forms of episodic memory such as dialogue histories or robot trajectories.
Scalability questions arise for interaction lengths much longer than those tested, where the depth of delta chains could affect reconstruction speed.

Load-bearing premise

New experiences can be represented as clean incremental deltas from existing ones without information loss or retrieval errors.

What would settle it

A controlled test in which a newly added experience is stored as a delta yet the reconstructed memory differs from the original in a detail that changes the agent's next action.
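Such a test can be sketched as a round-trip check. The version below assumes an append-only delta (the steps of a new episode that the base lacks), which is a deliberately simple stand-in and not the paper's delta construction; `make_delta`, `compose`, and `round_trips` are hypothetical names.

```python
def make_delta(base: list[str], episode: list[str]) -> list[str]:
    """Append-only delta: steps of `episode` that the base does not contain."""
    return [step for step in episode if step not in base]


def compose(base: list[str], delta: list[str]) -> list[str]:
    """Reconstruct by appending the delta to the base."""
    return base + delta


def round_trips(base: list[str], episode: list[str]) -> bool:
    """True when storing `episode` as a delta loses nothing."""
    return compose(base, make_delta(base, episode)) == episode


base = ["go to fridge", "take apple"]
added_step = ["go to fridge", "take apple", "heat apple with microwave"]
inserted_step = ["go to fridge", "open fridge", "take apple"]

print(round_trips(base, added_step))      # True
print(round_trips(base, inserted_step))   # False
```

Under this assumption an episode that only appends a step survives the round trip, while one that inserts a step mid-sequence is reconstructed in the wrong order. That is the kind of action-changing mismatch the test looks for.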

Figures

Figures reproduced from arXiv: 2606.03083 by Haoran Tan, Rui Li, Xu Chen, Zeyu Zhang, Zhicheng Cao.

**Figure 1.** Three memory storage paradigms. Left: flat trajectory storage. Middle: compact workflow/insight extraction. Right: DELTAMEM’s residual tree with a generalized base experience and incremental ∆-nodes.

**Figure 2.** Overview of DELTAMEM. Top: End-to-end pipeline. A task instruction triggers retrieval from the dual trees, the reconstructed memory context is injected into the agent for environment interaction, and the resulting experience is extracted back into the trees via online learning. Bottom-left: Tree details showing node content structure, global search across all trees, and the reconstructed root-to-match path…

**Figure 3.** Autonomous memory consolidation in DELTAMEM. Residual nodes accumulate with success-hit counts (Phase 1); upon reaching K_cons, an LLM fuses the chain into a new root node R∗ (Phase 2); subsequent queries match R∗ directly, bypassing chain traversal (Phase 3).

**Figure 4.** Performance across a grid of (τ task base, τ env base) under four K_cons levels for ALFWorld (top) and SciWorld (bottom). Gold box marks the per-subplot optimum for each consolidation setting.

**Figure 5.** Node distribution by tree depth under different parameter settings. Top row: ALFWorld; bottom row: …

read the original abstract

Large Language Model (LLM)-based agents increasingly rely on memory to learn from experiences over continual interactions. However, storing experiences as independent, flat units leads to substantial redundancy and retrieval conflicts, as similar episodes repeat overlapping content and subtle scene variations cause retrieved memories to offer contradictory guidance. To address this, we introduce residual experience, positing that newly acquired experience is often an incremental variation of existing knowledge. We propose DeltaMem, a framework that organizes experience memory into two independent residual trees, one storing goal-conditioned task experience as reusable skills and another for scene-level environment knowledge. Each tree uses a root node for generalized base experiences and incremental delta nodes for subsequent variations, allowing related experiences to share a common foundation without duplication. For retrieval, a failure-penalized similarity scan locates the best match, reconstructing the full experience via root-to-match chain composition. An autonomous consolidation mechanism distills high-frequency paths into new root nodes, enabling the trees to self-organize from general heuristics to specialized variants. Experiments across diverse interactive environments show that DeltaMem consistently outperforms existing baselines. To facilitate future research, we release the code at https://github.com/import-myself/DeltaMem.
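The failure-penalized scan can be illustrated with a small sketch. It assumes the score is cosine similarity minus a penalty proportional to an entry's observed failure rate; that scoring form, the `penalty` weight of 0.5, and the names `Entry`, `score`, and `best_match` are all assumptions made for illustration, since the abstract does not give the formula.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    """A stored node with an embedding and outcome counts (hypothetical)."""
    name: str
    embedding: list[float]
    successes: int = 0
    failures: int = 0


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score(query: list[float], e: Entry, penalty: float = 0.5) -> float:
    """Similarity minus a penalty proportional to the entry's failure rate."""
    uses = e.successes + e.failures
    failure_rate = e.failures / uses if uses else 0.0
    return cosine(query, e.embedding) - penalty * failure_rate


def best_match(query: list[float], entries: list[Entry]) -> Entry:
    """Linear scan over all entries; returns the highest-scoring one."""
    return max(entries, key=lambda e: score(query, e))


entries = [
    Entry("close-but-unreliable", [1.0, 0.0], successes=1, failures=9),
    Entry("further-but-reliable", [0.8, 0.6], successes=9, failures=1),
]
print(best_match([1.0, 0.0], entries).name)
```

In this example the nearest entry by raw similarity loses to a slightly more distant one with a better track record, which is the behaviour a failure penalty is meant to produce.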

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

DeltaMem uses residual trees to share base experiences and store only deltas for LLM agent memory, which targets a real redundancy problem but leaves the lossless reconstruction claim unexamined.


DeltaMem organizes experience memory for LLM agents into residual trees to avoid the redundancy and conflicts that come with storing each episode separately. There are two trees: one for reusable skills tied to goals and one for scene knowledge. Roots hold the general versions and delta nodes add the variations, so common parts are not duplicated. Retrieval uses a similarity scan that penalizes failures to pick the best match, then composes the full memory from the root down the chain. An autonomous process turns frequent paths into new roots to keep the trees organized.
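The consolidation step, as the Figure 3 caption describes it (success-hit counts, a threshold K_cons, fusion of the chain into a new root), might look like the sketch below. The paper uses an LLM for fusion; `concat_fuse` is a plain concatenation stand-in, and `CountedNode`, `chain_to`, `record_success`, and the hit-count reset are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(eq=False)
class CountedNode:
    content: list[str]
    parent: Optional["CountedNode"] = None
    hits: int = 0                            # successful retrievals of this node


def chain_to(node: Optional[CountedNode]) -> list[CountedNode]:
    """Nodes on the path from the root to `node`, root first."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path[::-1]


def concat_fuse(chain: list[CountedNode]) -> list[str]:
    """Stand-in for the LLM fusion step: concatenate contents in order."""
    return [step for n in chain for step in n.content]


def record_success(
    node: CountedNode,
    roots: list[CountedNode],
    k_cons: int,
    fuse: Callable[[list[CountedNode]], list[str]] = concat_fuse,
) -> Optional[CountedNode]:
    """Count a successful hit; promote the chain to a new root at k_cons hits."""
    node.hits += 1
    if node.parent is None or node.hits < k_cons:
        return None
    new_root = CountedNode(content=fuse(chain_to(node)))
    roots.append(new_root)
    node.hits = 0                            # assumption: avoid re-promoting at once
    return new_root


root = CountedNode(["take obj", "put obj"])
delta = CountedNode(["open receptacle first"], parent=root)
roots = [root]
promoted = [record_success(delta, roots, k_cons=3) for _ in range(3)]
print(promoted[-1].content)
```

After the third successful hit the chain is fused into a standalone root, so a later query can match it directly instead of composing the chain again.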

The novelty lies in the tree structure and the consolidation mechanism. The paper does a solid job of identifying how overlapping experiences cause problems in agent memory and of proposing a way to share foundations.

The soft spot is the assumption that experiences are always incremental enough to be stored as clean deltas. The abstract does not say how deltas are constructed, nor does it show that the composition step reconstructs an experience without error or loss. If variations are more complex than a small increment, the reported gains over baselines might not hold. The experiments are described only at a high level, so it is hard to tell how strong the evidence is.

This paper is for researchers focused on building better memory for long-running LLM agents. Readers interested in continual learning setups would find the framework worth looking at. It deserves peer review to see if the full methods and results back up the claims.

Referee Report

2 major / 0 minor

Summary. The paper introduces DeltaMem, a framework for LLM agent memory that represents experiences as residual trees: one tree for goal-conditioned task skills and another for scene-level knowledge. New experiences are stored as incremental delta nodes attached to generalized root nodes, with retrieval performed via a failure-penalized similarity scan followed by root-to-match chain composition to reconstruct full experiences. An autonomous consolidation step distills frequent paths into new roots. The abstract states that experiments across diverse interactive environments show consistent outperformance over baselines, and the code is released at the provided GitHub link.

Significance. If the central claims hold, the residual-tree approach could meaningfully reduce memory redundancy and retrieval conflicts in continual agent learning, offering a structured alternative to flat memory stores. The explicit release of code is a clear strength that supports reproducibility and extension by the community.

major comments (2)

[Abstract] The claim that residual trees enable 'lossless' representation of incremental variations without retrieval errors is load-bearing for the outperformance result, yet the manuscript provides no formal definition of delta construction, no invertibility argument for the root-to-match chain, and no ablation isolating the tree mechanism from other components.
[Abstract] The statement that DeltaMem 'consistently outperforms existing baselines' is the primary empirical claim, but the abstract supplies no environments, metrics, baselines, run counts, or statistical details, leaving the soundness of the result impossible to assess from the provided text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract. We address the two major comments below and will make revisions to strengthen the presentation of formal aspects and empirical details.


Referee: [Abstract] The claim that residual trees enable 'lossless' representation of incremental variations without retrieval errors is load-bearing for the outperformance result, yet the manuscript provides no formal definition of delta construction, no invertibility argument for the root-to-match chain, and no ablation isolating the tree mechanism from other components.

Authors: Section 3 of the manuscript defines residual deltas as incremental variations from shared root nodes and describes the root-to-match chain composition for reconstruction. The design ensures invertibility by sequential application of deltas. We agree, however, that an explicit formal argument for invertibility and an ablation isolating the residual-tree structure from components such as failure-penalized retrieval would strengthen the claims. We will add a concise formal definition and invertibility sketch to the methods section and include a targeted ablation in the experiments.
Revision: yes

Referee: [Abstract] The statement that DeltaMem 'consistently outperforms existing baselines' is the primary empirical claim, but the abstract supplies no environments, metrics, baselines, run counts, or statistical details, leaving the soundness of the result impossible to assess from the provided text.

Authors: The full manuscript (Section 5) reports experiments across multiple interactive environments, using task success rate and efficiency metrics, against baselines such as flat memory stores and prior LLM-agent memory methods, with results averaged over repeated runs and accompanied by statistical analysis. We agree the abstract would be more informative with these details. We will revise the abstract to concisely note the environments, primary metrics, and that outperformance is observed with statistical support across runs.
Revision: yes

Circularity Check

0 steps flagged

No circularity; framework is self-contained design choice with empirical claims


The paper introduces DeltaMem as an architectural framework based on the posited assumption of incremental experience variations, organized into residual trees with root-to-match reconstruction. No equations, fitted parameters, predictions, or derivation steps are present that reduce to inputs by construction. No self-citations or uniqueness theorems are invoked. The central claims rest on the proposed mechanism and experimental outperformance, which are independent of any circular reduction. This matches the default expectation of no significant circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The framework rests on the domain assumption that experiences are incremental variations suitable for tree compression; no free parameters or invented physical entities are described.

axioms (1)

Domain assumption: newly acquired experience is often an incremental variation of existing knowledge.
Stated directly in abstract as the basis for residual experience.

invented entities (2)

residual experience (no independent evidence)
purpose: To represent incremental variations instead of full duplicate memories.
Core new concept introduced to address redundancy.

residual trees (no independent evidence)
purpose: To organize experiences with shared roots and delta nodes.
Structural invention for the memory framework.


