Neural Procedural Memory: Empowering LLM Agents with Implicit Activation Steering

Chengfeng Zhao; Jun Zhao; Kang Liu; Shizhu He; Yequan Wang; Yuqiao Tan

arXiv: 2606.29824 · v1 · pith:IIWQVRGSnew · submitted 2026-06-29 · 💻 cs.CL · cs.AI

Pith reviewed 2026-06-30 06:14 UTC · model grok-4.3

classification 💻 cs.CL cs.AI

keywords Neural Procedural Memory · activation steering · LLM agents · procedural memory · implicit guidance · contrastive experiences · training-free methods

The pith

LLM agents can store procedural memory as activation steering vectors distilled from past experiences instead of text instructions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Neural Procedural Memory as a training-free method that turns historical contrastive experiences into vectors in the model's activation space. These vectors steer the LLM toward the internal mechanisms needed for a task, bypassing the need to insert explicit written guidelines into the prompt. A sympathetic reader would care because many current agent systems suffer when symbolic instructions fail to trigger the right behaviors inside the model. The approach matches the performance of text-based methods on four benchmarks and improves further when the two are combined.

Core claim

By distilling procedural skills from historical contrastive experiences into steering vectors in the activation space, NPM directly activates the task-relevant neural mechanisms to guide task execution without any model training or explicit symbolic guidance.

What carries the argument

Steering vectors created from contrastive historical experiences that encode task logic and are applied at inference time to shift activations.
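A minimal sketch of the general idea, assuming the common difference-of-means recipe from contrastive activation addition rather than the paper's exact extraction procedure, which this page does not spell out. The function names, the toy 3-dimensional activations, and the scaling factor `alpha` are all illustrative assumptions.

```python
from statistics import fmean

def mean_vector(rows):
    """Column-wise mean of equal-length activation vectors."""
    return [fmean(col) for col in zip(*rows)]

def steering_vector(positive_acts, negative_acts):
    """Difference of means: points from the negative toward the positive examples."""
    mu_pos = mean_vector(positive_acts)
    mu_neg = mean_vector(negative_acts)
    return [p - n for p, n in zip(mu_pos, mu_neg)]

def steer(hidden, vector, alpha=1.0):
    """Shift one hidden state along the steering direction (alpha is hypothetical)."""
    return [h + alpha * v for h, v in zip(hidden, vector)]

# Toy activations standing in for real hidden states (assumed, not from the paper).
positive_acts = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]  # e.g. successful experiences
negative_acts = [[0.0, 2.0, 1.0], [0.0, 2.0, 3.0]]  # e.g. failed experiences

v = steering_vector(positive_acts, negative_acts)   # [2.0, 0.0, -2.0]
steered = steer([0.5, 0.5, 0.5], v, alpha=0.5)      # [1.5, 0.5, -0.5]
```

In a real model the shift would be applied to the residual stream at one or more layers during generation; which layers and what strength are choices the page leaves unspecified.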

If this is right

Agents using only the implicit steering vectors achieve success rates comparable to those of agents given explicit textual workflows.
Pairing the steering vectors with explicit instructions produces higher robustness than either alone.
The vectors form consistent, organized structures in activation space that reflect task logic across examples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same contrastive distillation process might be applied to other internal states such as attention patterns if those also encode procedural knowledge.
Steering could reduce the length of context needed for agent prompts by moving memory out of tokens and into activations.
If the activation space contains reusable task modules, the method might transfer across domains that share underlying logic.

Load-bearing premise

Vectors taken from past contrastive experiences will reliably activate the correct internal mechanisms on new tasks without training or explicit rules.

What would settle it

Apply the derived steering vectors to genuinely new tasks and check whether the success rate improves over an unsteered baseline, stays the same, or drops.
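One way such a check could be scored is sketched here, assuming paired 0/1 outcomes for the same held-out tasks under steered and unsteered runs. The paired bootstrap, the function names, and the outcome lists are illustrative assumptions; none of the numbers come from the paper.

```python
import random

def success_rate(outcomes):
    """Fraction of episodes that succeeded (outcomes are 0/1 flags)."""
    return sum(outcomes) / len(outcomes)

def paired_bootstrap_diff(steered, baseline, n_resamples=2000, seed=0):
    """Point estimate and 95% bootstrap interval for (steered - baseline).

    Both lists must cover the same held-out tasks in the same order.
    """
    if len(steered) != len(baseline):
        raise ValueError("outcome lists must be paired per task")
    rng = random.Random(seed)
    n = len(steered)
    diffs = []
    for _ in range(n_resamples):
        # Resample task indices with replacement, keeping pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(steered[i] - baseline[i] for i in idx) / n)
    diffs.sort()
    lo = diffs[n_resamples * 25 // 1000]
    hi = diffs[n_resamples * 975 // 1000 - 1]
    return success_rate(steered) - success_rate(baseline), lo, hi

# Hypothetical outcomes on ten unseen tasks (illustrative only).
steered_outcomes = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
baseline_outcomes = [1, 0, 0, 1, 0, 0, 1, 1, 0, 0]
diff, lo, hi = paired_bootstrap_diff(steered_outcomes, baseline_outcomes)
```

An interval that sits clearly above zero on tasks unlike those used to build the vectors would support the premise; an interval that straddles or falls below zero would count against it.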

Figures

Figures reproduced from arXiv: 2606.29824 by Chengfeng Zhao, Jun Zhao, Kang Liu, Shizhu He, Yequan Wang, Yuqiao Tan.

**Figure 1.** Comparison between Declarative and Procedural Memory in LLM Agents. Left: Declarative memory successfully grounds reasoning in static factual knowledge, enabling the agent to strictly adhere to explicit user constraints. Right: Textual procedural memory can introduce a text-action disconnect where the agent struggles to align the retrieved workflow onto the execution trajectory and omits intermediate ste…

**Figure 2.** Overview of the Neural Procedural Memory (NPM) framework. (1) Contrastive Experience Construction: Formulating dual-granularity (inter- and intra-trajectory) contrastive pairs from historical interactions. (2) Procedural Memory Extraction: Extracting continuous representations from these pairs to construct a historical memory repository. (3) Inference-Time Intervention: Retrieving relevant experiences to d…

**Figure 3.** Performance comparison of different steering…

**Figure 5.** Geometric consistency of procedural steering…

**Figure 6.** Temporal activation of behavioral primitives across execution steps in a PickHeat task. (Left) The unsteered baseline exhibits an extended 33-step trajectory. (Middle) Inter-trajectory steering is associated with a shorter trajectory by promoting early planning and object placement. (Right) Intra-trajectory steering is observed to amplify search features and suppress premature task termination signals duri…

**Figure 7.** The impact of retrieval pool size on specific…

**Figure 8.** Prompt template for automated interpretation of geometric basis directions.

**Figure 9.** Normalized entropy for inter-trajectory and intra-trajectory steering vectors across the ALFWorld benchmark.

**Figure 10.** Top-3 concentration for inter-trajectory…

**Figure 11.** Example of Insights Baseline for WebShop

**Figure 12.** Example of Workflows Baseline for ALFWorld
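Figures 9 and 10 report normalized entropy and top-3 concentration for the steering vectors, but the captions shown here do not define either quantity. The sketch below uses the standard definitions, assuming each vector is summarized as non-negative weights over a set of basis directions; that framing, the function names, and the example weights are assumptions.

```python
import math

def normalize(weights):
    """Turn non-negative weights into a probability distribution."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]

def normalized_entropy(weights):
    """Shannon entropy divided by log(n): 0 = one direction, 1 = uniform."""
    p = normalize(weights)
    if len(p) < 2:
        return 0.0
    h = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return h / math.log(len(p))

def top_k_concentration(weights, k=3):
    """Share of total weight carried by the k largest components."""
    p = sorted(normalize(weights), reverse=True)
    return sum(p[:k])

# Hypothetical weights of one steering vector over four basis directions.
weights = [0.5, 0.3, 0.1, 0.1]
entropy = normalized_entropy(weights)
top3 = top_k_concentration(weights, k=3)
```

Under these definitions, lower entropy and higher top-3 concentration both indicate that a vector loads on a small number of directions.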

Original abstract

While Large Language Models (LLMs) excel as static solvers, transforming them into autonomous agents remains challenging. This transition requires continuous environmental interaction, yet current agents lack the necessary persistent procedural memory. Existing approaches predominantly employ Retrieval-Augmented Generation (RAG) to inject explicit textual guidelines into model contexts. However, relying solely on symbolic instructions can introduce a text-action disconnect, frequently failing to activate the internal representations necessary for correct task execution. To address this, the paper introduces Neural Procedural Memory (NPM), a training-free framework that represents agent memory through implicit activation steering rather than explicit instructions. By distilling procedural skills from historical contrastive experiences into steering vectors in the activation space, NPM directly activates the task-relevant neural mechanisms to guide task execution. Evaluations across four agent benchmarks show that NPM performs comparably to baselines using explicit textual instructions. Furthermore, the results show that combining implicit steering with explicit workflows provides complementary advantages, leading to more robust task execution. Representational analyses indicate that these steering vectors encode consistent task logic, forming organized structures within the activation space. These findings suggest that implicit activation steering provides a promising approach for managing agent memory.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

NPM frames agent memory as contrastive steering vectors with benchmark parity to RAG, but the causal activation claim rests on correlation without targeted interventions.

The paper introduces Neural Procedural Memory, a training-free way to turn historical contrastive experiences into activation steering vectors that guide LLM agents. It claims that this implicit approach matches explicit RAG baselines on four benchmarks, and it reports some organized structure in the activation space.

What is actually new is the explicit move away from symbolic instructions toward direct activation steering as a form of procedural memory. The work does a reasonable job documenting that the combined implicit-plus-explicit setup improves robustness and that the vectors appear to encode consistent task logic in the representational analyses.

The soft spot is the missing causal link. The claim that these vectors reliably activate the right internal mechanisms for unseen tasks is supported mainly by performance correlation and the representational patterns, but the paper does not appear to include layer-specific interventions, ablation on the contrastive construction, or systematic failure analysis on out-of-distribution tasks. Without those, other explanations such as generic regularization effects remain possible. The abstract also gives little detail on vector extraction, statistical reporting, or dataset construction, which limits how firmly the results can be assessed.

This is for researchers working on LLM agents who are already exploring activation engineering or alternatives to RAG. Someone looking for a practical new memory mechanism might extract useful ideas from the empirical comparisons, while a reader wanting mechanistic proof would find the current evidence preliminary.

I would send it to peer review. The framing is distinct enough and the benchmark results are concrete enough that referees could usefully pressure the causal claims and ask for the missing controls.

Referee Report

1 major / 2 minor

Summary. The paper introduces Neural Procedural Memory (NPM), a training-free method that distills procedural skills from historical contrastive experiences into activation-space steering vectors. These vectors are claimed to implicitly activate task-relevant neural mechanisms in LLMs, enabling agent memory without explicit textual instructions. Evaluations on four agent benchmarks show performance comparable to explicit-instruction baselines, with complementary gains when combined; representational analyses indicate the vectors encode consistent task logic in organized activation-space structures.

Significance. If the central claim holds, NPM offers a novel implicit alternative to RAG-style explicit memory for LLM agents, potentially reducing text-action disconnects. The training-free nature and use of contrastive historical data are strengths, as is the reported complementarity with explicit workflows and the representational evidence of structured encodings.

major comments (1)

Abstract and §4 (Experiments): The core claim that steering vectors 'directly activate the task-relevant neural mechanisms' is load-bearing but rests on performance correlation and representational similarity analyses. No targeted causal interventions (e.g., ablating or steering only at layers known to implement specific computations) or systematic failure-case analysis on out-of-distribution tasks are described, leaving open whether effects arise from the claimed mechanism or from generic regularization/prompt-like influences. This weakens the distinction from explicit baselines.
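One inexpensive control for the generic-regularization explanation is to steer with a random direction of the same norm as the distilled vector and compare outcomes. The sketch below builds such a control; it is not described in the paper, and the function names and the example vector are assumptions.

```python
import math
import random

def norm(vec):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(x * x for x in vec))

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def random_control(vector, seed=0):
    """Random Gaussian direction rescaled to the same L2 norm as `vector`."""
    rng = random.Random(seed)
    direction = [rng.gauss(0.0, 1.0) for _ in vector]
    scale = norm(vector) / norm(direction)
    return [scale * d for d in direction]

# Hypothetical distilled steering vector (toy dimensionality).
steering = [2.0, 0.0, -2.0, 1.0]
control = random_control(steering, seed=0)
overlap = cosine(steering, control)
```

If the matched-norm control gave gains similar to the distilled vector, the task-specific reading would be hard to sustain; a clear gap would favor it, though it would still fall short of a layer-level causal test.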

minor comments (2)

Abstract: The four agent benchmarks are not named, and no dataset sizes, statistical significance, or variance reporting appear; adding these would improve clarity without altering the claim.
§3 (Method): The precise construction of contrastive pairs and the layer(s) at which steering is applied should be stated explicitly with pseudocode or equations to allow reproduction.
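To illustrate the level of detail the second comment asks for, here is a sketch of the retrieval stage named in Figure 2. It assumes a memory of (task embedding, steering vector, layer) entries, cosine-similarity top-k retrieval, and plain averaging; all of these, including the `MemoryEntry` fields and layer indices, are hypothetical stand-ins and not the authors' procedure.

```python
import math
from dataclasses import dataclass

@dataclass
class MemoryEntry:
    key: list      # embedding of the stored task description (hypothetical)
    vector: list   # steering vector distilled from that experience
    layer: int     # layer the vector was extracted at (hypothetical)

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(memory, query, layer, k=2):
    """Return the k entries for `layer` whose keys are most similar to `query`."""
    candidates = [m for m in memory if m.layer == layer]
    candidates.sort(key=lambda m: cosine(m.key, query), reverse=True)
    return candidates[:k]

def aggregate(entries):
    """Average the retrieved steering vectors into one intervention vector."""
    if not entries:
        raise ValueError("no memory entries retrieved")
    return [sum(col) / len(entries) for col in zip(*(e.vector for e in entries))]

# Toy repository (illustrative values only).
memory = [
    MemoryEntry(key=[1.0, 0.0], vector=[2.0, 0.0, -2.0], layer=12),
    MemoryEntry(key=[0.9, 0.1], vector=[0.0, 2.0, 0.0], layer=12),
    MemoryEntry(key=[0.0, 1.0], vector=[9.0, 9.0, 9.0], layer=12),
    MemoryEntry(key=[1.0, 0.0], vector=[5.0, 5.0, 5.0], layer=20),
]
hits = retrieve(memory, query=[1.0, 0.05], layer=12, k=2)
combined = aggregate(hits)  # [1.0, 1.0, -1.0]
```

Stating the equivalent of `layer`, `k`, and the aggregation rule in the paper would let a reader reproduce the intervention without guessing.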

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the strength of evidence for our mechanistic claims. We address the single major comment point by point below.

Referee: Abstract and §4 (Experiments): The core claim that steering vectors 'directly activate the task-relevant neural mechanisms' is load-bearing but rests on performance correlation and representational similarity analyses. No targeted causal interventions (e.g., ablating or steering only at layers known to implement specific computations) or systematic failure-case analysis on out-of-distribution tasks are described, leaving open whether effects arise from the claimed mechanism or from generic regularization/prompt-like influences. This weakens the distinction from explicit baselines.

Authors: We agree that the evidence presented is correlational, relying on performance parity with explicit-instruction baselines, complementarity when combined, and representational similarity showing organized task-logic structures in activation space. No layer-specific causal ablations or systematic OOD failure analyses are included, as the work focuses on a training-free distillation approach without requiring prior identification of computation-specific layers. The contrastive construction of the vectors is designed to capture task-specific rather than generic effects, but we acknowledge this does not constitute direct causal proof and that alternative interpretations (e.g., regularization-like influences) cannot be fully ruled out with the current analyses. For the revision we will (1) revise the abstract and §4 language to describe the vectors as implicitly guiding execution via distilled activations rather than claiming they 'directly activate' mechanisms, (2) expand the representational analysis section with additional failure-case breakdowns from the existing benchmark results, and (3) add an explicit limitations paragraph discussing the correlational nature of the evidence and the value of future causal work. These changes will be made without new experiments.

revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

The paper introduces NPM as a training-free method that constructs steering vectors from external historical contrastive experiences and evaluates them on agent benchmarks plus representational analyses. No equations, fitted parameters renamed as predictions, self-citations used as load-bearing uniqueness theorems, or ansatzes smuggled via prior work appear in the provided text. The central claims rest on empirical performance comparisons and activation-space observations that are independent of the method's definition, rendering the derivation self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only view supplies no identifiable free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5741 in / 906 out tokens · 23985 ms · 2026-06-30T06:14:57.280795+00:00 · methodology

