MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning

Bowen Shen; Haoran Ou; Kaiyu Zhou; Kwok-Yan Lam; Tianwei Zhang; Xuanye Zhang; Yongsen Zheng; Zhuqin Xu

arxiv: 2605.26154 · v1 · pith:4BMHFWMQnew · submitted 2026-05-24 · 💻 cs.CR · cs.AI

MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning

Xuanye Zhang , Yongsen Zheng , Zhuqin Xu , Kaiyu Zhou , Bowen Shen , Haoran Ou , Tianwei Zhang , Kwok-Yan Lam This is my paper

classification 💻 cs.CR cs.AI

keywords toolagentsmemmorphmemoryagentattackrecordslong-term

0 comments

read the original abstract

LLM-driven agents are capable of selecting external tools to complete users' tasks. However, attackers could compromise such process, steering agents toward inappropriate/wrong tools and enabling malicious actions. Most existing attacks primarily manipulate the tool metadata, which is easily detectable by auditing and may lose effectiveness as modern agents increasingly adopt memory modules to refine tool selection policies through accumulated experience. This paper proposes MemMorph, the first attack that bias tool selection by poisoning the agent's long-term memory. Rather than explicitly dictating the tool invocation decision, MemMorph injects a small number of crafted records that are disguised as technical facts, incident reports, and operational policies. These poisoned records reshape the agent's contextual perception and decision-making process, leading it to autonomously infer and select the tool preferred by the attacker. Experiments across 3 benchmarks, 10 agent backbones, and 3 memory-module implementations show that MemMorph achieves up to 85.9% attack success rate with only three injected records, outperforming the strongest baseline by up to 25% while retaining potency under 3 representative defenses. Our findings expose long-term memory as a critical and under-explored attack surface in tool-augmented agents, urging the development of memory-level integrity safeguards.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Securing LLM-Agent Long-Term Memory Against Poisoning: Non-Malleable, Origin-Bound Authority with Machine-Checked Guarantees
cs.CR 2026-06 unverdicted novelty 7.0 partial

Presents TMA-NM, a non-malleable origin-bound authority system for LLM-agent memory with TLA+ machine-checked separation theorems and benchmarks showing 0% attack success against direct and laundering poisoning while ...