pith:IPIU45KI
Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
A new benchmark shows current LLM memory agents fall short on four core competencies from cognitive science.
arxiv:2507.05257 v3 · 2025-07-07 · cs.CL · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{IPIU45KI5Y5THLIDAAODW5VMD2}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Empirical results reveal that current methods fall short of mastering all four competencies, underscoring the need for further research into comprehensive memory mechanisms for LLM agents.
The benchmark rests on two assumptions: that the four competencies drawn from memory science form the complete and essential set for memory agents, and that transforming static long-context datasets into incremental multi-turn interactions preserves the original properties needed to measure those competencies.
MemoryAgentBench is a new multi-turn benchmark assessing four memory competencies in LLM agents—accurate retrieval, test-time learning, long-range understanding, and selective forgetting—showing that existing methods fall short.
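The idea of turning a static long-context dataset into incremental multi-turn interactions can be sketched roughly as follows. This is an illustration only, not the benchmark's actual pipeline: the chunk size, the `to_turns` helper, and the toy `ListMemoryAgent` class are hypothetical names introduced here, and the keyword lookup stands in for whatever memory mechanism a real agent would use.

```python
from dataclasses import dataclass, field


def to_turns(context: str, chunk_words: int) -> list[str]:
    """Split a long context into fixed-size word chunks, one chunk per turn.

    Hypothetical helper; a real pipeline may chunk by tokens or documents.
    """
    if chunk_words <= 0:
        raise ValueError("chunk_words must be positive")
    words = context.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


@dataclass
class ListMemoryAgent:
    """Toy agent that stores every chunk verbatim (assumption, for illustration)."""

    memory: list[str] = field(default_factory=list)

    def observe(self, chunk: str) -> None:
        # The agent sees one chunk per turn and never the full context at once.
        self.memory.append(chunk)

    def retrieve(self, keyword: str) -> list[str]:
        # Naive case-insensitive substring match over stored chunks.
        return [c for c in self.memory if keyword.lower() in c.lower()]


context = (
    "The key is under the mat. The meeting moved to Friday. "
    "The key is now in the drawer."
)
turns = to_turns(context, chunk_words=6)

agent = ListMemoryAgent()
for turn in turns:
    agent.observe(turn)

# Questions are asked only after all chunks have been fed incrementally.
hits = agent.retrieve("key")
print(hits)
```

In this toy context the later chunk supersedes the earlier one, so an agent that returns both hits without preferring the newer fact would handle retrieval but not selective forgetting.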
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:46.539410Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
43d14e7548ee3b33ad03001c3b76ac1e8913fe2aa03e3d2c7b29b25761351ca7
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/IPIU45KI5Y5THLIDAAODW5VMD2 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 43d14e7548ee3b33ad03001c3b76ac1e8913fe2aa03e3d2c7b29b25761351ca7
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "d5575774f38f003f816bf127894567467356de2653671037fbbfcbeba78a730e",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-07-07T17:59:54Z",
    "title_canon_sha256": "a140b706cb55ff33ce6a93ec468408a531bbaab950f09d3b67bb9b418811dac5"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2507.05257",
    "kind": "arxiv",
    "version": 3
  }
}
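The same check can be run offline against the record printed above. The sketch below mirrors the canonicalization used in the shell pipeline (sorted keys, compact separators, UTF-8, SHA-256). The `canonical_bytes` and `canonical_sha256` helper names are introduced here for illustration, and whether the record as displayed on this page matches the served record byte for byte is an assumption, so the script reports a match or mismatch rather than assuming one.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section of this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "d5575774f38f003f816bf127894567467356de2653671037fbbfcbeba78a730e",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-07-07T17:59:54Z",
        "title_canon_sha256": "a140b706cb55ff33ce6a93ec468408a531bbaab950f09d3b67bb9b418811dac5",
    },
    "schema_version": "1.0",
    "source": {"id": "2507.05257", "kind": "arxiv", "version": 3},
}

# Hash listed under "Canonical hash" on this page.
expected = "43d14e7548ee3b33ad03001c3b76ac1e8913fe2aa03e3d2c7b29b25761351ca7"


def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and no whitespace, then encode as UTF-8."""
    text = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_sha256(obj) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("match" if digest == expected else "mismatch")
```

Because keys are sorted and whitespace is stripped before hashing, the indentation and key order of the JSON as displayed do not affect the digest.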