memorywire: A Vendor-Neutral Wire Format for Agent Memory Operations
Pith reviewed 2026-06-28 17:07 UTC · model grok-4.3
The pith
A JSON-Schema wire format standardizes five memory operations over four memory types for use across agent frameworks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
memorywire is a JSON-Schema 2020-12 wire format for five memory operations over four memory types, equipped with a MemoryStore interface, a fan-out router, and an optional HITL governance channel, that can be mapped to the internal models of existing agent-memory frameworks through adapters and validated by performance and conformance measurements.
What carries the argument
The memorywire JSON-Schema 2020-12 specification, which encodes the five operations and four memory types as a common operational vocabulary.
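To make the vocabulary concrete, the sketch below shows what a message carrying one of the five operations over one of the four memory types might look like. Only the operation and memory-type names come from the abstract. The envelope fields (`op`, `memory_type`, `payload`) and the `validate_message` helper are hypothetical illustrations, not the published memorywire schema.

```python
import json

# Operation and memory-type names are taken from the paper's abstract.
OPERATIONS = {"remember", "recall", "forget", "merge", "expire"}
MEMORY_TYPES = {"semantic", "episodic", "procedural", "emotional"}


def validate_message(raw: str) -> dict:
    """Parse a JSON message and check the (hypothetical) envelope fields."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("message must be a JSON object")
    if msg.get("op") not in OPERATIONS:
        raise ValueError(f"unknown operation: {msg.get('op')!r}")
    if msg.get("memory_type") not in MEMORY_TYPES:
        raise ValueError(f"unknown memory type: {msg.get('memory_type')!r}")
    if not isinstance(msg.get("payload"), dict):
        raise ValueError("payload must be a JSON object")
    return msg


# Assumed field names; a real message would follow the published schema.
example = json.dumps({
    "op": "remember",
    "memory_type": "semantic",
    "payload": {"text": "The user prefers metric units."},
})
message = validate_message(example)
```

A real deployment would validate against the JSON-Schema 2020-12 documents themselves rather than hand-written checks. The hand-rolled version here only keeps the example dependency-free.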
If this is right
- Agent frameworks can adopt a shared interface for memory writes and reads instead of maintaining separate SDKs.
- Memory data can migrate between frameworks without reconstruction from raw sources.
- Human review of proposed memory writes becomes available through the optional governance channel.
- Under the paper's rank-0 injection sweep, Reciprocal Rank Fusion of results from multiple backends holds recall@5 = 1.000, whereas max fusion collapses to 0.500.
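The fusion claim in the last bullet can be illustrated with a toy example. This is a sketch under stated assumptions, not the paper's experimental setup: the three backends, their document ids, and their scores are invented, and `k = 60` is the constant used in the original RRF paper by Cormack et al. RRF scores a document as the sum over backends of 1 / (k + rank), so a single backend that injects high-scoring junk cannot outvote agreement among the others. Max fusion trusts the raw score and can be swamped.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over all ranked lists."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, (doc_id, _raw_score) in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def max_fuse(rankings):
    """Max fusion: keep each document's highest raw score across lists."""
    scores = {}
    for ranking in rankings:
        for doc_id, raw_score in ranking:
            scores[doc_id] = max(raw_score, scores.get(doc_id, float("-inf")))
    return sorted(scores, key=scores.get, reverse=True)


# Two honest backends agree that "g" (the gold document) ranks first.
honest = [("g", 0.9), ("a1", 0.8), ("a2", 0.7), ("a3", 0.6), ("a4", 0.5)]
# One adversarial backend injects five irrelevant documents with top scores.
adversary = [(f"x{i}", 1.0) for i in range(5)]

rankings = [honest, honest, adversary]
rrf_top5 = rrf_fuse(rankings)[:5]
max_top5 = max_fuse(rankings)[:5]
```

In this toy setting the gold document stays in the RRF top five, while max fusion fills all five slots with injected documents.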
Where Pith is reading between the lines
- The format could serve as an underlying layer that higher-level agent protocols compose with rather than replace.
- New memory types or operations could be added as optional extensions if future frameworks require them.
- The observed ingest and recall latencies suggest the format remains practical for agents that issue frequent memory calls.
Load-bearing premise
The five operations and four memory types form a complete vocabulary that maps without functional loss onto the internal models of the listed frameworks.
What would settle it
A required memory operation or memory type from one of the six frameworks that cannot be expressed using the five operations and four types defined in memorywire.
Original abstract
Agent-memory frameworks -- mem0, Letta/MemGPT, Cognee, Zep/Graphiti, MemoryOS, MemTensor -- each ship their own SDK, storage layout, and operational vocabulary. There is no shared wire format: every integration is bespoke, every migration rebuilds memory from scratch, and no framework ships a governance surface that lets a human review writes before they enter long-term storage. We present memorywire, a JSON-Schema 2020-12 wire format for five memory operations (remember, recall, forget, merge, expire) over four memory types (semantic, episodic, procedural, emotional), with a MemoryStore interface, a fan-out router, and an optional HITL governance channel. We describe an open-source reference implementation with five backend adapters (sqlite-vec, mem0, Letta, Cognee, pgvector); a microbenchmark on a 100-fact / 50-query labelled corpus (42 with non-empty gold ids + 8 no-match probes) achieving recall@5 = 1.000 on the 42 gold-id queries with ingest p50 = 37.8 ms and recall p50 = 40.6 ms; an adversarial-fusion experiment showing Reciprocal Rank Fusion holds recall@5 = 1.000 across a 1-of-N rank-0 injection sweep (K in {0, 5, ..., 50}) where max fusion collapses to 0.500 with 80% leak at K >= 5; and a 16-scenario cross-adapter conformance suite passing 68 of 80 cells with zero failures. The contribution is not a new algorithm; it is a packaging of established components (RRF, FSMs, STM/LTM consolidation, diff-and-approve workflows) into a vendor-neutral protocol with an empirically validated reference, positioned to compose with the Model Context Protocol rather than compete with it.
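The abstract names a MemoryStore interface and a fan-out router but does not give their signatures. The following is a minimal sketch of how such an interface could be shaped. Every method signature, the `Memory` dataclass, the `InMemoryStore` backend, and the `FanOutRouter` behaviour (broadcast writes, de-duplicated reads) are assumptions for illustration, not the reference implementation's API. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass, replace
from typing import Optional, Protocol


@dataclass
class Memory:
    id: str
    text: str
    memory_type: str = "semantic"
    expires_at: Optional[float] = None  # hypothetical expiry timestamp


class MemoryStore(Protocol):
    """Hypothetical interface exposing the five operations."""
    def remember(self, memory: Memory) -> None: ...
    def recall(self, query: str, k: int = 5) -> list[Memory]: ...
    def forget(self, memory_id: str) -> bool: ...
    def merge(self, source_id: str, target_id: str) -> None: ...
    def expire(self, now: float) -> int: ...


class InMemoryStore:
    """Toy backend that ranks by word overlap; stands in for a real adapter."""
    def __init__(self) -> None:
        self._items: dict[str, Memory] = {}

    def remember(self, memory: Memory) -> None:
        self._items[memory.id] = memory

    def recall(self, query: str, k: int = 5) -> list[Memory]:
        terms = set(query.lower().split())
        hits = []
        for m in self._items.values():
            overlap = len(terms & set(m.text.lower().split()))
            if overlap > 0:
                hits.append((overlap, m))
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [m for _, m in hits[:k]]

    def forget(self, memory_id: str) -> bool:
        return self._items.pop(memory_id, None) is not None

    def merge(self, source_id: str, target_id: str) -> None:
        target = self._items[target_id]
        source = self._items.pop(source_id)
        self._items[target_id] = replace(target, text=f"{target.text} {source.text}")

    def expire(self, now: float) -> int:
        stale = [i for i, m in self._items.items()
                 if m.expires_at is not None and m.expires_at <= now]
        for i in stale:
            del self._items[i]
        return len(stale)


class FanOutRouter:
    """Broadcasts writes to every store and de-duplicates recalled ids."""
    def __init__(self, stores) -> None:
        self.stores = list(stores)

    def remember(self, memory: Memory) -> None:
        for store in self.stores:
            store.remember(memory)

    def recall(self, query: str, k: int = 5) -> list[Memory]:
        seen, merged = set(), []
        for store in self.stores:
            for m in store.recall(query, k):
                if m.id not in seen:
                    seen.add(m.id)
                    merged.append(m)
        return merged[:k]


router = FanOutRouter([InMemoryStore(), InMemoryStore()])
router.remember(Memory(id="m1", text="user prefers metric units"))
results = router.recall("metric units")
```

The router here simply concatenates backend results in order. The abstract indicates the real router fuses rankings with Reciprocal Rank Fusion, which this sketch leaves out for brevity.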
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces memorywire, a JSON-Schema 2020-12 wire format for five memory operations (remember, recall, forget, merge, expire) over four memory types (semantic, episodic, procedural, emotional), together with a MemoryStore interface, fan-out router, and optional HITL governance channel. It supplies an open-source reference implementation with five backend adapters, a microbenchmark achieving recall@5 = 1.000 on 42 gold queries from a 100-fact/50-query corpus, an RRF-vs-max-fusion adversarial experiment, and a 16-scenario cross-adapter conformance suite that passes 68 of 80 cells with zero failures.
Significance. If the interoperability and lossless-mapping claims hold, the work supplies a concrete, vendor-neutral protocol that could materially reduce bespoke integrations and rebuilds across agent-memory frameworks. The open-source reference implementation, the concrete recall numbers, the RRF experiment demonstrating robustness under rank-0 injection, and the conformance matrix constitute reproducible empirical grounding that strengthens the contribution.
major comments (2)
- [Abstract] The lossless-mapping claim to all six frameworks (mem0, Letta/MemGPT, Cognee, Zep/Graphiti, MemoryOS, MemTensor) is load-bearing for the central contribution, yet adapters are supplied for only five backends and the 12 non-passing cells in the 68/80 conformance suite are not characterized; if any represent functional gaps rather than inapplicable cases, the claim does not hold for the full set.
- [Benchmark description] The reported recall@5 = 1.000, ingest p50 = 37.8 ms and recall p50 = 40.6 ms rest on a 100-fact/50-query labelled corpus (42 gold-id queries + 8 no-match probes), but the full dataset, exclusion rules, and any statistical tests are not provided, preventing independent verification of the interoperability metrics.
minor comments (1)
- [Abstract] The framework list contains six entries while the adapter list contains five; explicitly state which frameworks lack adapters and whether the conformance suite covers them.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback emphasizing the need for precise claims on interoperability and full reproducibility of the benchmark. We address each major comment below.
- Referee: [Abstract] The lossless-mapping claim to all six frameworks (mem0, Letta/MemGPT, Cognee, Zep/Graphiti, MemoryOS, MemTensor) is load-bearing for the central contribution, yet adapters are supplied for only five backends and the 12 non-passing cells in the 68/80 conformance suite are not characterized; if any represent functional gaps rather than inapplicable cases, the claim does not hold for the full set.
  Authors: We agree the abstract should be tightened. The manuscript implements adapters for five backends (sqlite-vec, mem0, Letta, Cognee, pgvector) while naming six frameworks; Zep/Graphiti is listed as a target but not yet adapted. The lossless-mapping claim is scoped to the implemented adapters. The 12 non-passing cells arise from backend-specific inapplicability (e.g., emotional memory unsupported in certain vector stores, or merge/expire not exposed by a given SDK), not from missing functionality in the memorywire schema itself. We will revise the abstract to list the five supported backends explicitly, add a table characterizing the 12 cells with the reason for each non-pass, and update the contribution statement accordingly.
  Revision: yes
- Referee: [Benchmark description] The reported recall@5 = 1.000, ingest p50 = 37.8 ms and recall p50 = 40.6 ms rest on a 100-fact/50-query labelled corpus (42 gold-id queries + 8 no-match probes), but the full dataset, exclusion rules, and any statistical tests are not provided, preventing independent verification of the interoperability metrics.
  Authors: We accept that the current manuscript lacks sufficient detail for independent verification. The corpus is a synthetic, hand-labelled collection of 100 facts and 50 queries (42 with gold IDs, 8 no-match probes) constructed specifically for this evaluation. We will release the full labelled corpus, the exact exclusion rules used to define the 42 gold queries, and the query-generation procedure as supplementary material in the revision. No statistical significance tests were performed because the recall results were deterministic given the fixed gold labels and deterministic adapters; this will be stated explicitly.
  Revision: yes
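For readers who want to check the reported metrics once the corpus is released, the sketch below shows one common way to compute recall@k over queries with non-empty gold ids, and p50 latency as the median. The abstract does not state the paper's exact recall definition, so treating recall as the fraction of gold ids found in the top k is an assumption, and the query ids, fact ids, and latencies are invented.

```python
import statistics


def recall_at_k(ranked_ids, gold_ids, k=5):
    """Fraction of gold ids that appear in the top-k results for one query."""
    if not gold_ids:
        raise ValueError("recall is undefined for queries without gold ids")
    top_k = set(ranked_ids[:k])
    return len(top_k & set(gold_ids)) / len(gold_ids)


def mean_recall_at_k(results, gold, k=5):
    """Average recall@k, skipping no-match probes (empty gold lists)."""
    scored = [recall_at_k(results[q], ids, k) for q, ids in gold.items() if ids]
    return sum(scored) / len(scored)


# Invented labels: q3 plays the role of a no-match probe.
gold = {"q1": ["f1"], "q2": ["f2", "f3"], "q3": []}
results = {"q1": ["f1", "f9"], "q2": ["f3", "f7", "f2"], "q3": []}
mean_recall = mean_recall_at_k(results, gold, k=5)

# p50 is the median of the per-call latencies (values invented).
latencies_ms = [35.0, 38.0, 41.0, 120.0]
p50 = statistics.median(latencies_ms)
```

Excluding the no-match probes from the average mirrors the abstract's phrasing, which reports recall@5 on the 42 gold-id queries only.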
Circularity Check
No circularity; empirical validation is independent of format definition
The manuscript defines a JSON-Schema wire format plus adapters and reports standalone empirical results (recall@5=1.000 on a labelled corpus, 68/80 conformance cells) that are produced by executing the reference implementation against external backends. No equations, fitted parameters, or self-citations are used to derive the reported metrics; the conformance suite and microbenchmarks function as external checks rather than tautological restatements of the schema. The lossless-mapping claim is an assumption whose partial empirical support (12 non-passing cells left uncharacterized) is a correctness issue, not a circularity reduction.
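The 68-of-80 figure is consistent with 16 scenarios run against five adapters, leaving 12 cells that are neither passes nor failures. That reading is an inference from the abstract, not something it states outright. A conformance matrix with an explicit third status could be tallied as in the sketch below, where the `Cell` statuses, scenario names, and store names are hypothetical.

```python
from collections import Counter
from enum import Enum


class Cell(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_APPLICABLE = "n/a"  # e.g. the backend does not expose the operation


def summarize(matrix):
    """Count cell statuses across a scenario-by-adapter matrix."""
    counts = Counter(cell for row in matrix.values() for cell in row.values())
    return {status: counts.get(status, 0) for status in Cell}


# Toy 2 x 2 matrix with invented scenario and adapter names.
matrix = {
    "remember-then-recall": {"store_a": Cell.PASS, "store_b": Cell.PASS},
    "expire-by-ttl": {"store_a": Cell.PASS, "store_b": Cell.NOT_APPLICABLE},
}
summary = summarize(matrix)
```

Reporting the not-applicable count alongside passes and failures, with a reason per cell, is what would let a reader separate inapplicable cases from functional gaps.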
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] JSON-Schema 2020-12 provides a machine-readable contract for JSON message shapes.