memorywire: A Vendor-Neutral Wire Format for Agent Memory Operations

Thamilvendhan Munirathinam

arXiv: 2606.01138 · v2 · pith:LBAYK2BInew · submitted 2026-05-31 · cs.CR · cs.AI · cs.DC

Pith reviewed 2026-06-28 17:07 UTC · model grok-4.3

classification: cs.CR · cs.AI · cs.DC

keywords: memorywire, wire format, agent memory, JSON Schema, memory operations, vendor neutral, HITL governance, memory store interface

The pith

A JSON-Schema wire format standardizes five memory operations over four memory types for use across agent frameworks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes memorywire as a vendor-neutral JSON-Schema 2020-12 format that encodes remember, recall, forget, merge, and expire operations on semantic, episodic, procedural, and emotional memory. It supplies a MemoryStore interface, a fan-out router, and an optional human-in-the-loop governance channel. If correct, the format removes the need for bespoke integrations and full memory rebuilds when moving between frameworks such as mem0, Letta, Cognee, Zep, MemoryOS, and MemTensor. The reference implementation demonstrates this through backend adapters, a labelled corpus benchmark, an adversarial fusion test, and a cross-adapter conformance suite.

Core claim

memorywire is a JSON-Schema 2020-12 wire format for five memory operations over four memory types, equipped with a MemoryStore interface, a fan-out router, and an optional HITL governance channel, that can be mapped to the internal models of existing agent-memory frameworks through adapters and validated by performance and conformance measurements.

What carries the argument

The memorywire JSON-Schema 2020-12 specification, which encodes the five operations and four memory types as a common operational vocabulary.
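To make the idea of a closed operational vocabulary concrete, here is a minimal Python sketch of an envelope check for the five operations and four memory types. The field names (`op`, `memory_type`, `payload`) and the `validate_envelope` helper are hypothetical and chosen only for illustration; the paper's actual contract is a JSON-Schema 2020-12 document whose field layout is not reproduced on this page.

```python
import json
from enum import Enum


class Operation(str, Enum):
    REMEMBER = "remember"
    RECALL = "recall"
    FORGET = "forget"
    MERGE = "merge"
    EXPIRE = "expire"


class MemoryType(str, Enum):
    SEMANTIC = "semantic"
    EPISODIC = "episodic"
    PROCEDURAL = "procedural"
    EMOTIONAL = "emotional"


def validate_envelope(raw: str) -> dict:
    """Parse a JSON message and check it against the closed vocabulary.

    The envelope field names are assumptions, not the paper's schema.
    """
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("envelope must be a JSON object")
    # Enum lookup raises ValueError for any value outside the vocabulary.
    op = Operation(msg.get("op"))
    memory_type = MemoryType(msg.get("memory_type"))
    if not isinstance(msg.get("payload"), dict):
        raise ValueError("payload must be a JSON object")
    return {"op": op, "memory_type": memory_type, "payload": msg["payload"]}
```

A message naming an operation outside the five (say, a hypothetical `"teleport"`) is rejected at the boundary, which is the property a shared vocabulary is meant to provide.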

If this is right

Agent frameworks can adopt a shared interface for memory writes and reads instead of maintaining separate SDKs.
Memory data can migrate between frameworks without reconstruction from raw sources.
Human review of proposed memory writes becomes available through the optional governance channel.
Reciprocal rank fusion of results from multiple backends preserves recall@5 = 1.000 where simple max fusion drops to 0.500.
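The contrast between Reciprocal Rank Fusion and max fusion in the last point can be illustrated with a toy Python sketch. It uses the standard RRF score, the sum over result lists of 1 / (k + rank) with k = 60, and a max fusion that keeps each document's highest raw score. The document ids, scores, and the three-backend setup are made up for illustration; this does not reproduce the paper's injection sweep or its reported numbers.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def max_fuse(scored_lists):
    """Max fusion: keep each document's highest raw score across backends."""
    best = {}
    for scored in scored_lists:
        for doc_id, score in scored:
            best[doc_id] = max(score, best.get(doc_id, float("-inf")))
    return sorted(best, key=lambda d: (-best[d], d))


# Toy data (hypothetical): two honest backends and one backend that
# injects five bogus ids at the top with inflated scores.
gold = ["g1", "g2", "g3", "g4", "g5"]
honest = [(g, 0.9 - 0.1 * i) for i, g in enumerate(gold)]
injected = [(f"x{i}", 1.0) for i in range(5)] + honest

scored_lists = [honest, honest, injected]
ranked_lists = [[doc_id for doc_id, _ in lst] for lst in scored_lists]

rrf_top5 = rrf_fuse(ranked_lists)[:5]
max_top5 = max_fuse(scored_lists)[:5]
```

In this toy setup the gold ids keep the RRF top five because they collect rank contributions from all three lists, while the injected ids appear in only one. Max fusion trusts the single inflated score, so the injected ids displace the gold ids.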

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The format could serve as an underlying layer that higher-level agent protocols compose with rather than replace.
New memory types or operations could be added as optional extensions if future frameworks require them.
The observed ingest and recall latencies suggest the format remains practical for agents that issue frequent memory calls.

Load-bearing premise

The five operations and four memory types form a complete vocabulary that maps without functional loss onto the internal models of the listed frameworks.

What would settle it

A required memory operation or memory type from one of the six frameworks that cannot be expressed using the five operations and four types defined in memorywire.

Figures

Figures reproduced from arXiv: 2606.01138 by Thamilvendhan Munirathinam.

**Figure 1.** memorywire architecture. A client (SDK or CLI) issues one of the five spec operations against the Memory facade, which validates the request and delegates to the MemoryRouter. The router fans out across heterogeneous backend adapters (sqlite-vec, mem0, Letta, Cognee, pgvector) and fuses recall results with Reciprocal Rank Fusion (k = 60) plus an optional one-hop graph boost. Procedural writes take a paral…
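The fan-out described in the Figure 1 caption can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the two-method `MemoryStore` protocol, the `InMemoryStore` toy backend, and the choice to write to every backend are all hypothetical, and the optional graph boost and the separate procedural-write path are omitted. Only the use of RRF with k = 60 comes from the caption.

```python
from typing import Protocol


class MemoryStore(Protocol):
    """Hypothetical minimal backend interface; the paper's may differ."""

    def remember(self, item_id: str, text: str) -> None: ...
    def recall(self, query: str, limit: int) -> list[str]: ...


class InMemoryStore:
    """Toy backend: case-insensitive substring match, in insertion order."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def remember(self, item_id: str, text: str) -> None:
        self._items[item_id] = text

    def recall(self, query: str, limit: int) -> list[str]:
        hits = [i for i, t in self._items.items() if query.lower() in t.lower()]
        return hits[:limit]


class MemoryRouter:
    """Fans calls out to every backend and fuses recall results with RRF."""

    def __init__(self, stores: list[MemoryStore], k: int = 60) -> None:
        self.stores = stores
        self.k = k

    def remember(self, item_id: str, text: str) -> None:
        # Assumption: writes go to all backends; a real router may route by type.
        for store in self.stores:
            store.remember(item_id, text)

    def recall(self, query: str, limit: int = 5) -> list[str]:
        scores: dict[str, float] = {}
        for store in self.stores:
            for rank, item_id in enumerate(store.recall(query, limit), start=1):
                scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (self.k + rank)
        ranked = sorted(scores, key=lambda i: (-scores[i], i))
        return ranked[:limit]
```

Because the router depends only on the protocol, any adapter that implements the same two methods could be dropped into `stores` without changing the fusion logic.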

Original abstract

Agent-memory frameworks -- mem0, Letta/MemGPT, Cognee, Zep/Graphiti, MemoryOS, MemTensor -- each ship their own SDK, storage layout, and operational vocabulary. There is no shared wire format: every integration is bespoke, every migration rebuilds memory from scratch, and no framework ships a governance surface that lets a human review writes before they enter long-term storage. We present memorywire, a JSON-Schema 2020-12 wire format for five memory operations (remember, recall, forget, merge, expire) over four memory types (semantic, episodic, procedural, emotional), with a MemoryStore interface, a fan-out router, and an optional HITL governance channel. We describe an open-source reference implementation with five backend adapters (sqlite-vec, mem0, Letta, Cognee, pgvector); a microbenchmark on a 100-fact / 50-query labelled corpus (42 with non-empty gold ids + 8 no-match probes) achieving recall@5 = 1.000 on the 42 gold-id queries with ingest p50 = 37.8 ms and recall p50 = 40.6 ms; an adversarial-fusion experiment showing Reciprocal Rank Fusion holds recall@5 = 1.000 across a 1-of-N rank-0 injection sweep (K in {0, 5, ..., 50}) where max fusion collapses to 0.500 with 80% leak at K >= 5; and a 16-scenario cross-adapter conformance suite passing 68 of 80 cells with zero failures. The contribution is not a new algorithm; it is a packaging of established components (RRF, FSMs, STM/LTM consolidation, diff-and-approve workflows) into a vendor-neutral protocol with an empirically validated reference, positioned to compose with the Model Context Protocol rather than compete with it.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

memorywire supplies a JSON schema for five memory ops over four types plus adapters and benchmarks, but the lossless mapping to all six frameworks rests on 68/80 cells whose gaps are not explained.

The main takeaway is that this paper defines a JSON-Schema 2020-12 wire format for remember, recall, forget, merge, and expire operations across semantic, episodic, procedural, and emotional memory, ships a MemoryStore interface with a fan-out router, and includes a reference implementation with five adapters.

What is actually new is the specific combination of that schema, the five-operation vocabulary, the cross-adapter conformance suite, and the adversarial RRF fusion test. The work does well by delivering concrete, falsifiable numbers: recall@5 of 1.000 on the 42 gold queries, ingest and recall p50 latencies around 38-41 ms, and the demonstration that RRF holds recall at 1.000 across the rank-0 injection sweep while max fusion drops to 0.500.

The soft spot is the completeness claim. The abstract reports 68 of 80 conformance cells passing with zero failures across 16 scenarios, yet the 12 non-passing cells are not characterized. Only five adapters are implemented even though six frameworks are named in the motivation. If those 12 cells reflect missing coverage rather than inapplicable cases, the lossless mapping does not fully hold. The abstract also omits the full dataset, exclusion rules, and any statistical tests, so the robustness of the interoperability result is harder to judge from the given material.

Engineers who integrate or migrate between agent memory systems will get practical value from the schema definition and the open adapters. Readers looking for a ready protocol rather than new theory will find the most use.

The paper deserves a serious referee because it contains verifiable artifacts and addresses a clear engineering gap. I would send it to review and ask the authors to document the non-conforming cells and expand adapter coverage if possible.
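The conformance bookkeeping raised in the letter can be sketched as a small Python tally. The reported matrix size is consistent with 16 scenarios across 5 adapters (16 × 5 = 80), and 68 passes with zero failures leaves 12 cells in some third state. The status labels, scenario names, adapter names, and the toy matrix below are all hypothetical; the sketch only shows how non-passing cells without a documented reason could be surfaced.

```python
from collections import Counter

PASS, FAIL, SKIP = "pass", "fail", "skip"  # hypothetical status labels


def summarize(cells):
    """Count statuses in a {(scenario, adapter): status} matrix."""
    return Counter(cells.values())


def uncharacterized(cells, reasons):
    """Return non-passing cells that have no documented reason."""
    return sorted(
        cell for cell, status in cells.items()
        if status != PASS and cell not in reasons
    )


# Toy 2-scenario x 3-adapter matrix; names and outcomes are made up.
cells = {
    ("remember-semantic", "adapter_a"): PASS,
    ("remember-semantic", "adapter_b"): PASS,
    ("remember-semantic", "adapter_c"): PASS,
    ("merge-emotional", "adapter_a"): PASS,
    ("merge-emotional", "adapter_b"): SKIP,
    ("merge-emotional", "adapter_c"): SKIP,
}
reasons = {("merge-emotional", "adapter_b"): "backend exposes no merge call"}

counts = summarize(cells)
gaps = uncharacterized(cells, reasons)
```

Here `gaps` contains the one skipped cell that has no recorded reason, which is the kind of entry the requested table of non-passing cells would need to explain.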

Referee Report

2 major / 1 minor

Summary. The manuscript introduces memorywire, a JSON-Schema 2020-12 wire format for five memory operations (remember, recall, forget, merge, expire) over four memory types (semantic, episodic, procedural, emotional), together with a MemoryStore interface, fan-out router, and optional HITL governance channel. It supplies an open-source reference implementation with five backend adapters, a microbenchmark achieving recall@5 = 1.000 on 42 gold queries from a 100-fact/50-query corpus, an RRF-vs-max-fusion adversarial experiment, and a 16-scenario cross-adapter conformance suite that passes 68 of 80 cells with zero failures.

Significance. If the interoperability and lossless-mapping claims hold, the work supplies a concrete, vendor-neutral protocol that could materially reduce bespoke integrations and rebuilds across agent-memory frameworks. The open-source reference implementation, the concrete recall numbers, the RRF experiment demonstrating robustness under rank-0 injection, and the conformance matrix constitute reproducible empirical grounding that strengthens the contribution.

major comments (2)

[Abstract] The lossless-mapping claim to all six frameworks (mem0, Letta/MemGPT, Cognee, Zep/Graphiti, MemoryOS, MemTensor) is load-bearing for the central contribution, yet adapters are supplied for only five backends and the 12 non-passing cells in the 68/80 conformance suite are not characterized; if any represent functional gaps rather than inapplicable cases, the claim does not hold for the full set.
[Benchmark description] The reported recall@5 = 1.000, ingest p50 = 37.8 ms and recall p50 = 40.6 ms rest on a 100-fact/50-query labelled corpus (42 gold-id queries + 8 no-match probes), but the full dataset, exclusion rules, and any statistical tests are not provided, preventing independent verification of the interoperability metrics.

minor comments (1)

[Abstract] The framework list contains six entries while the adapter list contains five; explicitly state which frameworks lack adapters and whether the conformance suite covers them.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback emphasizing the need for precise claims on interoperability and full reproducibility of the benchmark. We address each major comment below.

Referee: [Abstract] The lossless-mapping claim to all six frameworks (mem0, Letta/MemGPT, Cognee, Zep/Graphiti, MemoryOS, MemTensor) is load-bearing for the central contribution, yet adapters are supplied for only five backends and the 12 non-passing cells in the 68/80 conformance suite are not characterized; if any represent functional gaps rather than inapplicable cases, the claim does not hold for the full set.

Authors: We agree the abstract should be tightened. The manuscript implements adapters for five backends (sqlite-vec, mem0, Letta, Cognee, pgvector) while naming six frameworks; Zep/Graphiti is listed as a target but not yet adapted. The lossless-mapping claim is scoped to the implemented adapters. The 12 non-passing cells arise from backend-specific inapplicability (e.g., emotional memory unsupported in certain vector stores, or merge/expire not exposed by a given SDK), not from missing functionality in the memorywire schema itself. We will revise the abstract to list the five supported backends explicitly, add a table characterizing the 12 cells with the reason for each non-pass, and update the contribution statement accordingly.

Revision: yes

Referee: [Benchmark description] The reported recall@5 = 1.000, ingest p50 = 37.8 ms and recall p50 = 40.6 ms rest on a 100-fact/50-query labelled corpus (42 gold-id queries + 8 no-match probes), but the full dataset, exclusion rules, and any statistical tests are not provided, preventing independent verification of the interoperability metrics.

Authors: We accept that the current manuscript lacks sufficient detail for independent verification. The corpus is a synthetic, hand-labelled collection of 100 facts and 50 queries (42 with gold IDs, 8 no-match probes) constructed specifically for this evaluation. We will release the full labelled corpus, the exact exclusion rules used to define the 42 gold queries, and the query-generation procedure as supplementary material in the revision. No statistical significance tests were performed because the recall results were deterministic given the fixed gold labels and deterministic adapters; this will be stated explicitly.

Revision: yes
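The recall@5 metric at the centre of this exchange can be sketched in Python. The definition used here, the fraction of a query's gold ids found in the top k results, averaged over queries with a non-empty gold set, is an assumption; the page does not state the paper's exact formula. The function names are hypothetical. Skipping queries with an empty gold set mirrors the reported split of 42 gold-id queries and 8 no-match probes.

```python
def recall_at_k(retrieved, gold, k=5):
    """Fraction of gold ids that appear in the top-k retrieved ids."""
    if not gold:
        raise ValueError("recall is undefined for an empty gold set")
    top_k = set(retrieved[:k])
    return len(top_k & set(gold)) / len(gold)


def mean_recall_at_k(runs, k=5):
    """Average recall@k over queries with non-empty gold ids.

    `runs` is a list of (retrieved_ids, gold_ids) pairs. Queries whose
    gold set is empty (no-match probes) are skipped and counted separately.
    """
    scored = [recall_at_k(retrieved, gold, k) for retrieved, gold in runs if gold]
    probes = sum(1 for _, gold in runs if not gold)
    mean = sum(scored) / len(scored) if scored else float("nan")
    return mean, len(scored), probes
```

With fixed gold labels and deterministic retrieval, this computation yields the same value on every run, which is the authors' stated reason for omitting significance tests.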

Circularity Check

0 steps flagged

No circularity; empirical validation is independent of format definition

The manuscript defines a JSON-Schema wire format plus adapters and reports standalone empirical results (recall@5=1.000 on a labelled corpus, 68/80 conformance cells) that are produced by executing the reference implementation against external backends. No equations, fitted parameters, or self-citations are used to derive the reported metrics; the conformance suite and microbenchmarks function as external checks rather than tautological restatements of the schema. The lossless-mapping claim is an assumption whose partial empirical support (12 non-passing cells left uncharacterized) is a correctness issue, not a circularity reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper introduces a new protocol specification but relies on pre-existing components (RRF, FSMs, STM/LTM consolidation, diff-and-approve workflows) and the JSON-Schema standard; no new physical entities or fitted constants are introduced.

axioms (1)

standard math: JSON-Schema 2020-12 provides a machine-readable contract for JSON message shapes.
Invoked to define the wire format for the five operations.

Reference graph

Works this paper leans on

11 extracted references

[1] Prateek Chhikara, Dev Khant, Saket Aryan, Taranjeet Singh, and Deshraj Yadav. Mem0: Building production-ready AI agents with scalable long-term memory, 2025.

[2] Gordon V. Cormack, Charles L. A. Clarke, and Stefan Büttcher. Reciprocal rank fusion outperforms Condorcet and individual rank learning methods. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’09), pages 758–759. ACM, 2009.

[3] JSON Schema authors. JSON Schema 2020-12 release notes. https://json-schema.org/draft/2020-12/release-notes, 2020.

[4] Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, and Yuwei Fang. Evaluating very long-term conversational memory of LLM agents (LoCoMo), 2024.

[5] Thibault Meunier and Watson Ladd. Web bot authentication architecture. IETF Internet-Draft draft-meunier-web-bot-auth-architecture-05, March 2026. Revision 05, dated 2 March 2026.

[6] Model Context Protocol working group. Model context protocol specification, version 2025-11-25. https://modelcontextprotocol.io/specification, 2025.

[7] Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez. MemGPT: Towards LLMs as operating systems, 2023.

[8] Larry R. Squire. Declarative and nondeclarative memory: Multiple brain systems supporting learning and memory. Journal of Cognitive Neuroscience, 4(3):232–243, 1992.

[9] Hamed Taheri. Governed memory: A production architecture for multi-agent workflows, 2026.

[10] Endel Tulving. Episodic and semantic memory. In Endel Tulving and Wayne Donaldson, editors, Organization of Memory, pages 381–402. Academic Press, New York, 1972.

[11] Di Wu, Hongwei Wang, Wenhao Yu, Yuwei Zhang, Kai-Wei Chang, and Dong Yu. LongMemEval: Benchmarking chat assistants on long-term interactive memory, 2024.
