pith. machine review for the scientific record.

arxiv: 2605.03804 · v1 · submitted 2026-05-05 · 💻 cs.AI

Recognition: unknown

ScrapMem: A Bio-inspired Framework for On-device Personalized Agent Memory via Optical Forgetting

Jiale Chang, Yuxiang Ren


Pith reviewed 2026-05-07 16:18 UTC · model grok-4.3

classification 💻 cs.AI
keywords on-device memory · LLM agents · memory compression · optical forgetting · episodic memory graph · multimodal memory · personalized agents · edge AI

The pith

ScrapMem lets LLM agents keep long-term multimodal memories on edge devices by progressively lowering the resolution of old entries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to solve the storage and complexity problems that prevent LLM agents from maintaining useful personalized memory over long periods when running on phones or other limited hardware. It does this by turning incoming multimodal data into scrapbook-style pages, then applying optical forgetting to shrink the detail level of older pages while keeping the most recent ones intact. An Episodic Memory Graph links the remaining entries into a causal timeline so the agent can still retrieve relevant past events efficiently. Experiments on the ATM-Bench dataset show the method reaches a new best Joint@10 score of 51.0 percent, cuts memory use by as much as 93 percent, and lifts Recall@10 to 70.3 percent. If the approach works as described, on-device agents could retain weeks or months of personal context without needing constant cloud uploads or oversized local storage.

Core claim

ScrapMem integrates multimodal inputs into Scrapbook Pages, applies optical forgetting that progressively reduces resolution of older memories to cut storage cost while suppressing low-value details, and builds an Episodic Memory Graph to preserve causal-temporal relationships among key events; on the multimodal ATM-Bench this yields 51.0 percent Joint@10, up to 93 percent lower memory usage, and 70.3 percent Recall@10.
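
The staged compression in this claim can be sketched as an age-indexed downscaling policy. This is a minimal illustration: the tier boundaries and scale factors below are invented for the sketch, not the paper's reported settings.

```python
# Illustrative age tiers (in days) and resolution scale factors; the
# paper's actual stage boundaries and factors are not given in this summary.
TIERS = [
    (7, 1.0),      # recent: keep full resolution
    (30, 0.5),     # mid-term: halve each dimension
    (None, 0.25),  # old: quarter each dimension
]

def target_scale(age_days: float) -> float:
    """Map a memory's age to its resolution scale factor."""
    for boundary, scale in TIERS:
        if boundary is None or age_days <= boundary:
            return scale

def forgotten_dims(width: int, height: int, age_days: float) -> tuple:
    """Pixel dimensions after this toy optical-forgetting step."""
    s = target_scale(age_days)
    return max(1, round(width * s)), max(1, round(height * s))
```

Quartering both dimensions keeps roughly 6% of the pixels, the same order of magnitude as the reported 93% storage reduction, though the real mechanism also adjusts image quality, not just resolution.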

What carries the argument

Optical Forgetting, a progressive resolution-reduction step applied to older memories, supported by an Episodic Memory Graph that links events in causal-temporal order to keep retrieval accurate after compression.
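
The graph side of that machinery can be pictured as a small causal-temporal store. This is an illustrative sketch only; the paper's actual EM-Graph construction is not specified in this summary.

```python
from collections import defaultdict

class EMGraphSketch:
    """Toy episodic memory graph: events keyed by id, with directed
    cause -> effect edges constrained to respect timestamps."""

    def __init__(self):
        self.events = {}                  # id -> (timestamp, summary)
        self.effects = defaultdict(list)  # cause id -> effect ids

    def add_event(self, eid, timestamp, summary):
        self.events[eid] = (timestamp, summary)

    def link(self, cause, effect):
        # Causal edges must run forward in time.
        if self.events[cause][0] > self.events[effect][0]:
            raise ValueError("cause cannot follow its effect")
        self.effects[cause].append(effect)

    def episode_from(self, eid):
        """Walk causal links forward to recover an event chain."""
        chain = [eid]
        while self.effects[chain[-1]]:
            chain.append(self.effects[chain[-1]][0])
        return chain
```

Because retrieval follows edges rather than raw content, a chain like this stays recoverable even after the individual pages it links have been compressed.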

If this is right

  • Agents running locally can sustain much longer interaction histories without exhausting device storage.
  • Structured graph aggregation raises the chance that relevant past episodes are retrieved even after compression.
  • Multimodal on-device agents become practical for personalized tasks without constant data transfer.
  • Memory management can shift from keeping everything to selectively discarding detail in a controlled way.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same forgetting pattern could be tested on non-LLM memory systems such as robotic state trackers to see whether resolution reduction still preserves task-critical information.
  • Real-device measurements of power draw and latency after applying optical forgetting would show whether the storage savings translate into usable runtime gains.
  • Extending the Episodic Memory Graph with explicit decay rates might allow further tuning of how quickly older events lose detail.
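
The decay-rate extension in the last bullet could be prototyped as a continuous schedule in place of discrete stages. This is a hypothetical variant: `half_life_days` and `floor` are invented tuning knobs, not parameters from the paper.

```python
def decayed_scale(age_days: float, half_life_days: float = 30.0,
                  floor: float = 0.1) -> float:
    """Exponentially shrink resolution with age, never below `floor`.

    Hypothetical extension: the paper uses discrete temporal stages,
    not continuous decay; this shows where a tunable rate would go.
    """
    return max(0.5 ** (age_days / half_life_days), floor)
```

A per-event half-life would then let flagged-important episodes fade more slowly than routine ones.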

Load-bearing premise

Lowering the resolution of older memories keeps their semantic content usable and does not erase or distort important multimodal details that the agent will later need.

What would settle it

A controlled test that probes whether compression by optical forgetting makes the agent answer incorrectly questions about past events it handled correctly before compression; performance falling below the reported baseline would refute the load-bearing premise.
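
Such a test could be organized as a simple before/after comparison over a QA set. This is a hypothetical harness: `answer_fn` stands in for whatever question-answering interface the agent exposes.

```python
def forgetting_damage(qa_pairs, answer_fn, full_memory, compressed_memory):
    """Fraction of questions the agent answered correctly from full
    memory that it gets wrong after optical forgetting is applied."""
    broken = correct_before = 0
    for question, gold in qa_pairs:
        if answer_fn(full_memory, question) == gold:
            correct_before += 1
            if answer_fn(compressed_memory, question) != gold:
                broken += 1
    return broken / correct_before if correct_before else 0.0
```

A damage fraction large enough to drag accuracy below the reported baseline would be the refuting observation.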

Figures

Figures reproduced from arXiv: 2605.03804 by Jiale Chang, Yuxiang Ren.

Figure 1: Comparison between human memory (CLS theory) and Scrapbook Memory. Top: The hippocampus rapidly encodes multimodal episodic experiences, while the neocortex gradually consolidates them into stable long-term knowledge. Bottom: ScrapMem similarly binds heterogeneous user data into scrapbook pages and progressively compresses old memories via optical forgetting, preserving core semantics for efficient retrie…
Figure 2: Overview of ScrapMem. (1) Consolidation and Perception: Unifies heterogeneous records (images, videos, text) into hybrid representations via OCR and vision-to-text extraction. (2) EM-Graph Construction: Organizes nodes into an Episodic Memory Graph with event-centric paths (EM-Paths) for structured retrieval and multi-hop reasoning. (3) Optical Forgetting: Compresses outdated memories through temporal …
Figure 3: Retrieval performance (Recall@K) under varying optical forgetting intensities: quality (Q), resolution scaling factor (S), and temporal stage boundaries (T) for Recent, Mid-term, and Old memories, respectively. The clustering of the different forgetting curves demonstrates that ScrapMem is highly robust to specific hyperparameter configurations.
Figure 4: Storage–performance trade-off on ATM-Bench (Joint@10). The x-axis uses a logarithmic scale. ScrapMem (Timed-Gentle, orange star) reduces storage by 93.0% relative to the raw-data baseline while retaining over 90% of SOTA performance (46.3% vs. 51.0%). The Pareto frontier indicates strong efficiency and graceful performance degradation, supporting on-device deployment.
Original abstract

Long-term personalized memory for LLM agents is challenging on resource-limited edge devices due to high storage costs and multimodal complexity. To address this, we propose ScrapMem, a framework that integrates multimodal data into "Scrapbook Page." ScrapMem introduces Optical Forgetting, an optical compression mechanism that progressively reduces the resolution of older memories, lowering storage cost while suppressing low-value details. To maintain semantic consistency, we construct an Episodic Memory Graph (EM-Graph) that organizes key events into a causal-temporal structure. Extensive experiments on the multimodal ATM-Bench showcase that ScrapMem provides three main benefits: (1) strong performance, achieving a new state-of-the-art with a 51.0% Joint@10 score; (2) high storage efficiency, reducing memory usage by up to 93% via optical forgetting; and (3) improved recall, increasing Recall@10 to 70.3% through structured aggregation. ScrapMem offers an effective and storage-efficient solution for on-device long-term memory in multimodal LLM agents.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 0 minor

Summary. The manuscript proposes ScrapMem, a bio-inspired framework for on-device long-term personalized memory in multimodal LLM agents. It integrates multimodal data into 'Scrapbook Page' structures, introduces Optical Forgetting as a progressive resolution-reduction mechanism for older memories to cut storage costs, and builds an Episodic Memory Graph (EM-Graph) to enforce causal-temporal organization of key events. Experiments on the multimodal ATM-Bench are reported to deliver a new SOTA of 51.0% Joint@10, up to 93% memory reduction, and 70.3% Recall@10 via structured aggregation.

Significance. If the empirical results hold after proper validation, ScrapMem would represent a meaningful advance for resource-constrained edge agents by addressing the tension between long-term multimodal memory and storage limits. The combination of bio-inspired compression with graph-structured retention is conceptually appealing and could influence subsequent work on efficient agent memory. No machine-checked proofs, reproducible code artifacts, or parameter-free derivations are present to credit.

major comments (3)
  1. Abstract: The central performance claims (51.0% Joint@10 SOTA, 93% storage reduction, 70.3% Recall@10) are asserted without any description of baselines, experimental setup, error bars, statistical significance, or implementation details of Optical Forgetting, making it impossible to verify support for the claims from the available text.
  2. Method section on Optical Forgetting: The mechanism that progressively lowers resolution of older memories is described only at a high level; no concrete algorithm, information-loss metrics, or ablations isolating its effect on semantic consistency and multimodal fidelity are supplied, which is load-bearing for both the efficiency and recall claims.
  3. Experiments / ATM-Bench results: No quantitative evidence (e.g., retention metrics, consistency scores, or ablation tables) is given to substantiate that the Episodic Memory Graph preserves causal-temporal structure and critical multimodal details under Optical Forgetting; without these, the 93% reduction could mask unmeasured recall degradation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments highlight important areas for improving clarity and substantiation of our claims. We address each major comment point by point below and commit to revisions that will strengthen the manuscript without altering its core contributions.

Point-by-point responses
  1. Referee: Abstract: The central performance claims (51.0% Joint@10 SOTA, 93% storage reduction, 70.3% Recall@10) are asserted without any description of baselines, experimental setup, error bars, statistical significance, or implementation details of Optical Forgetting, making it impossible to verify support for the claims from the available text.

    Authors: We agree that the abstract, as a concise summary, omits these supporting details. The full manuscript provides baselines and setup in Section 4.1, error bars and significance testing in the results tables of Section 4, and Optical Forgetting implementation in Section 3.2. To address the concern directly, we will revise the abstract to include a brief reference to the primary baselines (e.g., standard retrieval and memory-augmented agents), the ATM-Bench evaluation protocol, and a note that detailed metrics and ablations appear in the experiments section. This change will make the performance claims more self-contained while preserving the abstract's brevity. revision: yes

  2. Referee: Method section on Optical Forgetting: The mechanism that progressively lowers resolution of older memories is described only at a high level; no concrete algorithm, information-loss metrics, or ablations isolating its effect on semantic consistency and multimodal fidelity are supplied, which is load-bearing for both the efficiency and recall claims.

    Authors: The current description emphasizes the bio-inspired motivation and high-level progressive reduction process. We acknowledge that a more concrete specification is needed to support the efficiency and fidelity claims. In the revised manuscript, we will expand Section 3.2 to include the explicit algorithm (step-wise resolution scaling with modality-specific parameters), quantitative information-loss metrics (e.g., embedding similarity and perceptual quality scores), and dedicated ablation tables isolating Optical Forgetting's contribution to storage reduction versus semantic consistency. These additions will directly substantiate the 93% reduction claim. revision: yes

  3. Referee: Experiments / ATM-Bench results: No quantitative evidence (e.g., retention metrics, consistency scores, or ablation tables) is given to substantiate that the Episodic Memory Graph preserves causal-temporal structure and critical multimodal details under Optical Forgetting; without these, the 93% reduction could mask unmeasured recall degradation.

    Authors: The reported results focus on end-to-end Joint@10 and Recall@10 metrics on ATM-Bench. We recognize that explicit evidence linking the EM-Graph to structure preservation under forgetting is required to rule out hidden degradation. We will add, in the revised experiments section, quantitative retention metrics (causal edge preservation rates and multimodal detail fidelity scores), consistency scores across forgetting levels, and ablation tables comparing performance with and without the EM-Graph. These will demonstrate that the observed recall improvements and storage savings are not achieved at the expense of unmeasured structural loss. revision: yes
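
The embedding-similarity information-loss metric promised in response 2 could be computed roughly as below. This is a sketch: the encoder producing the embeddings and any acceptance threshold are the authors' choices, not specified in the available text.

```python
import math

def semantic_retention(before: list, after: list) -> float:
    """Cosine similarity between embeddings of a memory before and
    after optical forgetting; 1.0 means no measurable semantic drift."""
    dot = sum(a * b for a, b in zip(before, after))
    norm = (math.sqrt(sum(a * a for a in before))
            * math.sqrt(sum(b * b for b in after)))
    return dot / norm if norm else 0.0
```

Tracking this score across forgetting stages would make the "suppresses low-value details" claim directly measurable.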

Circularity Check

0 steps flagged

No significant circularity: empirical results independent of inputs

full rationale

The paper proposes ScrapMem with Optical Forgetting for progressive resolution reduction and an Episodic Memory Graph for causal-temporal organization, then reports experimental outcomes on ATM-Bench including 51.0% Joint@10, 70.3% Recall@10, and up to 93% storage reduction. No equations, parameter fits, or derivations are present that reduce any claimed prediction or result to the inputs by construction. Claims rest on external benchmark evaluation rather than self-definitional loops, fitted-input renamings, or load-bearing self-citations, rendering the chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The framework introduces new mechanisms without citing prior independent evidence for their effectiveness; relies on the assumption that multimodal data can be progressively compressed while retaining utility.

axioms (1)
  • domain assumption Multimodal memories can be progressively reduced in resolution without losing semantic value for agent tasks
    Invoked to justify optical forgetting as a viable compression strategy.
invented entities (2)
  • Optical Forgetting no independent evidence
    purpose: Progressively reduce resolution of older memories to lower storage cost
    New compression mechanism central to the efficiency claim
  • Episodic Memory Graph (EM-Graph) no independent evidence
    purpose: Organize key events into causal-temporal structure for consistency
    New structure to maintain semantic consistency during compression

pith-pipeline@v0.9.0 · 5477 in / 1394 out tokens · 58629 ms · 2026-05-07T16:18:31.815932+00:00 · methodology

