pith. machine review for the scientific record.

arxiv: 2504.15965 · v2 · submitted 2025-04-22 · 💻 cs.IR

Recognition: no theorem link

From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs

Authors on Pith · no claims yet

Pith reviewed 2026-05-17 11:00 UTC · model grok-4.3

classification 💻 cs.IR
keywords memory mechanisms · large language models · human memory · AI memory · LLM survey · categorization framework · memory dimensions

The pith

This survey connects categories of human memory to memory in LLM-based AI systems and introduces a three-dimension eight-quadrant framework to organize the field.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how memory functions in humans, including types such as episodic and semantic memory, and draws parallels to how AI systems store and retrieve information from interactions. It then reviews existing research on memory in large language models and proposes a structured way to categorize this work along three key dimensions: the object of memory, its form, and its time span. Together these dimensions define eight distinct quadrants for classification. The goal is to use insights from human memory to guide the development of more advanced memory capabilities in AI. The survey concludes by discussing current limitations of AI memory and potential paths toward improvement in the era of LLMs.

Core claim

This survey conducts a detailed analysis of human memory categories and relates them directly to the memory of AI systems. By systematically organizing existing memory-related work along three dimensions (object, form, and time) that yield eight quadrants, it provides a comprehensive view that can inspire the construction of more powerful memory mechanisms for LLM-driven AI systems.

What carries the argument

The three dimensions of object, form, and time, which divide memory mechanisms into eight quadrants and serve as the organizing framework linking human memory insights to AI memory implementations.
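The eight-quadrant structure follows directly from treating each dimension as binary. The sketch below illustrates this; the dimension values (personal vs. system object, parametric vs. non-parametric form, short- vs. long-term) are plausible readings, not necessarily the survey's exact labels.

```python
from dataclasses import dataclass
from itertools import product

# Illustrative binary values for each dimension; the survey's own labels may differ.
OBJECT = ("personal", "system")           # what the memory is about
FORM = ("parametric", "non-parametric")   # how the memory is represented
TIME = ("short-term", "long-term")        # how long the memory is retained

@dataclass(frozen=True)
class Quadrant:
    obj: str
    form: str
    time: str

# Three binary dimensions yield 2**3 = 8 quadrants.
QUADRANTS = [Quadrant(o, f, t) for o, f, t in product(OBJECT, FORM, TIME)]

def classify(obj: str, form: str, time: str) -> Quadrant:
    """Place a memory mechanism into one of the eight quadrants."""
    q = Quadrant(obj, form, time)
    if q not in QUADRANTS:
        raise ValueError(f"unknown dimension value: {q}")
    return q

# e.g. a RAG-style external store of user facts kept across sessions:
print(classify("personal", "non-parametric", "long-term"))
```

The frozen dataclass makes quadrants hashable, so surveyed papers can be grouped by quadrant with an ordinary dict or Counter.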

Load-bearing premise

That analyzing human memory categories and mapping them to AI memory, along with the proposed three-dimension eight-quadrant categorization, will lead to actionable insights for building improved memory mechanisms in large language models.

What would settle it

An experiment showing that LLM memory designs based on this human-memory mapping and quadrant categorization fail to outperform existing ad-hoc approaches at retaining and using past information.

read the original abstract

Memory is the process of encoding, storing, and retrieving information, allowing humans to retain experiences, knowledge, skills, and facts over time, and serving as the foundation for growth and effective interaction with the world. It plays a crucial role in shaping our identity, making decisions, learning from past experiences, building relationships, and adapting to changes. In the era of large language models (LLMs), memory refers to the ability of an AI system to retain, recall, and use information from past interactions to improve future responses and interactions. Although previous research and reviews have provided detailed descriptions of memory mechanisms, there is still a lack of a systematic review that summarizes and analyzes the relationship between the memory of LLM-driven AI systems and human memory, as well as how we can be inspired by human memory to construct more powerful memory systems. To achieve this, in this paper, we propose a comprehensive survey on the memory of LLM-driven AI systems. In particular, we first conduct a detailed analysis of the categories of human memory and relate them to the memory of AI systems. Second, we systematically organize existing memory-related work and propose a categorization method based on three dimensions (object, form, and time) and eight quadrants. Finally, we illustrate some open problems regarding the memory of current AI systems and outline possible future directions for memory in the era of large language models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript surveys memory mechanisms in LLM-driven AI systems. It first analyzes categories of human memory (episodic, semantic, procedural, short-term, long-term) and relates them to AI memory components such as context windows, parametric knowledge, RAG, and external stores. It then organizes existing literature via a proposed three-dimension (object, form, time) eight-quadrant taxonomy and concludes with open problems and future directions for memory design inspired by human cognition.
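The human-to-AI bridge the summary describes can be sketched as a simple lookup. The pairings below are one plausible reading of the correspondences named above, not the survey's definitive mapping.

```python
# Hedged sketch: illustrative pairings of human memory categories with
# engineered LLM mechanisms; the survey's own mapping may be finer-grained.
HUMAN_TO_AI = {
    "short-term memory": ["context window"],
    "long-term memory": ["external vector store", "RAG index"],
    "episodic memory": ["logs of past interactions"],
    "semantic memory": ["parametric knowledge in model weights"],
    "procedural memory": ["fine-tuned skills", "tool-use routines"],
}

def ai_analogues(human_category: str) -> list[str]:
    """Return the engineered mechanisms loosely analogous to a human memory category."""
    return HUMAN_TO_AI.get(human_category, [])

print(ai_analogues("episodic memory"))
```

The referee's first major comment applies exactly here: such a table states analogies but says nothing about consolidation or interference, which is where the mapping can mislead.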

Significance. If the taxonomy proves robust and the human-to-AI mapping yields design guidance beyond existing enumerations of context extension and retrieval methods, the survey could help structure research on persistent, adaptive memory in LLMs. The explicit bridging of cognitive categories to engineered mechanisms is a potential strength, provided the framework demonstrates independence of dimensions and identifies implementable improvements.

major comments (2)
  1. [Abstract and §2] Abstract and §2 (human memory analysis): the direct mapping of human categories (e.g., episodic memory) onto LLM mechanisms (e.g., context windows or external vector stores) is presented as inspirational without addressing core mismatches in consolidation and interference mechanisms; this mapping is load-bearing for the claim that human memory can guide construction of more powerful AI systems.
  2. [§3] §3 (proposed categorization): the assertion that the three dimensions (object, form, time) are sufficiently independent to generate eight meaningful quadrants lacks explicit justification or empirical check for orthogonality; if form is largely determined by object or time in current LLM architectures (parametric vs. retrieval-based), the scheme reduces to fewer effective dimensions and undermines the organizational contribution.
minor comments (2)
  1. [Introduction] The literature search protocol, databases, and inclusion/exclusion criteria are not stated, which is required for a systematic survey to allow assessment of coverage and bias.
  2. [§3] Figure or table illustrating the eight quadrants would improve clarity; currently the dimensions are described only textually.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and indicate where we will revise the manuscript to strengthen the presentation of the human-to-AI mapping and the proposed taxonomy.

read point-by-point responses
  1. Referee: [Abstract and §2] Abstract and §2 (human memory analysis): the direct mapping of human categories (e.g., episodic memory) onto LLM mechanisms (e.g., context windows or external vector stores) is presented as inspirational without addressing core mismatches in consolidation and interference mechanisms; this mapping is load-bearing for the claim that human memory can guide construction of more powerful AI systems.

    Authors: We agree that the current treatment would be improved by explicitly discussing mismatches between biological and artificial memory. While the mapping is intended as inspirational rather than literal, we will revise §2 to add a short subsection on limitations of the analogy. This subsection will cover differences in consolidation (e.g., human offline replay and synaptic consolidation versus LLM fine-tuning or retrieval augmentation) and interference management (e.g., biological forgetting curves versus LLM techniques such as memory editing or selective context truncation). The revision will clarify the scope of the design guidance while preserving the high-level parallels that motivate the survey. revision: yes

  2. Referee: [§3] §3 (proposed categorization): the assertion that the three dimensions (object, form, time) are sufficiently independent to generate eight meaningful quadrants lacks explicit justification or empirical check for orthogonality; if form is largely determined by object or time in current LLM architectures (parametric vs. retrieval-based), the scheme reduces to fewer effective dimensions and undermines the organizational contribution.

    Authors: We thank the referee for highlighting the need for stronger justification. In the revised §3 we will add a paragraph explaining the conceptual independence of the three dimensions: 'object' concerns the nature of the stored content, 'form' concerns the representation mechanism, and 'time' concerns retention duration. We will also provide an empirical check by tabulating the distribution of the surveyed papers across the eight quadrants, showing that all quadrants contain distinct contributions and are not trivially reducible. Where current architectures exhibit correlations between dimensions, we will note these as open challenges rather than assuming perfect orthogonality. revision: yes
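The empirical check the authors promise, tabulating surveyed papers across the eight quadrants and flagging empty cells, could look like the following sketch. The per-paper labels here are invented placeholders; real labels would come from the survey's own annotation of the works it covers.

```python
from collections import Counter
from itertools import product

# Hypothetical (object, form, time) labels for a handful of surveyed papers.
papers = [
    ("personal", "non-parametric", "long-term"),
    ("personal", "non-parametric", "short-term"),
    ("system", "parametric", "long-term"),
    ("personal", "parametric", "long-term"),
    ("system", "non-parametric", "short-term"),
]

counts = Counter(papers)
all_quadrants = list(product(("personal", "system"),
                             ("parametric", "non-parametric"),
                             ("short-term", "long-term")))

# Empty quadrants suggest the dimensions may be correlated in practice
# (e.g. form largely determined by object or time), which is the referee's worry.
empty = [q for q in all_quadrants if counts[q] == 0]
print(f"{len(empty)} of {len(all_quadrants)} quadrants are empty")
```

A non-trivial count in every quadrant would support the claimed independence; systematically empty quadrants would indicate the scheme reduces to fewer effective dimensions.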

Circularity Check

0 steps flagged

Survey proposes taxonomy with no derivation chain or self-referential reduction

full rationale

This is a literature survey paper with no mathematical derivations, equations, fitted parameters, or predictions. The core contribution is an analysis of human memory categories mapped to AI systems plus a proposed three-dimension (object, form, time) eight-quadrant organizational scheme for existing work. These are descriptive and constructive proposals resting on external citations rather than any internal definition that reduces to itself or a self-citation load-bearing premise. No step in the provided abstract or described structure exhibits the enumerated circularity patterns; the taxonomy is presented as an independent organizing framework, not derived from or equivalent to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The survey rests on standard domain assumptions about memory processes and literature review practices. No free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: Human memory categories can be meaningfully related to memory mechanisms in LLM-driven AI systems.
    Invoked in the first part of the survey when analyzing categories of human memory and relating them to AI memory.

pith-pipeline@v0.9.0 · 5566 in / 1281 out tokens · 79996 ms · 2026-05-17T11:00:17.372542+00:00 · methodology

discussion (0)


Forward citations

Cited by 19 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Hackers or Hallucinators? A Comprehensive Analysis of LLM-Based Automated Penetration Testing

    cs.CR 2026-04 unverdicted novelty 8.0

    The first SoK on LLM-based AutoPT frameworks provides a six-dimension taxonomy of agent designs and a unified empirical benchmark evaluating 15 frameworks via over 10 billion tokens and 1,500 manually reviewed logs.

  2. Goal-Oriented Reasoning for RAG-based Memory in Conversational Agentic LLM Systems

    cs.AI 2026-05 unverdicted novelty 7.0

    Goal-Mem improves RAG memory retrieval in agentic LLMs by explicit goal decomposition and backward chaining via Natural Language Logic, outperforming nine baselines on multi-hop and implicit inference tasks.

  3. EquiMem: Calibrating Shared Memory in Multi-Agent Debate via Game-Theoretic Equilibrium

    cs.AI 2026-05 unverdicted novelty 7.0

    EquiMem calibrates shared memory in multi-agent debate by computing a game-theoretic equilibrium from agent queries and paths, outperforming heuristics and LLM validators across benchmarks while remaining robust to ad...

  4. Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory

    cs.CL 2026-05 unverdicted novelty 7.0

    MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.

  5. SAGE: A Self-Evolving Agentic Graph-Memory Engine for Structure-Aware Associative Memory

    cs.AI 2026-05 unverdicted novelty 6.0

    SAGE is a self-evolving agentic graph-memory engine that dynamically constructs and refines structured memory graphs via writer-reader feedback, yielding performance gains on multi-hop QA, open-domain retrieval, and l...

  6. The Trap of Trajectory: Towards Understanding and Mitigating Spurious Correlations in Agentic Memory

    cs.LG 2026-05 unverdicted novelty 6.0

    Agentic memory improves clean reasoning but worsens performance when spurious patterns are present in stored trajectories; CAMEL calibration reduces this reliance while preserving clean performance.

  7. Sycophantic AI makes human interaction feel more effortful and less satisfying over time

    cs.HC 2026-05 conditional novelty 6.0

    Sycophantic AI delivers quick emotional support like friends but over weeks shifts users toward AI for advice and reduces satisfaction with real human interactions.

  8. Sycophantic AI makes human interaction feel more effortful and less satisfying over time

    cs.HC 2026-05 unverdicted novelty 6.0

    Longitudinal experiments show sycophantic AI increases reliance on AI for personal advice and lowers satisfaction with real-world social relationships over time.

  9. MemSearch-o1: Empowering Large Language Models with Reasoning-Aligned Memory Growth in Agentic Search

    cs.IR 2026-04 unverdicted novelty 6.0

    MemSearch-o1 mitigates memory dilution in agentic LLM search through reasoning-aligned token-level memory growth, retracing with a contribution function, and path reorganization, improving reasoning activation on benchmarks.

  10. MemSearch-o1: Empowering Large Language Models with Reasoning-Aligned Memory Growth in Agentic Search

    cs.IR 2026-04 unverdicted novelty 6.0

    MemSearch-o1 uses reasoning-aligned memory growth from seed tokens, retracing via contribution functions, and path reorganization to mitigate memory dilution in LLM agentic search.

  11. TSUBASA: Improving Long-Horizon Personalization via Evolving Memory and Self-Learning with Context Distillation

    cs.CL 2026-04 unverdicted novelty 6.0

    TSUBASA improves long-horizon personalization in LLMs via dynamic memory evolution for writing and context-distillation self-learning for reading, outperforming Mem0 and Memory-R1 on Qwen-3 benchmarks while reducing t...

  12. MemReader: From Passive to Active Extraction for Long-Term Agent Memory

    cs.CL 2026-04 unverdicted novelty 6.0

    MemReader uses distilled passive and GRPO-trained active extractors to selectively write low-noise long-term memories, outperforming passive baselines on knowledge updating, temporal reasoning, and hallucination tasks.

  13. HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues

    cs.CL 2026-04 unverdicted novelty 6.0

    HingeMem segments dialogue memory via boundary-triggered hyperedges over four elements and applies query-adaptive retrieval, yielding ~20% relative gains and 68% lower QA token cost versus baselines on LOCOMO.

  14. MerNav: A Highly Generalizable Memory-Execute-Review Framework for Zero-Shot Object Goal Navigation

    cs.CV 2026-02 unverdicted novelty 6.0

    MerNav's Memory-Execute-Review framework improves success rates in zero-shot object goal navigation by 5-8% over baselines on four datasets while outperforming both training-free and supervised methods on key benchmarks.

  15. Understand and Accelerate Memory Processing Pipeline for Disaggregated LLM Inference

    cs.DC 2026-03 unverdicted novelty 5.0

    Unifying LLM memory optimizations into a Prepare-Compute-Retrieve-Apply pipeline and accelerating it on GPU-FPGA hardware yields up to 2.2x faster inference and 4.7x less energy than GPU-only baselines.

  16. MemOS: A Memory OS for AI System

    cs.CL 2025-07 unverdicted novelty 5.0

    MemOS introduces a unified memory management framework for LLMs using MemCubes to handle and evolve different memory types for improved controllability and evolvability.

  17. Memory as Metabolism: A Design for Companion Knowledge Systems

    cs.AI 2026-04 unverdicted novelty 4.0

    This paper designs a companion knowledge system with TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, and AUDIT operations plus memory gravity and minority-hypothesis retention to give contradictory evidence a path to updat...

  18. A Brief Overview: Agentic Reinforcement Learning In Large Language Models

    cs.AI 2026-04 unverdicted novelty 2.0

    The paper surveys the conceptual foundations, methodological innovations, challenges, and future directions of agentic reinforcement learning frameworks that embed cognitive capabilities like meta-reasoning and self-r...

  19. A Brief Overview: Agentic Reinforcement Learning In Large Language Models

    cs.AI 2026-04 unverdicted novelty 2.0

    This review synthesizes conceptual foundations, methods, challenges, and future directions for agentic reinforcement learning in large language models.

Reference graph

Works this paper leans on

155 extracted references · 155 canonical work pages · cited by 16 Pith papers · 25 internal anchors

  1. [1]

    A survey on large language model based autonomous agents

    Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. A survey on large language model based autonomous agents. Frontiers of Computer Science, 18(6):186345, 2024

  2. [2]

    Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

    Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, et al. Personal llm agents: Insights and survey about the capability, efficiency and security. arXiv preprint arXiv:2401.05459, 2024

  3. [3]

    A Survey of Large Language Models

    Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, et al. A survey of large language models. arXiv preprint arXiv:2303.18223, 1(2), 2023

  4. [4]

    A survey on evaluation of large language models

    Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, et al. A survey on evaluation of large language models. ACM transactions on intelligent systems and technology, 15(3):1–45, 2024

  5. [5]

    All roads lead to rome: Unveiling the trajectory of recommender systems across the llm era

    Bo Chen, Xinyi Dai, Huifeng Guo, Wei Guo, Weiwen Liu, Yong Liu, Jiarui Qin, Ruiming Tang, Yichao Wang, Chuhan Wu, et al. All roads lead to rome: Unveiling the trajectory of recommender systems across the llm era. arXiv preprint arXiv:2407.10081, 2024

  6. [6]

    A survey on multi-turn interaction capabilities of large language models

    Chen Zhang, Xinyi Dai, Yaxiong Wu, Qu Yang, Yasheng Wang, Ruiming Tang, and Yong Liu. A survey on multi-turn interaction capabilities of large language models. arXiv preprint arXiv:2501.09959, 2025

  7. [7]

    A Survey on the Memory Mechanism of Large Language Model based Agents

    Zeyu Zhang, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, Quanyu Dai, Jieming Zhu, Zhenhua Dong, and Ji-Rong Wen. A survey on the memory mechanism of large language model based agents. arXiv preprint arXiv:2404.13501, 2024

  8. [8]

    Long term memory: The foundation of ai self-evolution

    Xun Jiang, Feng Li, Han Zhao, Jiaying Wang, Jun Shao, Shihao Xu, Shu Zhang, Weiling Chen, Xavier Tang, Yize Chen, et al. Long term memory: The foundation of ai self-evolution. arXiv preprint arXiv:2410.15665, 2024

  9. [9]

    Human physiology: from cells to systems

    Lauralee Sherwood, Robert Thomas Kell, and Christopher Ward. Human physiology: from cells to systems. Thomson/Brooks/Cole, 2004

  10. [10]

    Llm-powered autonomous agents

    Lilian Weng. Llm-powered autonomous agents. lilianweng.github.io, Jun 2023

  11. [11]

    Why we forget and how to remember better: the science behind memory

    Andrew E Budson and Elizabeth A Kensinger. Why we forget and how to remember better: the science behind memory. Oxford University Press, 2023

  12. [12]

    Working memory, thought, and action, volume 45

    Alan Baddeley. Working memory, thought, and action, volume 45. OUP Oxford, 2007

  13. [13]

    Hipporag: Neurobiologically inspired long-term memory for large language models

    Bernal Jiménez Gutiérrez, Yiheng Shu, Yu Gu, Michihiro Yasunaga, and Yu Su. Hipporag: Neurobiologically inspired long-term memory for large language models. arXiv preprint arXiv:2405.14831, 2024

  14. [14]

    Exploring synaptic resonance in large language models: A novel approach to contextual memory integration

    George Applegarth, Christian Weatherstone, Maximilian Hollingsworth, Henry Middlebrook, and Marcus Irvin. Exploring synaptic resonance in large language models: A novel approach to contextual memory integration. arXiv preprint arXiv:2502.10699, 2025

  15. [15]

    Key-value memory in the brain

    Samuel J Gershman, Ila Fiete, and Kazuki Irie. Key-value memory in the brain. arXiv preprint arXiv:2501.02950, 2025

  16. [16]

    Advances and challenges in foundation agents: From brain-inspired intelligence to evolutionary, collaborative, and safe systems

    Bang Liu, Xinfeng Li, Jiayi Zhang, Jinlin Wang, Tanjin He, Sirui Hong, Hongzhang Liu, Shaokun Zhang, Kaitao Song, Kunlun Zhu, Yuheng Cheng, Suyuchen Wang, Xiaoqiang Wang, Yuyu Luo, Haibo Jin, Peiyan Zhang, Ollie Liu, Jiaqi Chen, Huan Zhang, Zhaoyang Yu, Haochen Shi, Boyan Li, Dekun Wu, Fengwei Teng, Xiaojun Jia, Jiawei Xu, Jinyu Xiang, Yizhang Lin, Tian...

  17. [17]

    Memorybank: Enhancing large language models with long-term memory

    Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, and Yanlin Wang. Memorybank: Enhancing large language models with long-term memory. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 19724–19731, 2024

  18. [18]

    Memory and new controls for chatgpt

    OpenAI. Memory and new controls for chatgpt. openai.com, February 2024

  19. [19]

    Introducing apple intelligence, the personal intelligence system that puts powerful generative models at the core of iphone, ipad, and mac

    Apple. Introducing apple intelligence, the personal intelligence system that puts powerful generative models at the core of iphone, ipad, and mac. apple.com, June 2024

  20. [20]

    mem0: The memory layer for personalized ai

    mem0ai. mem0: The memory layer for personalized ai. mem0.ai, July 2024

  21. [21]

    Memoryscope: Equip your llm chatbot with a powerful and flexible long term memory system

    ModelScope. Memoryscope: Equip your llm chatbot with a powerful and flexible long term memory system. github.com, September 2024

  22. [22]

    Human memory: A proposed system and its control processes

    Richard C Atkinson and Richard M Shiffrin. Human memory: A proposed system and its control processes. In Psychology of learning and motivation, volume 2, pages 89–195. Elsevier, 1968

  23. [23]

    Chain-of-thought prompting elicits reasoning in large language models

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35:24824–24837, 2022

  24. [24]

    ReAct: Synergizing Reasoning and Acting in Language Models

    Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. React: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629, 2022

  25. [25]

    Retrieval-augmented generation for knowledge-intensive nlp tasks

    Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in neural information processing systems, 33:9459–9474, 2020

  26. [26]

    Introducing chatgpt

    OpenAI. Introducing chatgpt. openai.com, November 2022

  27. [27]

    DeepSeek-V3 Technical Report

    Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, et al. Deepseek-v3 technical report. arXiv preprint arXiv:2412.19437, 2024

  28. [28]

    Introducing claude

    Anthropic. Introducing claude. anthropic.com, March 2023

  29. [29]

    Qwen Technical Report

    Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, et al. Qwen technical report. arXiv preprint arXiv:2309.16609, 2023

  30. [30]

    Llama 2: Open Foundation and Fine-Tuned Chat Models

    Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023

  31. [31]

    Gemini: A Family of Highly Capable Multimodal Models

    Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M Dai, Anja Hauth, Katie Millican, et al. Gemini: a family of highly capable multimodal models. arXiv preprint arXiv:2312.11805, 2023

  32. [32]

    Pangu-bot: Efficient generative dialogue pre-training from pre-trained language model

    Fei Mi, Yitong Li, Yulong Zeng, Jingyan Zhou, Yasheng Wang, Chuanfei Xu, Lifeng Shang, Xin Jiang, Shiqi Zhao, and Qun Liu. Pangu-bot: Efficient generative dialogue pre-training from pre-trained language model. arXiv preprint arXiv:2203.17090, 2022

  33. [33]

    ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

    Team GLM, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Dan Zhang, Diego Rojas, Guanyu Feng, Hanlin Zhao, et al. Chatglm: A family of large language models from glm-130b to glm-4 all tools. arXiv preprint arXiv:2406.12793, 2024

  34. [34]

    Openassistant conversations-democratizing large language model alignment

    Andreas Köpf, Yannic Kilcher, Dimitri von Rütte, Sotiris Anagnostidis, Zhi Rui Tam, Keith Stevens, Abdullah Barhoum, Duc Nguyen, Oliver Stanley, Richárd Nagyfi, et al. Openassistant conversations-democratizing large language model alignment. Advances in Neural Information Processing Systems, 36:47669–47681, 2023

  35. [35]

    Recall overview

    Microsoft. Recall overview. microsoft.com, February 2025

  36. [36]

    Ai-native memory: A pathway from llms towards agi

    Jingbo Shang, Zai Zheng, Jiale Wei, Xiang Ying, Felix Tao, and Mindverse Team. Ai-native memory: A pathway from llms towards agi. arXiv preprint arXiv:2406.18312, 2024

  37. [37]

    Beyond short-term memory: How memary makes chatbots remember

    Memary. Beyond short-term memory: How memary makes chatbots remember. github.com, April 2024

  38. [38]

    Langgraph memory service

    langchain ai. Langgraph memory service. github.com, October 2024

  39. [39]

    Charlie mnemonic

    GoodAI. Charlie mnemonic. github.com, March 2024

  40. [40]

    Memobase: User profile-based memory for genai apps

    memodb io. Memobase: User profile-based memory for genai apps. memobase.io, January 2025

  41. [41]

    Letta-AI. Letta. github.com, September 2024

  42. [42]

    Cognee.ai. Cognee. github.com, October 2024

  43. [43]

    Prompted llms as chatbot modules for long open-domain conversation

    Gibbeum Lee, Volker Hartmann, Jongho Park, Dimitris Papailiopoulos, and Kangwook Lee. Prompted llms as chatbot modules for long open-domain conversation. arXiv preprint arXiv:2305.04533, 2023

  44. [44]

    Ret-llm: Towards a general read-write memory for large language models

    Ali Modarressi, Ayyoob Imani, Mohsen Fayyaz, and Hinrich Schütze. Ret-llm: Towards a general read-write memory for large language models. arXiv preprint arXiv:2305.14322, 2023

  45. [45]

    MemGPT: Towards LLMs as Operating Systems

    Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G Patil, Ion Stoica, and Joseph E Gonzalez. Memgpt: Towards llms as operating systems. arXiv preprint arXiv:2310.08560, 2023

  46. [46]

    Knowledge graph tuning: Real-time large language model personalization based on human feedback

    Jingwei Sun, Zhixu Du, and Yiran Chen. Knowledge graph tuning: Real-time large language model personalization based on human feedback. arXiv preprint arXiv:2405.19686, 2024

  47. [47]

    Personalized large language model assistant with evolving conditional memory

    Ruifeng Yuan, Shichao Sun, Yongqi Li, Zili Wang, Ziqiang Cao, and Wenjie Li. Personalized large language model assistant with evolving conditional memory. arXiv preprint arXiv:2312.17257, 2023

  48. [48]

    On memory construction and retrieval for personalized conversational agents

    Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Xufang Luo, Hao Cheng, Dongsheng Li, Yuqing Yang, Chin-Yew Lin, H Vicky Zhao, Lili Qiu, et al. On memory construction and retrieval for personalized conversational agents. arXiv preprint arXiv:2502.05589, 2025

  49. [49]

    Memory3: Language modeling with explicit memory

    Hongkang Yang, Zehao Lin, Wenjin Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, Jinbo Wang, Zeyun Tang, Shichao Song, et al. Memory3: Language modeling with explicit memory. arXiv preprint arXiv:2407.01178, 2024

  50. [50]

    Meminsight: Autonomous memory augmentation for llm agents

    Rana Salama, Jason Cai, Michelle Yuan, Anna Currey, Monica Sunkara, Yi Zhang, and Yassine Benajiba. Meminsight: Autonomous memory augmentation for llm agents. arXiv preprint arXiv:2503.21760, 2025

  51. [51]

    Memochat: Tuning llms to use memos for consistent long-range open-domain conversation

    Junru Lu, Siyu An, Mingbao Lin, Gabriele Pergola, Yulan He, Di Yin, Xing Sun, and Yunsheng Wu. Memochat: Tuning llms to use memos for consistent long-range open-domain conversation. arXiv preprint arXiv:2308.08239, 2023

  52. [52]

    In prospect and retrospect: Reflective memory management for long-term personalized dialogue agents

    Zhen Tan, Jun Yan, I Hsu, Rujun Han, Zifeng Wang, Long T Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, et al. In prospect and retrospect: Reflective memory management for long-term personalized dialogue agents. arXiv preprint arXiv:2503.08026, 2025

  53. [53]

    Hello again! llm-powered personalized agent for long-term dialogue

    Hao Li, Chenghao Yang, An Zhang, Yang Deng, Xiang Wang, and Tat-Seng Chua. Hello again! llm-powered personalized agent for long-term dialogue. arXiv preprint arXiv:2406.05925, 2024

  54. [54]

    A-MEM: Agentic Memory for LLM Agents

    Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, and Yongfeng Zhang. A-mem: Agentic memory for llm agents. arXiv preprint arXiv:2502.12110, 2025

  55. [55]

    Generative agents: Interactive simulacra of human behavior

    Joon Sung Park, Joseph O’Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S Bernstein. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th annual acm symposium on user interface software and technology, pages 1–22, 2023

  56. [56]

    Crafting personalized agents through retrieval-augmented generation on editable memory graphs

    Zheng Wang, Zhongyang Li, Zeren Jiang, Dandan Tu, and Wei Shi. Crafting personalized agents through retrieval-augmented generation on editable memory graphs. arXiv preprint arXiv:2409.19401, 2024

  57. [57]

    Recursively summarizing enables long-term dialogue memory in large language models

    Qingyue Wang, Liang Ding, Yanan Cao, Zhiliang Tian, Shi Wang, Dacheng Tao, and Li Guo. Recursively summarizing enables long-term dialogue memory in large language models. arXiv preprint arXiv:2308.15022, 2023

  58. [58]

    Compress to impress: Unleashing the potential of compressive memory in real-world long-term conversations

    Nuo Chen, Hongguang Li, Juhua Huang, Baoyuan Wang, and Jia Li. Compress to impress: Unleashing the potential of compressive memory in real-world long-term conversations. arXiv preprint arXiv:2402.11975, 2024

  59. [59]

    Chatdb: Augmenting llms with databases as their symbolic memory

    Chenxu Hu, Jie Fu, Chenzhuang Du, Simian Luo, Junbo Zhao, and Hang Zhao. Chatdb: Augmenting llms with databases as their symbolic memory. arXiv preprint arXiv:2306.03901, 2023

  60. [60]

    "My agent understands me better": Integrating dynamic human-like memory recall and consolidation in llm-based agents

    Yuki Hou, Haruki Tamoto, and Homei Miyashita. "My agent understands me better": Integrating dynamic human-like memory recall and consolidation in llm-based agents. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, pages 1–7, 2024

  61. [61]

    From RAG to Memory: Non-Parametric Continual Learning for Large Language Models

    Bernal Jiménez Gutiérrez, Yiheng Shu, Weijian Qi, Sizhe Zhou, and Yu Su. From rag to memory: Non-parametric continual learning for large language models. arXiv preprint arXiv:2502.14802, 2025

  62. [62]

    Egolife: Towards egocentric life assistant

    Jingkang Yang, Shuai Liu, Hongming Guo, Yuhao Dong, Xiamengwei Zhang, Sicheng Zhang, Pengyun Wang, Zitang Zhou, Binzhu Xie, Ziyue Wang, et al. Egolife: Towards egocentric life assistant. arXiv preprint arXiv:2503.03803, 2025

  63. [63]

    Memocrs: Memory-enhanced sequential conversational recommender systems with large language models

    Yunjia Xi, Weiwen Liu, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang, and Yong Yu. Memocrs: Memory-enhanced sequential conversational recommender systems with large language models. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pages 2585–2595, 2024

  64. [64]

    Recmind: Large language model powered agent for recommendation

    Yancheng Wang, Ziyan Jiang, Zheng Chen, Fan Yang, Yingxue Zhou, Eunah Cho, Xing Fan, Xiaojiang Huang, Yanbin Lu, and Yingzhen Yang. Recmind: Large language model powered agent for recommendation. arXiv preprint arXiv:2308.14296, 2023

  65. [65]

    Recagent: A novel simulation paradigm for recommender systems

    Lei Wang, Jingsen Zhang, Xu Chen, Yankai Lin, Ruihua Song, Wayne Xin Zhao, and Ji-Rong Wen. Recagent: A novel simulation paradigm for recommender systems. arXiv preprint arXiv:2306.02552, 2023

  66. [66]

    Recommender ai agent: Integrating large language models for interactive recommendations

    Xu Huang, Jianxun Lian, Yuxuan Lei, Jing Yao, Defu Lian, and Xing Xie. Recommender ai agent: Integrating large language models for interactive recommendations. arXiv preprint arXiv:2308.16505, 2023

  67. [67]

    Enhancing large language model with self-controlled memory framework

    Bing Wang, Xinnian Liang, Jian Yang, Hui Huang, Shuangzhi Wu, Peihao Wu, Lu Lu, Zejun Ma, and Zhoujun Li. Enhancing large language model with self-controlled memory framework. arXiv preprint arXiv:2304.13343, 2023

  68. [68]

    Chatdev: Communicative agents for software development

    Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, et al. Chatdev: Communicative agents for software development. arXiv preprint arXiv:2307.07924, 2024

  69. [69]

    Metaagents: Simulating interactions of human behaviors for llm-based task-oriented coordination via collaborative generative agents

    Yuan Li, Yixuan Zhang, and Lichao Sun. Metaagents: Simulating interactions of human behaviors for llm-based task-oriented coordination via collaborative generative agents. arXiv preprint arXiv:2310.06500, 2023

  70. [70]

    S$^3$: Social-network Simulation System with Large Language Model-Empowered Agents

    Chen Gao, Xiaochong Lan, Zhihong Lu, Jinzhu Mao, Jinghua Piao, Huandong Wang, Depeng Jin, and Yong Li. S$^3$: Social-network simulation system with large language model-empowered agents. arXiv preprint arXiv:2307.14984, 2023

  71. [71]

    Tradinggpt: Multi-agent system with layered memory and distinct characters for enhanced financial trading performance

    Yang Li, Yangyang Yu, Haohang Li, Zhi Chen, and Khaldoun Khashanah. Tradinggpt: Multi-agent system with layered memory and distinct characters for enhanced financial trading performance. arXiv preprint arXiv:2309.03736, 2023

  72. [72]

    Memolet: Reifying the reuse of user-ai conversational memories

    Ryan Yen and Jian Zhao. Memolet: Reifying the reuse of user-ai conversational memories. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, pages 1–22, 2024

  73. [73]

    Memreasoner: A memory-augmented llm architecture for multi-hop reasoning

    Ching-Yun Ko, Sihui Dai, Payel Das, Georgios Kollias, Subhajit Chaudhury, and Aurelie Lozano. Memreasoner: A memory-augmented llm architecture for multi-hop reasoning. In The First Workshop on System-2 Reasoning at Scale, NeurIPS’24, 2024

  74. [74]

    Madial-bench: Towards real-world evaluation of memory-augmented dialogue generation

    Junqing He, Liang Zhu, Rui Wang, Xi Wang, Reza Haffari, and Jiaxing Zhang. Madial-bench: Towards real-world evaluation of memory-augmented dialogue generation. arXiv preprint arXiv:2409.15240, 2024

  75. [75]

    Evaluating Very Long-Term Conversational Memory of LLM Agents

    Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, and Yuwei Fang. Evaluating very long-term conversational memory of llm agents. arXiv preprint arXiv:2402.17753, 2024

  76. [76]

    Memsim: A bayesian simulator for evaluating memory of llm-based personal assistants

    Zeyu Zhang, Quanyu Dai, Luyu Chen, Zeren Jiang, Rui Li, Jieming Zhu, Xu Chen, Yi Xie, Zhenhua Dong, and Ji-Rong Wen. Memsim: A bayesian simulator for evaluating memory of llm-based personal assistants. arXiv preprint arXiv:2409.20163, 2024

  77. [77]

    Interpersonal memory matters: A new task for proactive dialogue utilizing conversational history

    Bowen Wu, Wenqing Wang, Haoran Li, Ying Li, Jingsong Yu, and Baoxun Wang. Interpersonal memory matters: A new task for proactive dialogue utilizing conversational history. arXiv preprint arXiv:2503.05150, 2025

  78. [78]

    Beyond goldfish memory: Long-term open- domain conversation

    Jing Xu, Arthur Szlam, and Jason Weston. Beyond goldfish memory: Long-term open-domain conversation. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio, editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5180–5197, Dublin, Ireland, May 2022. Association for Computational Linguistics

  79. [79]

    Mmrc: A large-scale benchmark for understanding multimodal large language model in real-world conversation

    Haochen Xue, Feilong Tang, Ming Hu, Yexin Liu, Qidong Huang, Yulong Li, Chengzhi Liu, Zhongxing Xu, Chong Zhang, Chun-Mei Feng, et al. Mmrc: A large-scale benchmark for understanding multimodal large language model in real-world conversation. arXiv preprint arXiv:2502.11903, 2025

  80. [80]

    Ego4d: Around the world in 3,000 hours of egocentric video

    Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, et al. Ego4d: Around the world in 3,000 hours of egocentric video. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18995–19012, 2022
