{"total":13,"items":[{"citing_arxiv_id":"2605.18854","ref_index":8,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Evaluating Memory Condensation Strategies for Coding Agents in Data-Driven Scientific Discovery","primary_cat":"cs.LG","submitted_at":"2026-05-13T13:10:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Empirical evaluation of eight memory condensation strategies on 480 DiscoveryBench tasks finds no significant impact on hypothesis quality but domain-dependent differences in token efficiency.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.09942","ref_index":21,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution","primary_cat":"cs.AI","submitted_at":"2026-05-11T03:41:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HAGE proposes a trainable weighted graph memory framework with LLM intent classification, dynamic edge modulation, and RL optimization that improves long-horizon reasoning accuracy in agentic LLMs over static baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.07358","ref_index":36,"ref_count":4,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications","primary_cat":"cs.IR","submitted_at":"2026-05-08T07:10:26+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":2,"top_context_role":"background","top_context_polarity":"background","context_text":"suitable for storage, retrieval, and orchestration. In practice, JOURNAL OF LATEX CLASS FILES, VOL. 18, NO. 9, SEPTEMBER 2020 4 Agent Skills Skill Representation (§III) Text-Based Reflexion [19], ExpeL [23], BoT [24], ReasoningBank [25], AWM [26], Trace2Skill [27], SayCan [28] , DEPS [29] , Generative Agents [30], GITM [31], RAP [32], Retroformer [33], MemGPT [34], TiM [35], Self-Discover [36], TextGrad [37], FINCON [38], M+ [39], Learned Memory Bank [40], Nemori [41], Intrinsic Memory [42], SkillForge [43] Code-Backed V oyager [12], SkillCraft [44], PolySkill [45], ASI [46], CUA-Skill [47], MetaGPT [6], Eureka [48], DS-Agent [49], LDB [50], CodeAct [51], SWE-agent [52], ToolCoder [53], PSN [54] Hybrid-BasedJARVIS-1 [55], Synapse [56], SkillWeaver [57], AgentSkillOS [58], TPTU [59], talker-reasoner [60], DAMCS [61],"},{"citing_arxiv_id":"2605.05704","ref_index":3,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety","primary_cat":"cs.CR","submitted_at":"2026-05-07T05:50:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SafeHarbor introduces a hierarchical memory-augmented guardrail with adversarial rule extraction and entropy-driven self-evolution to balance safety and utility in LLM agents.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.01970","ref_index":46,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration","primary_cat":"cs.CR","submitted_at":"2026-05-03T17:07:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The paper defines and evaluates Trojan Hippo attacks on LLM agent memory, showing 85-100% success in data exfiltration across backends and reduced rates with defenses at varying utility costs.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Debeshee Das, Julien Piet, Darya Kaviani, Luca Beurer-Kellner, Florian Tramèr, and David Wagner feedback on why it failed and how to improve. Better candidates replace weaker ones in their section of the diversity grid. The loop stops when we reach a perfect score or a cap on iterations [48, 52]. Train and test cases are generated using a modified version of the data generation pipeline from LoCoMo [47], a widely adopted benchmark for evaluating long-term conversational memory in LLM agents. LoCoMo's pipeline generates rich, multi-session con- versational histories grounded in persona statements and temporal event graphs, making it a natural foundation for our benign session content. However, LoCoMo's original format-dialogues between two virtual agents-does not directly fit our email-assistant setting,"},{"citing_arxiv_id":"2605.00702","ref_index":137,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory","primary_cat":"cs.CL","submitted_at":"2026-05-01T14:45:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.06845","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues","primary_cat":"cs.CL","submitted_at":"2026-04-08T09:07:07+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HingeMem segments dialogue memory via boundary-triggered hyperedges over four elements and applies query-adaptive retrieval, yielding ~20% relative gains and 68% lower QA token cost versus baselines on LOCOMO.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.02522","ref_index":136,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Opal: Private Memory for Personal AI","primary_cat":"cs.CR","submitted_at":"2026-04-02T21:23:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Opal enables private long-term memory for personal AI by decoupling reasoning to a trusted enclave with a lightweight knowledge graph and piggybacking reindexing on ORAM accesses.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[134] Jingbiao Mei, Jinghong Chen, Guangyu Yang, Xinyu Hou, Margaret Li, and Bill Byrne. 2026. According to Me: Long-Term Personalized Referential Memory QA.arXiv preprint arXiv:2603.01990(2026). [135] Carlo Meijer and Bernard van Gastel. 2019. Self-Encrypting Decep- tion: Weaknesses in the Encryption of Solid State Drives. InIEEE Symposium on Security and Privacy. IEEE, 72-87. [136] Jesse De Meulemeester, Luca Wilke, David F. Oswald, Thomas Eisen- barth, Ingrid Verbauwhede, and Jo Van Bulck. 2025. BadRAM: Practi- cal Memory Aliasing Attacks on Trusted Execution Environments. InSP. IEEE, 4117-4135. [137] Microsoft. 2025. Breaking Down the Infinite Workday. https://www.microsoft.com/en-us/worklab/work-trend- index/breaking-down-infinite-workday."},{"citing_arxiv_id":"2604.19756","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"WorkflowGen:an adaptive workflow generation mechanism driven by trajectory experience","primary_cat":"cs.LG","submitted_at":"2026-03-22T16:49:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"WorkflowGen reuses trajectory experiences via node-level and workflow-level extraction plus three-tier semantic routing to cut token use over 40% and raise success 20% on medium-similarity queries versus real-time planning baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2504.15965","ref_index":100,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs","primary_cat":"cs.IR","submitted_at":"2025-04-22T15:05:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.","context_count":1,"top_context_role":"method","top_context_polarity":"background","context_text":"we examine system memory and its associated research from both non-parametric and parametric perspectives. Quadrant Dimension Feature Models V System Non-Parametric Short-Term Reasoning & Planning Enhancement ReAct [24], RAP [94], Reflexion [95], Talker-Reasoner [96], TPTU [97] VI System Non-Parametric Long-Term Reflection & Refinement Buffer of Thoughts [98], AWM [99], Think-in-Memory [100], GITM [101], V oyager [102], Retroformer [103], Expel [104], Synapse [105], MetaGPT [106], Learned Memory Bank [107], M+ [108] VII System Parametric Short-Term KV Management LookupFFN [109], ChunkKV [110], vLLM [111], FastServe [112], StreamingLLM [113], Orca [114], DistServe [115], LLM.int8() [116], FastGen [117], Train Large, Then Compress [118], Scissorhands [119],"},{"citing_arxiv_id":"2504.01990","ref_index":261,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems","primary_cat":"cs.AI","submitted_at":"2025-03-31T18:00:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2404.13501","ref_index":97,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Survey on the Memory Mechanism of Large Language Model based Agents","primary_cat":"cs.AI","submitted_at":"2024-04-21T01:49:46+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":3.0,"formal_verification":"none","one_line_summary":"A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"the information across different trials, and the external knowledge. The former two are dynamically 11 Table 1: Summarization of the memory sources. We use ✓ and × to label whether or not the corresponding source is adopted in the model. Models Inside-trial Information Cross-trial Information External Knowledge MemoryBank [6] ✓ × × RET-LLM [7] ✓ × ✓ ChatDB [96] ✓ × ✓ TiM [97] ✓ × × SCM [98] ✓ × × V oyager [99] ✓ × × MemGPT [100] ✓ × × MemoChat [94] ✓ × × MPC [101] ✓ × × Generative Agents [83] ✓ × × RecMind [102] ✓ × ✓ Retroformer [103] ✓ ✓ ✓ ExpeL [82] ✓ ✓ ✓ Synapse [91] ✓ ✓ × GITM [93] ✓ ✓ ✓ ReAct [104] ✓ × ✓ Reflexion [5] ✓ ✓ ✓ RecAgent [95] ✓ × × Character-LLM [105] ✓ × ✓ MAC [106] ✓ × × Huatuo [107] ✓ × ✓ ChatDev [1] ✓ × ×"},{"citing_arxiv_id":"2402.02716","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Understanding the planning of LLM agents: A survey","primary_cat":"cs.AI","submitted_at":"2024-02-05T04:25:24+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A survey that provides a taxonomy of methods for improving planning in LLM-based agents across task decomposition, plan selection, external modules, reflection, and memory.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}