ACE: Pluggable Adaptive Context Elasticizer across Agents

Junchi Yan; Ning Liao; Rongxiang Weng; Xiaoxing Wang; Xue Yang; Xunliang Cai; Yaoming Wang; Zihao Long; Ziyuan Zhuang

arxiv: 2606.31564 · v1 · pith:BSG65WXZnew · submitted 2026-06-30 · 💻 cs.AI

ACE: Pluggable Adaptive Context Elasticizer across Agents

Ning Liao , Zihao Long , Xiaoxing Wang , Xue Yang , Yaoming Wang , Ziyuan Zhuang , Xunliang Cai , Rongxiang Weng

show 1 more author

Junchi Yan

This is my paper

Pith reviewed 2026-07-01 05:27 UTC · model grok-4.3

classification 💻 cs.AI

keywords adaptive context managementLLM agentscontext windowreversible compressionpluggable moduletrajectory lengthReActagent frameworks

0 comments

The pith

ACE lets LLM agents elastically decide per step to keep raw messages, use abstractions, or drop steps while preserving all data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

LLM agents face growing trajectory lengths that exceed fixed context windows, and existing truncation or summarization methods discard information irreversibly. ACE introduces a plug-and-play module that maintains both raw and compressed versions of every historical step in a lossless layer. A separate orchestration layer then assigns each step one of three elastic types—raw, abstract, or drop—based solely on the current task state at each decision point. The design was applied unchanged to four agent frameworks and produced consistent gains over baselines. This reversible approach keeps the context compact yet recoverable when earlier details become relevant again.

Core claim

ACE maintains a lossless message maintenance layer that stores both raw messages and compressed abstractions for each historical step, while a context orchestration layer adaptively assigns each step an elastic type as raw, abstract, or drop at every decision step based on the current task state. This reversible design ensures that the main LLM always receives a compact yet information-rich context. The module was integrated into ReAct, DeepAgent, WebThinker, and MiroFlow without training or architectural modifications and outperformed truncation and summarization baselines across all four frameworks.

What carries the argument

Adaptive Context Elasticizer (ACE) with its lossless message maintenance layer and context orchestration layer that assigns raw/abstract/drop labels to historical steps.

If this is right

ACE integrates into existing agent frameworks without training or code changes.
It consistently outperforms truncation and summarization on agent benchmarks.
Performance gains appear across ReAct, DeepAgent, WebThinker, and MiroFlow.
The reversible storage allows information to be restored if it becomes relevant later.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same elastic layer could support agents on tasks whose trajectories exceed current context windows by larger margins.
Similar lossless-plus-orchestration designs might apply to multi-turn chat systems or long document reasoning.
If the rule-based orchestration proves brittle, replacing it with a small learned policy could be tested directly on the existing maintenance layer.

Load-bearing premise

The orchestration layer can reliably assign raw, abstract, or drop labels to each step using only the current task state, without any additional training or domain-specific tuning.

What would settle it

An experiment on one of the four frameworks where the adaptive label assignments produce lower task success rates than a fixed truncation baseline because needed information was dropped early and could not be recovered.

read the original abstract

The increasing complexity of agentic tasks has led to rapidly growing trajectory lengths, which poses significant challenges for large language model (LLM) based agents with fixed context windows. Existing context management techniques, such as truncation and summarization, suffer from inherent inflexibility and irreversibility: once information is discarded or compressed, it cannot be recovered even when it becomes critically relevant in later decision steps. To address these limitations, we propose the Adaptive Context Elasticizer (ACE), a plug-and-play module that elastically orchestrates historical step information into the agent's context at each decision step. ACE maintains a lossless message maintenance layer that stores both raw messages and compressed abstractions for each historical step, while a context orchestration layer adaptively assigns each step an elastic type as raw, abstract, or drop, at every decision step based on the current task state. This reversible design ensures that the main LLM always receives a compact yet information-rich context. We adapt ACE to four diverse agent frameworks, including ReAct, DeepAgent, WebThinker, and MiroFlow, without training or architectural modifications. Experiments show that ACE consistently outperforms truncation and summarization baselines, and brings consistent performance gains across all four agent frameworks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ACE's dual raw/abstract storage with per-step elastic typing is a clean engineering idea for reversible context in agents, but the orchestration rules for label assignment remain the weakest link.

read the letter

The paper introduces ACE as a plug-and-play layer that keeps both raw historical messages and their compressed abstractions in parallel, then decides at each agent step whether to feed the raw version, the abstract, or nothing. This setup aims to avoid the permanent loss that comes with truncation or one-way summarization.

It does a solid job framing the problem around long-horizon agent trajectories and showing that the same module can sit on top of ReAct, DeepAgent, WebThinker, and MiroFlow without retraining or code changes to those frameworks. That kind of cross-framework compatibility is useful if the gains are real.

The soft spot is exactly where the stress-test note points: the context orchestration layer has to pick raw/abstract/drop from task state alone, yet the abstract gives no equations, pseudocode, or decision criteria. If those rules turn out to be hand-crafted heuristics that were adjusted per framework or domain, the "no modifications" claim does not fully hold. The abstract also asserts consistent outperformance but supplies none of the numbers, baselines details, or statistical checks, so the empirical strength cannot be judged yet.

The work is aimed at people who build or extend LLM agents and need practical context management rather than new theory. A reader who cares about engineering fixes for context windows would find the dual-storage idea worth examining, provided the full paper spells out the assignment logic and reports the actual results with error bars.

It deserves peer review. The problem is concrete, the reversible design is distinct from prior irreversible baselines, and the multi-framework adaptation is a reasonable test even if the current write-up leaves the decision mechanism underspecified.

Referee Report

2 major / 1 minor

Summary. The paper proposes ACE, a plug-and-play module for LLM-based agents that maintains a lossless layer storing both raw messages and compressed abstractions for each historical step while using a context orchestration layer to assign each step an elastic type (raw, abstract, or drop) at every decision step based solely on the current task state. The design is claimed to be reversible and information-rich. The authors adapt ACE to four agent frameworks (ReAct, DeepAgent, WebThinker, MiroFlow) with no training or architectural modifications and report that it consistently outperforms truncation and summarization baselines across all four.

Significance. If the orchestration layer generalizes without hidden tuning and the reported gains are robust, ACE would offer a practical, reversible alternative to irreversible context management techniques, addressing a growing bottleneck in long-horizon agent trajectories.

major comments (2)

[Method (context orchestration layer description)] The central claim that ACE works 'without training or architectural modifications' and 'without any additional ... domain-specific tuning' rests on the context orchestration layer's ability to assign raw/abstract/drop labels reliably from task state alone. No decision rules, pseudocode, or state-to-label mapping is supplied, so the generality of the plug-and-play property cannot be evaluated.
[Experiments section] The abstract states that 'Experiments show that ACE consistently outperforms truncation and summarization baselines' across four frameworks, yet the manuscript supplies no quantitative results, tables of success rates, error bars, statistical tests, or implementation details of the baselines, preventing assessment of the empirical claim.

minor comments (1)

Define all acronyms (e.g., ReAct, ACE) on first use and ensure consistent terminology between the abstract and body.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight areas where additional detail will strengthen the manuscript. We address each major comment below and commit to revisions that provide the requested information without altering the core claims.

read point-by-point responses

Referee: [Method (context orchestration layer description)] The central claim that ACE works 'without training or architectural modifications' and 'without any additional ... domain-specific tuning' rests on the context orchestration layer's ability to assign raw/abstract/drop labels reliably from task state alone. No decision rules, pseudocode, or state-to-label mapping is supplied, so the generality of the plug-and-play property cannot be evaluated.

Authors: We agree that explicit decision rules for the context orchestration layer are necessary to fully substantiate the plug-and-play property. The revised manuscript will include a new subsection with pseudocode and a clear state-to-label mapping. The rules are deterministic heuristics based on task-state features (e.g., step relevance to current goal and remaining context budget) and require no training or domain-specific tuning, consistent with the original design. revision: yes
Referee: [Experiments section] The abstract states that 'Experiments show that ACE consistently outperforms truncation and summarization baselines' across four frameworks, yet the manuscript supplies no quantitative results, tables of success rates, error bars, statistical tests, or implementation details of the baselines, preventing assessment of the empirical claim.

Authors: We acknowledge that the current manuscript draft does not present the quantitative results with sufficient detail. In the revision we will expand the experiments section to include full tables of success rates for all four frameworks, baseline implementations, error bars from repeated runs, and statistical tests, allowing direct evaluation of the reported gains. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical engineering claim with no derivation chain

full rationale

The paper presents ACE as a plug-and-play engineering module validated through experiments on four agent frameworks. No equations, parameters, self-citations, or mathematical derivations appear in the abstract or description. The central claim is an empirical performance result rather than a derivation that reduces to its own inputs by construction. The orchestration layer's decision rules are described at a high level but not derived from prior results or fitted values within the paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical axioms, free parameters, or invented physical entities; the contribution is an architectural module whose correctness rests on empirical performance rather than formal derivation.

pith-pipeline@v0.9.1-grok · 5760 in / 1140 out tokens · 23726 ms · 2026-07-01T05:27:57.232865+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

24 extracted references · 22 canonical work pages · 12 internal anchors

[1]

Longcat-flash technical report.arXiv preprint arXiv:2509.01322, 2025a

Meituan LongCat Team, Bei Li, Bingye Lei, Bo Wang, Bolin Rong, Chao Wang, Chao Zhang, Chen Gao, Chen Zhang, Cheng Sun, et al. Longcat-flash technical report.arXiv preprint arXiv:2509.01322, 2025a. AohanZeng, XinLv, ZhenyuHou, ZhengxiaoDu, QinkaiZheng, BinChen, DaYin, ChendiGe, ChenghuaHuang, Chengxing Xie, et al. Glm-5: from vibe coding to agentic enginee...

work page arXiv
[2]

Kimi K2: Open Agentic Intelligence

10 ACE: Pluggable Adaptive Context Elasticizer across AgentsA Preprint Kimi Team, Yifan Bai, Yiping Bao, Y Charles, Cheng Chen, Guanduo Chen, Haiting Chen, Huarong Chen, Jiahao Chen, Ningxin Chen, et al. Kimi k2: Open agentic intelligence.arXiv preprint arXiv:2507.20534, 2025b. Aili Chen, Aonian Li, Baichuan Zhou, Bangwei Gong, Binyang Jiang, Boji Dan, Ch...

work page internal anchor Pith review Pith/arXiv arXiv
[3]

Autoagent: Evolving cognition and elastic memory orchestration for adaptive agents.arXiv preprint arXiv:2603.09716,

Xiaoxing Wang, Ning Liao, Shikun Wei, Chen Tang, and Feiyu Xiong. Autoagent: Evolving cognition and elastic memory orchestration for adaptive agents.arXiv preprint arXiv:2603.09716,

work page arXiv
[4]

WebGPT: Browser-assisted question-answering with human feedback

Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, et al. Webgpt: Browser-assisted question-answering with human feedback.arXiv preprint arXiv:2112.09332,

work page internal anchor Pith review Pith/arXiv arXiv
[5]

WebSailor: Navigating Super-human Reasoning for Web Agent

Bowen Jin, Hansi Zeng, Zhenrui Yue, Jinsung Yoon, Sercan O Arik, Dong Wang, Hamed Zamani, and Jiawei Han. Search-r1: Training llms to reason and leverage search engines with reinforcement learning. InSecond Conference on Language Modeling. Kuan Li, Zhongwang Zhang, Huifeng Yin, Liwen Zhang, Litu Ou, Jialong Wu, Wenbiao Yin, Baixuan Li, Zhengwei Tao, Xinyu...

work page internal anchor Pith review Pith/arXiv arXiv
[6]

Memory in the Age of AI Agents

Zeyu Zhang, Quanyu Dai, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, Jieming Zhu, Zhenhua Dong, and Ji-Rong Wen. A survey on the memory mechanism of large language model-based agents.ACM Transactions on Information Systems, 43(6):1–47, 2025a. Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi...

work page internal anchor Pith review Pith/arXiv arXiv
[7]

Memory for autonomous llm agents: Mechanisms, evaluation, and emerging frontiers.arXiv preprint arXiv:2603.07670, 2026

Pengfei Du. Memory for autonomous llm agents: Mechanisms, evaluation, and emerging frontiers.arXiv preprint arXiv:2603.07670,

work page arXiv
[8]

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Aixin Liu, Aoxue Mei, Bangcai Lin, Bing Xue, Bingxuan Wang, Bingzheng Xu, Bochao Wu, Bowei Zhang, Chaofan Lin, Chen Dong, et al. Deepseek-v3. 2: Pushing the frontier of open large language models.arXiv preprint arXiv:2512.02556,

work page internal anchor Pith review Pith/arXiv arXiv
[9]

GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0)

Jiaqing Liang, Jinyi Han, Weijia Li, Xinyi Wang, Zhoujia Zhang, Zishang Jiang, Ying Liao, Tingyun Li, Ying Huang, Hao Shen, et al. Genericagent: A token-efficient self-evolving llm agent via contextual information density maximization (v1. 0).arXiv preprint arXiv:2604.17091,

work page internal anchor Pith review Pith/arXiv arXiv
[10]

Mirothinker-1.7 & h1: Towards heavy-duty research agents via verification.arXiv preprint arXiv:2603.15726,

MiroMind Team, S Bai, L Bing, L Lei, R Li, X Li, X Lin, E Min, L Su, B Wang, et al. Mirothinker-1.7 & h1: Towards heavy-duty research agents via verification.arXiv preprint arXiv:2603.15726,

work page arXiv
[11]

ACON: Optimizing Context Compression for Long-horizon LLM Agents

Minki Kang, Wei-Ning Chen, Dongge Han, Huseyin A Inan, Lukas Wutschitz, Yanzhi Chen, Robert Sim, and Saravan Rajmohan. Acon: Optimizing context compression for long-horizon llm agents.arXiv preprint arXiv:2510.00615,

work page internal anchor Pith review Pith/arXiv arXiv
[12]

Resum: Unlocking long-horizon search intelligence via context summarization

Xixi Wu, Kuan Li, Yida Zhao, Liwen Zhang, Litu Ou, Huifeng Yin, Zhongwang Zhang, Xinmiao Yu, Dingchu Zhang, Yong Jiang, et al. Resum: Unlocking long-horizon search intelligence via context summarization. arXiv preprint arXiv:2509.13313, 2025a. Weiwei Sun, Miao Lu, Zhan Ling, Kang Liu, Xuesong Yao, Yiming Yang, and Jiecao Chen. Scaling long-horizon llm age...

work page arXiv
[13]

Memobrain: Executive memory as an agentic brain for reasoning

11 ACE: Pluggable Adaptive Context Elasticizer across AgentsA Preprint Hongjin Qian, Zhao Cao, and Zheng Liu. Memobrain: Executive memory as an agentic brain for reasoning. arXiv preprint arXiv:2601.08079,

work page arXiv
[14]

Mem1: Learning to synergize memory and reasoning for efficient long-horizon agents

Zijian Zhou, Ao Qu, Zhaoxuan Wu, Sunghwan Kim, Alok Prakash, Daniela Rus, Jinhua Zhao, Bryan Kian Hsiang Low, and Paul Pu Liang. Mem1: Learning to synergize memory and reasoning for efficient long-horizon agents. InFirst Workshop on Multi-Turn Interactions in Large Language Models. Guoxin Chen, Zile Qiao, Xuanzhong Chen, Donglei Yu, Haotian Xu, Wayne Xin ...

work page arXiv
[15]

Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks

Yuxiang Zhang, Jiangming Shu, Ye Ma, Xueyuan Lin, Shangxi Wu, and Jitao Sang. Memory as action: Autonomous context curation for long-horizon agentic tasks.arXiv preprint arXiv:2510.12635, 2025b. Rui Ye, Zhongwang Zhang, Kuan Li, Huifeng Yin, Zhengwei Tao, Yida Zhao, Liangcai Su, Liwen Zhang, Zile Qiao, Xinyu Wang, et al. Agentfold: Long-horizon web agents...

work page internal anchor Pith review Pith/arXiv arXiv
[16]

Arc: Active and reflection-driven context management for long-horizon information seeking agents.arXiv preprint arXiv:2601.12030,

Yilun Yao, Shan Huang, Elsie Dai, Zhewen Tan, Zhenyu Duan, Shousheng Jia, Yanbing Jiang, and Tong Yang. Arc: Active and reflection-driven context management for long-horizon information seeking agents.arXiv preprint arXiv:2601.12030,

work page arXiv
[17]

Deepagent: A general reasoning agent with scalable toolsets

Xiaoxi Li, Wenxiang Jiao, Jiarui Jin, Guanting Dong, Jiajie Jin, Yinuo Wang, Hao Wang, Yutao Zhu, Ji-Rong Wen, Yuan Lu, et al. Deepagent: A general reasoning agent with scalable toolsets. InProceedings of the ACM Web Conference 2026, pages 2219–2230, 2026a. Xiaoxi Li, Jiajie Jin, Guanting Dong, Hongjin Qian, Yongkang Wu, Ji-Rong Wen, Yutao Zhu, and Zhiche...

work page arXiv 2026
[18]

Longformer: The Long-Document Transformer

Iz Beltagy, Matthew E Peters, and Arman Cohan. Longformer: The long-document transformer.arXiv preprint arXiv:2004.05150,

work page internal anchor Pith review Pith/arXiv arXiv 2004
[19]

Extending Context Window of Large Language Models via Positional Interpolation

Shouyuan Chen, Sherman Wong, Liangjian Chen, and Yuandong Tian. Extending context window of large language models via positional interpolation.arXiv preprint arXiv:2306.15595,

work page internal anchor Pith review Pith/arXiv arXiv
[20]

Yarn: Efficient context window extension of large language models

Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole. Yarn: Efficient context window extension of large language models. InInternational Conference on Learning Representations, volume 2024, pages 31932–31951,

2024
[21]

LLMs Get Lost In Multi-Turn Conversation

Philippe Laban, Hiroaki Hayashi, Yingbo Zhou, and Jennifer Neville. Llms get lost in multi-turn conversation. arXiv preprint arXiv:2505.06120,

work page internal anchor Pith review Pith/arXiv arXiv
[22]

Gaia: a benchmark for general ai assistants

Grégoire Mialon, Clémentine Fourrier, Thomas Wolf, Yann LeCun, and Thomas Scialom. Gaia: a benchmark for general ai assistants. InInternational Conference on Learning Representations, volume 2024, pages 9025–9049,

2024
[23]

Humanity's Last Exam

12 ACE: Pluggable Adaptive Context Elasticizer across AgentsA Preprint Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, et al. Humanity’s last exam.arXiv preprint arXiv:2501.14249,

work page internal anchor Pith review Pith/arXiv arXiv
[24]

Webwalker: Benchmarking llms in web traversal

Jialong Wu, Wenbiao Yin, Yong Jiang, Zhenglin Wang, Zekun Xi, Runnan Fang, Linhai Zhang, Yulan He, Deyu Zhou, Pengjun Xie, et al. Webwalker: Benchmarking llms in web traversal. InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10290–10305, 2025b. Kaiyuan Chen, Yixin Ren, Yang Liu, Xiao...

work page arXiv

[1] [1]

Longcat-flash technical report.arXiv preprint arXiv:2509.01322, 2025a

Meituan LongCat Team, Bei Li, Bingye Lei, Bo Wang, Bolin Rong, Chao Wang, Chao Zhang, Chen Gao, Chen Zhang, Cheng Sun, et al. Longcat-flash technical report.arXiv preprint arXiv:2509.01322, 2025a. AohanZeng, XinLv, ZhenyuHou, ZhengxiaoDu, QinkaiZheng, BinChen, DaYin, ChendiGe, ChenghuaHuang, Chengxing Xie, et al. Glm-5: from vibe coding to agentic enginee...

work page arXiv

[2] [2]

Kimi K2: Open Agentic Intelligence

10 ACE: Pluggable Adaptive Context Elasticizer across AgentsA Preprint Kimi Team, Yifan Bai, Yiping Bao, Y Charles, Cheng Chen, Guanduo Chen, Haiting Chen, Huarong Chen, Jiahao Chen, Ningxin Chen, et al. Kimi k2: Open agentic intelligence.arXiv preprint arXiv:2507.20534, 2025b. Aili Chen, Aonian Li, Baichuan Zhou, Bangwei Gong, Binyang Jiang, Boji Dan, Ch...

work page internal anchor Pith review Pith/arXiv arXiv

[3] [3]

Autoagent: Evolving cognition and elastic memory orchestration for adaptive agents.arXiv preprint arXiv:2603.09716,

Xiaoxing Wang, Ning Liao, Shikun Wei, Chen Tang, and Feiyu Xiong. Autoagent: Evolving cognition and elastic memory orchestration for adaptive agents.arXiv preprint arXiv:2603.09716,

work page arXiv

[4] [4]

WebGPT: Browser-assisted question-answering with human feedback

Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, et al. Webgpt: Browser-assisted question-answering with human feedback.arXiv preprint arXiv:2112.09332,

work page internal anchor Pith review Pith/arXiv arXiv

[5] [5]

WebSailor: Navigating Super-human Reasoning for Web Agent

Bowen Jin, Hansi Zeng, Zhenrui Yue, Jinsung Yoon, Sercan O Arik, Dong Wang, Hamed Zamani, and Jiawei Han. Search-r1: Training llms to reason and leverage search engines with reinforcement learning. InSecond Conference on Language Modeling. Kuan Li, Zhongwang Zhang, Huifeng Yin, Liwen Zhang, Litu Ou, Jialong Wu, Wenbiao Yin, Baixuan Li, Zhengwei Tao, Xinyu...

work page internal anchor Pith review Pith/arXiv arXiv

[6] [6]

Memory in the Age of AI Agents

Zeyu Zhang, Quanyu Dai, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, Jieming Zhu, Zhenhua Dong, and Ji-Rong Wen. A survey on the memory mechanism of large language model-based agents.ACM Transactions on Information Systems, 43(6):1–47, 2025a. Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi...

work page internal anchor Pith review Pith/arXiv arXiv

[7] [7]

Memory for autonomous llm agents: Mechanisms, evaluation, and emerging frontiers.arXiv preprint arXiv:2603.07670, 2026

Pengfei Du. Memory for autonomous llm agents: Mechanisms, evaluation, and emerging frontiers.arXiv preprint arXiv:2603.07670,

work page arXiv

[8] [8]

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Aixin Liu, Aoxue Mei, Bangcai Lin, Bing Xue, Bingxuan Wang, Bingzheng Xu, Bochao Wu, Bowei Zhang, Chaofan Lin, Chen Dong, et al. Deepseek-v3. 2: Pushing the frontier of open large language models.arXiv preprint arXiv:2512.02556,

work page internal anchor Pith review Pith/arXiv arXiv

[9] [9]

GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0)

Jiaqing Liang, Jinyi Han, Weijia Li, Xinyi Wang, Zhoujia Zhang, Zishang Jiang, Ying Liao, Tingyun Li, Ying Huang, Hao Shen, et al. Genericagent: A token-efficient self-evolving llm agent via contextual information density maximization (v1. 0).arXiv preprint arXiv:2604.17091,

work page internal anchor Pith review Pith/arXiv arXiv

[10] [10]

Mirothinker-1.7 & h1: Towards heavy-duty research agents via verification.arXiv preprint arXiv:2603.15726,

MiroMind Team, S Bai, L Bing, L Lei, R Li, X Li, X Lin, E Min, L Su, B Wang, et al. Mirothinker-1.7 & h1: Towards heavy-duty research agents via verification.arXiv preprint arXiv:2603.15726,

work page arXiv

[11] [11]

ACON: Optimizing Context Compression for Long-horizon LLM Agents

Minki Kang, Wei-Ning Chen, Dongge Han, Huseyin A Inan, Lukas Wutschitz, Yanzhi Chen, Robert Sim, and Saravan Rajmohan. Acon: Optimizing context compression for long-horizon llm agents.arXiv preprint arXiv:2510.00615,

work page internal anchor Pith review Pith/arXiv arXiv

[12] [12]

Resum: Unlocking long-horizon search intelligence via context summarization

Xixi Wu, Kuan Li, Yida Zhao, Liwen Zhang, Litu Ou, Huifeng Yin, Zhongwang Zhang, Xinmiao Yu, Dingchu Zhang, Yong Jiang, et al. Resum: Unlocking long-horizon search intelligence via context summarization. arXiv preprint arXiv:2509.13313, 2025a. Weiwei Sun, Miao Lu, Zhan Ling, Kang Liu, Xuesong Yao, Yiming Yang, and Jiecao Chen. Scaling long-horizon llm age...

work page arXiv

[13] [13]

Memobrain: Executive memory as an agentic brain for reasoning

11 ACE: Pluggable Adaptive Context Elasticizer across AgentsA Preprint Hongjin Qian, Zhao Cao, and Zheng Liu. Memobrain: Executive memory as an agentic brain for reasoning. arXiv preprint arXiv:2601.08079,

work page arXiv

[14] [14]

Mem1: Learning to synergize memory and reasoning for efficient long-horizon agents

Zijian Zhou, Ao Qu, Zhaoxuan Wu, Sunghwan Kim, Alok Prakash, Daniela Rus, Jinhua Zhao, Bryan Kian Hsiang Low, and Paul Pu Liang. Mem1: Learning to synergize memory and reasoning for efficient long-horizon agents. InFirst Workshop on Multi-Turn Interactions in Large Language Models. Guoxin Chen, Zile Qiao, Xuanzhong Chen, Donglei Yu, Haotian Xu, Wayne Xin ...

work page arXiv

[15] [15]

Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks

Yuxiang Zhang, Jiangming Shu, Ye Ma, Xueyuan Lin, Shangxi Wu, and Jitao Sang. Memory as action: Autonomous context curation for long-horizon agentic tasks.arXiv preprint arXiv:2510.12635, 2025b. Rui Ye, Zhongwang Zhang, Kuan Li, Huifeng Yin, Zhengwei Tao, Yida Zhao, Liangcai Su, Liwen Zhang, Zile Qiao, Xinyu Wang, et al. Agentfold: Long-horizon web agents...

work page internal anchor Pith review Pith/arXiv arXiv

[16] [16]

Arc: Active and reflection-driven context management for long-horizon information seeking agents.arXiv preprint arXiv:2601.12030,

Yilun Yao, Shan Huang, Elsie Dai, Zhewen Tan, Zhenyu Duan, Shousheng Jia, Yanbing Jiang, and Tong Yang. Arc: Active and reflection-driven context management for long-horizon information seeking agents.arXiv preprint arXiv:2601.12030,

work page arXiv

[17] [17]

Deepagent: A general reasoning agent with scalable toolsets

Xiaoxi Li, Wenxiang Jiao, Jiarui Jin, Guanting Dong, Jiajie Jin, Yinuo Wang, Hao Wang, Yutao Zhu, Ji-Rong Wen, Yuan Lu, et al. Deepagent: A general reasoning agent with scalable toolsets. InProceedings of the ACM Web Conference 2026, pages 2219–2230, 2026a. Xiaoxi Li, Jiajie Jin, Guanting Dong, Hongjin Qian, Yongkang Wu, Ji-Rong Wen, Yutao Zhu, and Zhiche...

work page arXiv 2026

[18] [18]

Longformer: The Long-Document Transformer

Iz Beltagy, Matthew E Peters, and Arman Cohan. Longformer: The long-document transformer.arXiv preprint arXiv:2004.05150,

work page internal anchor Pith review Pith/arXiv arXiv 2004

[19] [19]

Extending Context Window of Large Language Models via Positional Interpolation

Shouyuan Chen, Sherman Wong, Liangjian Chen, and Yuandong Tian. Extending context window of large language models via positional interpolation.arXiv preprint arXiv:2306.15595,

work page internal anchor Pith review Pith/arXiv arXiv

[20] [20]

Yarn: Efficient context window extension of large language models

Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole. Yarn: Efficient context window extension of large language models. InInternational Conference on Learning Representations, volume 2024, pages 31932–31951,

2024

[21] [21]

LLMs Get Lost In Multi-Turn Conversation

Philippe Laban, Hiroaki Hayashi, Yingbo Zhou, and Jennifer Neville. Llms get lost in multi-turn conversation. arXiv preprint arXiv:2505.06120,

work page internal anchor Pith review Pith/arXiv arXiv

[22] [22]

Gaia: a benchmark for general ai assistants

Grégoire Mialon, Clémentine Fourrier, Thomas Wolf, Yann LeCun, and Thomas Scialom. Gaia: a benchmark for general ai assistants. InInternational Conference on Learning Representations, volume 2024, pages 9025–9049,

2024

[23] [23]

Humanity's Last Exam

12 ACE: Pluggable Adaptive Context Elasticizer across AgentsA Preprint Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, et al. Humanity’s last exam.arXiv preprint arXiv:2501.14249,

work page internal anchor Pith review Pith/arXiv arXiv

[24] [24]

Webwalker: Benchmarking llms in web traversal

Jialong Wu, Wenbiao Yin, Yong Jiang, Zhenglin Wang, Zekun Xi, Runnan Fang, Linhai Zhang, Yulan He, Deyu Zhou, Pengjun Xie, et al. Webwalker: Benchmarking llms in web traversal. InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10290–10305, 2025b. Kaiyuan Chen, Yixin Ren, Yang Liu, Xiao...

work page arXiv