Less Is More: Elevating RAG via Performance-Driven Context Compression

Bowei He; Chen Ma; Jiamin Chen; Peiyang Liu; Shiwei Li; Xing Tang; Xiuqiang He; Yansen Zhang; Yunpeng Weng; Ziqiang Cui

arxiv: 2508.19282 · v4 · pith:NOURAJGMnew · submitted 2025-08-24 · 💻 cs.CL · cs.AI

Ziqiang Cui, Yunpeng Weng, Xing Tang, Peiyang Liu, Shiwei Li, Bowei He, Jiamin Chen, Yansen Zhang, Xiuqiang He, Chen Ma



keywords: compression, context, heuristics, performance, compressor, core, documents, framework


Retrieval-Augmented Generation (RAG) has emerged as a promising paradigm for improving the timeliness of knowledge updates and the factual accuracy of large language models. However, incorporating a large volume of retrieved documents significantly increases input length, leading to prohibitive computational costs. Existing compression approaches often compromise task performance, primarily due to their reliance on predefined heuristics. These heuristics fail to ensure that the compressed context is conducive to the generation tasks. To address these limitations, we propose CORE-RAG, a novel framework for context compression in RAG systems. CORE eliminates reliance on proxy heuristics through a performance-driven learning framework, which directly utilizes task performance as a feedback signal to iteratively refine the compressor policy. Prior to this optimization process, we incorporate a knowledge distillation phase to initialize the compressor with a robust policy. Extensive experiments demonstrate the superiority of our approach. At a high compression ratio of 3%, CORE not only avoids performance degradation but also improves the average Exact Match (EM) score by 3.3 points compared to using full documents. Our code is available at https://github.com/ziqiangcui/CORE-RAG-ICML26.
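
To make the two quantities in the abstract concrete, here is a minimal sketch of how an Exact Match score and a compression ratio are commonly computed. It is not taken from the CORE code base. The SQuAD-style answer normalization and the reading of "compression ratio" as compressed length divided by original length are assumptions, and the function names `normalize_answer`, `exact_match`, and `compression_ratio` are hypothetical.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the paper may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> float:
    """Return 1.0 if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(gold) for gold in gold_answers))


def compression_ratio(compressed_tokens: int, original_tokens: int) -> float:
    """Fraction of the original context length kept after compression.

    Assumption: ratio = compressed length / original length, so 0.03 means 3%.
    """
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return compressed_tokens / original_tokens
```

Under these assumptions, a compressor that keeps 30 of 1,000 retrieved tokens has a ratio of 0.03. In a performance-driven setup, a per-example score such as `exact_match` on the generator's answer could serve as the feedback signal for the compressor.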


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

From Player to Master: Enhancing Test-Time Learning of LLM Agents via Reinforcement Learning over Memory
cs.CL 2026-06 unverdicted novelty 6.0

MemoPilot trains memory updates for LLM agents via multi-turn GRPO on RPS and poker, achieving top Elo scores and outperforming baselines including DeepSeek-V3.2.