Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
MosaicKV achieves up to 16x attention speedup, 4.8x lower decode latency, 7.3x higher throughput, and 3x memory reduction with 1.76% accuracy loss via dynamic two-D KV cache compression and management on H800 GPUs.
citing papers explorer
- When RAG Meets Query Planning: Logical Query Trees for Resolving Exploratory Reasoning Problems
PlanRAG models exploratory reasoning problems as logical query trees, uses dynamic programming with a cost model to build them, and executes iterative retrieval-generation over the trees, outperforming prior RAG methods on the new WikiWeb-ERP dataset.
- MosaicKV: Serving Long-Context LLM with Dynamic Two-D KV Cache Compression
MosaicKV achieves up to 16x attention speedup, 4.8x lower decode latency, 7.3x higher throughput, and 3x memory reduction with 1.76% accuracy loss via dynamic two-D KV cache compression and management on H800 GPUs.