CAS trains composable per-document KV cache cartridges via dynamic distractor mixing and a rotating budget manager, scaling to million-token collections with 10-31 point gains over monolithic cartridges and matching RAG at 3-4x lower token cost.
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1269–1278, Online
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
VaSE improves KV cache eviction accuracy for reasoning models by over 4% versus prior eviction methods at 4x compression through value-magnitude protection and stochastic diversity.
citing papers explorer
- Cartridges at Scale: Training Modular KV Caches over Large Document Collections
  CAS trains composable per-document KV cache cartridges via dynamic distractor mixing and a rotating budget manager, scaling to million-token collections with 10-31 point gains over monolithic cartridges and matching RAG at 3-4x lower token cost.
- Value-Aware Stochastic KV Cache Eviction for Reasoning Models
  VaSE improves KV cache eviction accuracy for reasoning models by over 4% versus prior eviction methods at 4x compression through value-magnitude protection and stochastic diversity.