Compactor: Calibrated Query-Agnostic
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
Representative citing papers
CAS trains composable per-document KV cache cartridges via dynamic distractor mixing and a rotating budget manager, scaling to million-token collections with 10-31 point gains over monolithic cartridges and matching RAG at 3-4x lower token cost.
VaSE improves KV cache eviction accuracy for reasoning models by over 4% versus prior eviction methods at 4x compression through value-magnitude protection and stochastic diversity.
Document LoRA acts as decoding-time parametric memory that recovers 13-21 ROUGE-L points under heavy KV cache compression in QA, performing best when the base model encodes the document and the adapter is used only at generation with QA supervision.
citing papers explorer
- Still: Amortized KV Cache Compaction in a Single Forward Pass
Still is an amortized per-layer Perceiver that synthesizes compact KV caches in one forward pass, outperforming selection and per-context baselines on RULER, HELMET, and LongBench at 8-200x compression.
- Value-Aware Stochastic KV Cache Eviction for Reasoning Models
VaSE improves KV cache eviction accuracy for reasoning models by over 4% versus prior eviction methods at 4x compression through value-magnitude protection and stochastic diversity.