Under a polynomial context-truncation sensitivity assumption, suffix-only KV cache policies require per-token memory scaling as Θ(ε^{-1/α}) to achieve distortion ε.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.IT (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Polynomial Context-Truncation Sensitivity in Autoregressive Language Models: Sequential Wyner-Ziv Bounds for KV Cache Compression
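As a rough illustration of where an ε^{-1/α} rate can come from, the sketch below assumes a hypothetical distortion model in which keeping only the last m tokens of context incurs distortion c·m^{-α}. The model, the constant c, and the function names are assumptions made for illustration and are not taken from the paper. Suffix length m stands in for per-token memory, and the sketch does not reproduce the paper's proof or its lower bound.

```python
import math


def suffix_distortion(m: int, alpha: float, c: float = 1.0) -> float:
    """Assumed polynomial decay: distortion when only the last m tokens are kept."""
    return c * m ** (-alpha)


def min_suffix_length(eps: float, alpha: float, c: float = 1.0) -> int:
    """Smallest integer m with c * m**(-alpha) <= eps, i.e. m >= (c / eps)**(1 / alpha)."""
    return max(1, math.ceil((c / eps) ** (1.0 / alpha)))


if __name__ == "__main__":
    # Smaller alpha (slower decay) demands a much longer retained suffix.
    for alpha in (0.5, 1.0, 2.0):
        m = min_suffix_length(0.03, alpha)
        print(f"alpha={alpha}: m={m}, distortion={suffix_distortion(m, alpha):.5f}")
```

Under this assumed model, halving ε multiplies the required suffix length by roughly 2^{1/α}, which is the growth rate the Θ(ε^{-1/α}) statement describes.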