GRACE reframes KV cache channel pruning as graph optimization to find a near-optimal subset, achieving 60% compression with negligible degradation and outperforming prior methods.
Hugging- face’s transformers: State-of-the-art natural language processing
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
eess.SP 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Graph-Guided Adaptive Channel Elimination for KV Cache Compression
GRACE reframes KV cache channel pruning as graph optimization to find a near-optimal subset, achieving 60% compression with negligible degradation and outperforming prior methods.