pith:DXFDO6JM
SplitZip: Ultra Fast Lossless KV Compression for Disaggregated LLM Serving
SplitZip achieves over 600 GB/s lossless KV cache compression on GPUs by encoding frequent exponents with fixed-length codes and routing rare ones through a sparse escape stream.
arxiv:2605.01708 v3 · 2026-05-03 · cs.DC · cs.AI · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{DXFDO6JMX7ZMDQ4BVRGCXKJFQB}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
SplitZip achieves 613.3 GB/s compression throughput and 2181.8 GB/s decompression throughput on real BF16 activation tensors, providing up to a 1.32× speedup in BF16 KV cache transfer, a 1.30× speedup in time to first token (TTFT), and a 1.23× increase in request throughput.
The key assumption is that an offline-calibrated top-16 exponent codebook will capture the distribution of exponents in online KV activations during prefill without significant loss in compression ratio or speed, and that the method integrates seamlessly into existing serving frameworks.
SplitZip is a new GPU-friendly lossless compressor for KV cache tensors that exploits exponent redundancy to achieve over 600 GB/s compression throughput and up to 1.32× faster transfers in disaggregated LLM serving.
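The claims above describe a fixed-length code for frequent exponents plus a sparse escape stream for rare ones. The following is a minimal CPU sketch of that idea in Python, not the authors' GPU implementation. All function names (`to_bf16_bits`, `calibrate_codebook`, `compress`, `decompress`) are hypothetical, and the layout is an assumption: a 4-bit code per value, the sign and mantissa bits stored raw, and escapes kept as (position, exponent) pairs.

```python
import struct
from collections import Counter

def to_bf16_bits(x):
    """Truncate a float to BF16 bits (1 sign, 8 exponent, 7 mantissa)."""
    return struct.unpack("<I", struct.pack("<f", x))[0] >> 16

def calibrate_codebook(samples, size=16):
    """Offline step: keep the `size` most frequent exponent bytes."""
    counts = Counter((b >> 7) & 0xFF for b in samples)
    return [exp for exp, _ in counts.most_common(size)]

def compress(values, codebook):
    """Split each BF16 value into a codebook index and raw sign/mantissa bits.

    Exponents missing from the codebook go to a sparse escape list
    (assumed layout) and get a placeholder code of 0.
    """
    index = {exp: i for i, exp in enumerate(codebook)}
    codes, sign_mant, escapes = [], [], []
    for pos, b in enumerate(values):
        exp = (b >> 7) & 0xFF
        sign_mant.append(((b >> 15) << 7) | (b & 0x7F))  # 8 raw bits
        code = index.get(exp)
        if code is None:
            escapes.append((pos, exp))
            code = 0
        codes.append(code)
    return codes, sign_mant, escapes

def decompress(codes, sign_mant, escapes, codebook):
    """Rebuild the exact BF16 bits (assumes a non-empty codebook)."""
    exps = [codebook[c] for c in codes]
    for pos, exp in escapes:  # patch the rare exponents back in
        exps[pos] = exp
    return [((sm >> 7) << 15) | (e << 7) | (sm & 0x7F)
            for sm, e in zip(sign_mant, exps)]

calib = [to_bf16_bits(x) for x in [0.5, 1.0, -2.0, 3.0]]
data = [to_bf16_bits(x) for x in [0.5, 1.0, -2.0, 3.0, 1e-30]]
codebook = calibrate_codebook(calib)
codes, sign_mant, escapes = compress(data, codebook)
restored = decompress(codes, sign_mant, escapes, codebook)
```

In this sketch a 16-entry codebook fits in a 4-bit code, so each non-escaped value costs 4 + 8 = 12 bits instead of 16, before counting the escape stream. The value `1e-30` has an exponent that never appeared during calibration, so it takes the escape path and is still restored bit for bit.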
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-06-25T00:18:14.094134Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
1dca37792cbff2c1c381ac4c2ba925805b9af493c0386e681a0230b4b868b621
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/DXFDO6JMX7ZMDQ4BVRGCXKJFQB \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 1dca37792cbff2c1c381ac4c2ba925805b9af493c0386e681a0230b4b868b621
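The hashing step of the command above can also be written as a standalone Python function, which may be easier to reuse or test. This is a sketch that mirrors the one-liner's serialization settings (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding); the function name `canonical_hash` is hypothetical and not part of any Pith tooling.

```python
import hashlib
import json

def canonical_hash(record):
    """SHA-256 over a compact, key-sorted JSON serialization of `record`."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode()
    return hashlib.sha256(payload).hexdigest()

# Toy input, not the real record: key order does not change the digest.
digest_a = canonical_hash({"b": 1, "a": 2})
digest_b = canonical_hash({"a": 2, "b": 1})
```

To check a real record, pass the parsed `canonical_record` object from the API response to `canonical_hash` and compare the result with the canonical hash listed on this page.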
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "f155781a755c59160d66c77c6c53ce1bb7a5bcaa772ab84b0ab7303920158ce4",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.DC",
    "submitted_at": "2026-05-03T04:22:51Z",
    "title_canon_sha256": "a600cd2c62402b48a1aef292d2a03cb0951e2a728aac1f0c8a528c9756aebb87"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.01708",
    "kind": "arxiv",
    "version": 3
  }
}