SAGE-PTQ is a graph-guided ultra-low-bit PTQ framework that achieves 1.03 average weight bits and 0.004 scaling bits per matrix on LLMs while reporting lower perplexity and memory use than BiLLM and PB-LLM.
Post-training quantization for vision transformer. arXiv preprint arXiv:2208.13555, 2022.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models