2 Pith papers cite this work. Polarity classification is still indexing.

Fields: cs.LG
Years: 2026

Representative citing papers:
LAQuant improves long-decoding accuracy of quantized reasoning models such as Qwen3-4B, gaining 15 percentage points on AIME25 via a layer-wise lookahead loss while achieving a 3.42× speedup over FP16.
Grid Games: The Power of Multiple Grids for Quantizing Large Language Models
Allowing each quantization group to select among multiple 4-bit grids improves accuracy over single-grid FP4 for both post-training and pre-training of LLMs.
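The multi-grid idea can be sketched as follows: quantize each weight group against several candidate 4-bit grids and keep whichever grid reconstructs the group with the least error. Below is a minimal sketch; the specific grid values, the per-group max-abs scaling, and the MSE selection criterion are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Two illustrative 4-bit grids (hypothetical, not the paper's grids):
# a uniform INT4-style grid and an FP4 (E2M1)-style grid.
GRIDS = [
    np.linspace(-1.0, 1.0, 16),
    np.array([-6, -4, -3, -2, -1.5, -1, -0.5, 0,
              0.5, 1, 1.5, 2, 3, 4, 6], dtype=np.float64),
]

def quantize_group(w, grids=GRIDS):
    """Quantize one weight group against each candidate grid and
    return the result for the grid with the lowest MSE."""
    scale = np.max(np.abs(w)) or 1.0          # per-group max-abs scale
    best = None
    for gid, grid in enumerate(grids):
        g = grid / np.max(np.abs(grid))       # normalize grid to [-1, 1]
        # snap each scaled weight to its nearest grid point
        idx = np.abs(w[:, None] / scale - g[None, :]).argmin(axis=1)
        wq = g[idx] * scale                   # dequantized reconstruction
        err = np.mean((w - wq) ** 2)
        if best is None or err < best[0]:
            best = (err, gid, wq)
    return best                               # (mse, chosen grid id, wq)

rng = np.random.default_rng(0)
w = rng.normal(size=32)                       # one quantization group
mse, gid, wq = quantize_group(w)
```

Because the grid choice is made independently per group, the per-group overhead is only a small grid-id field alongside the usual scale, while groups whose value distribution suits a non-uniform grid are no longer forced onto a single shared FP4 grid.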