TASA improves task-aware mixed-precision LLM quantization by searching calibration data mixtures via gradient-trace alignment and aggregating perplexity plus reasoning sensitivity signals, enabling 3.5-bit models to match or beat 4-bit baselines with over 20-point gains on GSM8K.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
Beyond Activation Alignment:The Alignment-Diversity Tradeoff in Task-Aware LLM Quantization
TASA improves task-aware mixed-precision LLM quantization by searching calibration data mixtures via gradient-trace alignment and aggregating perplexity plus reasoning sensitivity signals, enabling 3.5-bit models to match or beat 4-bit baselines with over 20-point gains on GSM8K.