SINQ: Sinkhorn-normalized quantization for calibration-free low-precision LLM weights. arXiv preprint arXiv:2509.22944, 2025.
2 Pith papers cite this work.
Fields: cs.LG (2)
Years: 2026 (2)
citing papers explorer
- Widening the Gap: Exploiting LLM Quantization via Outlier Injection
  The paper introduces an outlier-injection attack that induces targeted weight collapse in LLMs under advanced quantization schemes including AWQ, GPTQ, and GGUF I-quants.
- KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks
  KVarN uses Hadamard rotation plus dual-axis variance normalization on K and V matrices to cut token-scale errors and error accumulation in KV-cache quantization, reaching new SOTA at 2-bit on MATH500, AIME24 and HumanEval.