TWLA is a post-training quantization (PTQ) method that combines E2M-ATQ, KOTMS, and ILA-AMP to quantize LLMs to W1.58A4 (ternary weights, 4-bit activations) while maintaining accuracy.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
TWLA: Achieving Ternary Weights and Low-Bit Activations for LLMs via Post-Training Quantization