Sherry: Hardware-efficient 1.25-bit ternary quantization via fine-grained sparsification

Hong Huang, Decheng Wu, Qiangqiang Hu, Guanghua Yu, Jinhai Yang, Jianchen Zhu, Xue Liu, Dapeng Wu · 2026 · arXiv 2601.07892

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

DuQuant++: Fine-grained Rotation Enhances Microscaling FP4 Quantization

cs.CV · 2026-04-20 · unverdicted · novelty 4.0

DuQuant++ adapts outlier-aware fine-grained rotation to MXFP4 by matching the rotation block size to the 32-element microscaling group, so that a single rotation smooths the distributions. It reports state-of-the-art performance on LLaMA-3 at lower cost.

Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild

cs.CL · 2026-05-21 · unverdicted · novelty 3.0 · 2 refs

Hy-MT2 presents three new multilingual translation models that are claimed to outperform the listed open-source and commercial systems on diverse tasks while enabling low-storage on-device use.
