FR-Spec: Accelerating large-vocabulary language models via frequency-ranked speculative sampling

Weilin Zhao, Tengyu Pan, Xu Han, Yudi Zhang, Ao Sun, Yuxiang Huang, Kaihuo Zhang, Weilun Zhao, Yuxuan Li, Jianyong Wang, Zhiyuan Liu, Maosong Sun · 2025 · arXiv 2502.14856

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

SlimSpec: Low-Rank Draft LM-Head for Accelerated Speculative Decoding

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

SlimSpec replaces the standard LM-head in draft models with a low-rank version to deliver 4-5x faster speculative decoding while preserving full vocabulary and competitive acceptance rates.

citing papers explorer

SlimSpec: Low-Rank Draft LM-Head for Accelerated Speculative Decoding

cs.LG · 2026-05-11 · unverdicted · none · ref 11
