EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

Yuhui Li, Fangyun Wei, Chao Zhang, Hongyang Zhang · 2025

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Rethinking Network Topologies for Cost-Effective Mixture-of-Experts LLM Serving

cs.NI · 2026-04-30 · unverdicted · novelty 6.0

Switchless topologies such as 3D full-mesh are 20.6-56.2% more cost-effective than scale-up networks for MoE LLM serving, with current link bandwidths over-provisioned by up to 27%.

STAR: Decode-Phase Rescheduling for LLM Inference

cs.DC · 2025-10-15 · unverdicted · novelty 5.0

STAR cuts P99 TPOT by 75.1% and raises goodput 2.63x via a lightweight hidden-state length predictor and dynamic decode rescheduling that combines current and predicted loads.

citing papers explorer

Showing 2 of 2 citing papers.

Rethinking Network Topologies for Cost-Effective Mixture-of-Experts LLM Serving

cs.NI · 2026-04-30 · unverdicted · none · ref 34

STAR: Decode-Phase Rescheduling for LLM Inference

cs.DC · 2025-10-15 · unverdicted · none · ref 22
