PolyServe: Efficient Multi-SLO Serving at Scale

Kan Zhu, Haiyang Shi, Le Xu, Jiaxin Shan, Arvind Krishnamurthy, Baris Kasikci, Liguang Xie · 2025 · arXiv 2507.17769

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Beyond Greedy Chunking: SLO-Aware Sliding-Window Scheduling for LLM Inference

cs.DC · 2026-06-04 · unverdicted · novelty 6.0

SlidingServe achieves up to 30% higher service capacity and 16-53% fewer SLO violations in LLM inference by using dynamic chunking and priority-based batch construction.

citing papers explorer

Beyond Greedy Chunking: SLO-Aware Sliding-Window Scheduling for LLM Inference

cs.DC · 2026-06-04 · unverdicted · none · ref 30
