Title resolution pending

Haoyang Li, Zhanchao Xu, Yiming Li, Xuejia Chen, Darian Li, Anxin Tian, Qingfa Xiao, Cheng Deng, Jun Wang, Qing Li, et al. · 2025 · arXiv 2507.13681

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

From Rigid to Dynamic: Entropy-Guided Adaptive Inference for Long-Context LLMs

cs.AI · 2026-06-08 · unverdicted · novelty 6.0

EntropyInfer adaptively allocates inference compute using per-head attention entropy for rigid/dynamic classification during prefilling and compresses KV cache with generated tokens, achieving up to 2.39x speedup on long contexts.

Multi-Segment Attention: Enabling Efficient KV-Cache Management for Faster Large Language Model Serving

cs.AR · 2026-06-01 · unverdicted · novelty 5.0

AsymCache combines Multi-Segment Attention, position-aware eviction, and adaptive chunking to cut TTFT by up to 2.03x and TPOT by up to 1.71x versus recent baselines in LLM serving.

citing papers explorer

From Rigid to Dynamic: Entropy-Guided Adaptive Inference for Long-Context LLMs

cs.AI · 2026-06-08 · unverdicted · none · ref 22

EntropyInfer adaptively allocates inference compute using per-head attention entropy for rigid/dynamic classification during prefilling and compresses KV cache with generated tokens, achieving up to 2.39x speedup on long contexts.
