Training-free context-adaptive attention for efficient long context modeling. CoRR, abs/2512.09238.

Zeng You, Yaofo Chen, Shuhai Zhang, Zhijie Qiu, Tingyu Wu, Yingjian Li, Yaowei Wang, Mingkui Tan · arXiv 2512.09238

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

NLL-Guided Full-Attention Layer Selection for Training-Free Sliding-Window Adaptation

cs.CL · 2026-06-26 · unverdicted · novelty 7.0

NLL-guided layer selection identifies 1/4 of layers for full attention in hybrid models, matching periodic 1/2-FA baseline accuracy on LongMemEval with Qwen3-4B while halving the full-attention compute budget.

citing papers explorer

Showing 1 of 1 citing paper.

NLL-Guided Full-Attention Layer Selection for Training-Free Sliding-Window Adaptation · cs.CL · 2026-06-26 · unverdicted · none · ref 8
