
arXiv preprint arXiv:2102.10882 (2021)

12 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary: method 1

citation-polarity summary: use method 1


representative citing papers

Massive Activations in Large Language Models

cs.CL · 2024-02-27 · unverdicted · novelty 7.0

Massive activations are constant large values in LLMs that function as indispensable bias terms and concentrate attention probabilities on specific tokens.

End-to-End Context Compression at Scale

cs.CL · 2026-06-08 · unverdicted · novelty 6.0

LCLMs are scaled 0.6B-encoder 4B-decoder compressors pre-trained on over 350B tokens that improve the Pareto frontier for general-task performance, compression speed, and peak memory in long-context language model inference.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • Massive Activations in Large Language Models cs.CL · 2024-02-27 · unverdicted · none · ref 14


  • End-to-End Context Compression at Scale cs.CL · 2026-06-08 · unverdicted · none · ref 12
