LLMs process negation using both attention-based suppression and constructive representation mechanisms (construction dominant), with late-layer attention shortcuts explaining poor accuracy on negation tasks.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
fields
cs.CL 3years
2026 3representative citing papers
Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.
citing papers explorer
-
How Language Models Process Negation
LLMs process negation using both attention-based suppression and constructive representation mechanisms (construction dominant), with late-layer attention shortcuts explaining poor accuracy on negation tasks.
-
Convergent Evolution: How Different Language Models Learn Similar Number Representations
Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.
- Copy First, Translate Later: Interpreting Translation Dynamics in Multilingual Pretraining