Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
MDN parallelizes stepwise momentum for delta linear attention using geometric reordering and dynamical systems analysis, yielding performance gains over Mamba2 and GDN on 400M and 1.3B models.
Jupiter-N is a post-trained version of Nemotron 3 Super that reports gains on Welsh benchmarks, terminal agent tasks, and instruction following while retaining base capabilities, released openly as a template for sovereign cultural AI adaptation.
citing papers explorer
-
Convergent Evolution: How Different Language Models Learn Similar Number Representations
Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.
-
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention
MDN parallelizes stepwise momentum for delta linear attention using geometric reordering and dynamical systems analysis, yielding performance gains over Mamba2 and GDN on 400M and 1.3B models.
-
Jupiter-N Technical Report
Jupiter-N is a post-trained version of Nemotron 3 Super that reports gains on Welsh benchmarks, terminal agent tasks, and instruction following while retaining base capabilities, released openly as a template for sovereign cultural AI adaptation.