Title resolution pending

Tri Dao, Albert Gu · 2024

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Convergent Evolution: How Different Language Models Learn Similar Number Representations

cs.CL · 2026-04-22 · unverdicted · novelty 6.0

Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.

MDN: Parallelizing Stepwise Momentum for Delta Linear Attention

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

MDN parallelizes stepwise momentum for delta linear attention using geometric reordering and dynamical systems analysis, yielding performance gains over Mamba2 and GDN on 400M and 1.3B models.

Jupiter-N Technical Report

cs.CL · 2026-04-19 · unverdicted · novelty 5.0

Jupiter-N is a post-trained version of Nemotron 3 Super that reports gains on Welsh benchmarks, terminal agent tasks, and instruction following while retaining base capabilities, released openly as a template for sovereign cultural AI adaptation.

citing papers explorer

Showing 3 of 3 citing papers.

Convergent Evolution: How Different Language Models Learn Similar Number Representations · cs.CL · 2026-04-22 · unverdicted · none · ref 49
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention · cs.LG · 2026-05-07 · unverdicted · none · ref 7
Jupiter-N Technical Report · cs.CL · 2026-04-19 · unverdicted · none · ref 19
