Grokked transformers are implicit reasoners: A mechanistic journey to the edge of generalization
5 Pith papers cite this work.
- Tracing Persona Vectors Through LLM Pretraining
  Persona vectors form within the first 0.22% of LLM pretraining, continue to be refined as training proceeds, remain effective for steering post-trained models, and transfer to other models.
- The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior
  The grokking delay in encoder-decoder models on one-step Collatz prediction stems from the decoder's inability to exploit the parity and residue structure that the encoder learns early; the numeral base acts as a strong inductive bias that can raise accuracy from failure to 99.8%.
- Detecting overfitting in Neural Networks during long-horizon grokking using Random Matrix Theory
  A Random Matrix Theory method detects growing "correlation traps" in neural-network weight spectra during an "anti-grokking" overfitting phase; the same diagnostic is also applied to some foundation LLMs.
- The Power of Power Law: Asymmetry Enables Compositional Reasoning
  Power-law data sampling creates a beneficial asymmetry in the loss landscape: models acquire high-frequency skill compositions first, which makes learning rare long-tail skills more efficient than under uniform sampling.
- Model Capacity Determines Grokking through Competing Memorisation and Generalisation Speeds
  Grokking emerges near the model size P at which the memorization timescale T_mem(P) intersects the generalization timescale T_gen(P) on modular arithmetic tasks.
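The capacity-crossover claim in the last entry can be illustrated with a minimal sketch: assume both timescales follow power laws in parameter count P and solve for their intersection. The functional forms, exponents, and constants below are assumptions chosen for illustration, not the cited paper's measured fits.

```python
# Illustrative sketch: grokking is expected near the model size P* where a
# memorization timescale T_mem(P) crosses a generalization timescale T_gen(P).
# All functional forms and constants here are assumed, not taken from the paper.

def t_mem(p: float, a: float = 1e6, alpha: float = 1.0) -> float:
    """Assumed memorization timescale: larger models memorize faster."""
    return a / p**alpha

def t_gen(p: float, b: float = 1e3, beta: float = 0.5) -> float:
    """Assumed generalization timescale: decays more slowly with model size."""
    return b / p**beta

def crossover(a: float = 1e6, alpha: float = 1.0,
              b: float = 1e3, beta: float = 0.5) -> float:
    """Solve t_mem(P) = t_gen(P) in closed form: P* = (a/b)**(1/(alpha-beta))."""
    return (a / b) ** (1.0 / (alpha - beta))

p_star = crossover()
print(p_star)                         # P* = (1e6/1e3)**2 = 1e6 under these assumptions
print(t_mem(p_star), t_gen(p_star))   # the two timescales are equal at P*
```

Below P* memorization wins the race (the model fits the training set long before it generalizes), while above P* the gap closes, which is the qualitative picture the summary describes.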