The Thirty Sixth Annual Conference on Learning Theory

SGD learning on neural networks: leap complexity, saddle-to-saddle dynamics · 2023

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model

stat.ML · 2026-05-14 · accept · novelty 7.0

A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.

Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts

cs.AI · 2026-05-01 · unverdicted · novelty 7.0

Llama-3.1-8B computes sums for cyclic concepts using base-10 addition via task-agnostic Fourier features with periods 2, 5, and 10 rather than modular arithmetic in the concept period.

citing papers explorer

Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model

stat.ML · 2026-05-14 · accept · none · ref 150

Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts

cs.AI · 2026-05-01 · unverdicted · none · ref 4
