pith. sign in

Tensor programs ivb: Adaptive optimization in the infinite-width limit.CoRR, abs/2308.01814

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

years

2026 2

verdicts

UNVERDICTED 2

representative citing papers

Spectral Condition for $\mu$P under Width-Depth Scaling

cs.LG · 2026-02-28 · unverdicted · novelty 6.0

A unified spectral condition for μP under width-depth scaling reveals a transition at k=1 vs k≥2 transformations per residual block and enables stable feature learning for practical architectures like Transformers.

There Will Be a Scientific Theory of Deep Learning

stat.ML · 2026-04-23 · unverdicted · novelty 2.0

A mechanics of the learning process is emerging in deep learning theory, characterized by dynamics, coarse statistics, and falsifiable predictions across idealized settings, limits, laws, hyperparameters, and universal behaviors.

citing papers explorer

Showing 2 of 2 citing papers.

  • Spectral Condition for $\mu$P under Width-Depth Scaling cs.LG · 2026-02-28 · unverdicted · none · ref 44

    A unified spectral condition for μP under width-depth scaling reveals a transition at k=1 vs k≥2 transformations per residual block and enables stable feature learning for practical architectures like Transformers.

  • There Will Be a Scientific Theory of Deep Learning stat.ML · 2026-04-23 · unverdicted · none · ref 92

    A mechanics of the learning process is emerging in deep learning theory, characterized by dynamics, coarse statistics, and falsifiable predictions across idealized settings, limits, laws, hyperparameters, and universal behaviors.