Alex Damian, Eshaan Nichani, and Jason D Lee

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 4

citation-polarity summary

polarities

background 3 · unclear 1

representative citing papers

Phases of Muon: When Muon Eclipses SignSGD

math.OC · 2026-05-10 · unverdicted · novelty 7.0

On power-law covariance least squares problems, SignSVD (Muon) and SignSGD (Adam proxy) show three phases of relative performance depending on data exponent α and target exponent β.

A Rod Flow Model for Adam at the Edge of Stability

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

Rod flow models for Adam and related optimizers track discrete iterates at the edge of stability more accurately than standard stable flows across tested ML architectures.

Prototype Language Models

cs.LG · 2026-07-01 · unverdicted · novelty 6.0

PRISM forms predictions as sparse mixtures of learned prototypes trained with clustering objectives, matching dense model accuracy while enabling ~500x faster data attribution and behavior editing without finetuning.

Does Weight Decay Enhance Training Stability?

cs.LG · 2026-05-15 · conditional · novelty 6.0

Weight decay slows progressive sharpening at the edge of stability, inducing damped oscillations in CNNs and a phase transition to sub-2/η sharpness in MLPs driven by parameter-sharpness gradient alignment, yielding more stable NTK dynamics.

Muon Learns More Robust and Transferable Features than Adam

cs.LG · 2026-06-08 · unverdicted · novelty 5.0

Muon learns more robust and transferable features than Adam and SGD, shown via corruption robustness tests, transfer experiments, layer-wise probes, effective rank measurements, and a theoretical proof on margins in a multi-component classification problem.
