Modular manifolds

Jeremy Bernstein · 2025 · DOI 10.64434/tml.20250926

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Improving Neural Network Training by Decoupling the Magnitude and Direction of Weight Vectors

cs.LG · 2026-06-24 · unverdicted · novelty 6.0

MD Decoupling factorizes weights into fixed-norm directions and learnable per-row/column magnitudes updated at independent rates, improving Adam and Muon training stability and scale transfer without weight decay or warmup.

citing papers explorer

Showing 1 of 1 citing paper.

Improving Neural Network Training by Decoupling the Magnitude and Direction of Weight Vectors

cs.LG · 2026-06-24 · unverdicted · none · ref 8
