On power-law covariance least squares problems, SignSVD (Muon) and SignSGD (Adam proxy) show three phases of relative performance depending on data exponent α and target exponent β.
High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Dynamic preconditioning preserves the Polyak-Ruppert CLT for averaged SGD if the preconditioner stabilizes at rate β > (α + 1)/2.
citing papers explorer
-
Phases of Muon: When Muon Eclipses SignSGD
On power-law covariance least squares problems, SignSVD (Muon) and SignSGD (Adam proxy) show three phases of relative performance depending on data exponent α and target exponent β.
-
When Does Dynamic Preconditioning Preserve the Polyak-Ruppert CLT? A Stabilization Threshold
Dynamic preconditioning preserves the Polyak-Ruppert CLT for averaged SGD if the preconditioner stabilizes at rate β > (α + 1)/2.