The smallest singular value of a random rectangular matrix
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
We prove an optimal estimate on the smallest singular value of a random subgaussian matrix, valid for all fixed dimensions. For an $N \times n$ matrix $A$ with independent and identically distributed subgaussian entries, the smallest singular value of $A$ is at least of the order $\sqrt{N} - \sqrt{n-1}$ with high probability. A sharp estimate on the probability is also obtained.
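As a rough numerical illustration of the scale named in the abstract, the sketch below samples tall matrices and compares their smallest singular value with $\sqrt{N} - \sqrt{n-1}$. It is not taken from the paper: standard Gaussian entries are assumed as one convenient subgaussian example, and the function names, matrix sizes, trial count, and seed are all hypothetical choices made for this example.

```python
import numpy as np


def smallest_singular_value(N, n, rng):
    """Smallest singular value of an N x n matrix with i.i.d. N(0, 1) entries.

    Gaussian entries are an assumption of this sketch; the abstract covers
    general i.i.d. subgaussian entries.
    """
    A = rng.standard_normal((N, n))
    # Singular values are returned in descending order, so the last is smallest.
    return np.linalg.svd(A, compute_uv=False)[-1]


def reference_scale(N, n):
    """The quantity sqrt(N) - sqrt(n - 1) from the abstract."""
    return np.sqrt(N) - np.sqrt(n - 1)


rng = np.random.default_rng(0)  # illustrative seed
for N, n in [(400, 100), (400, 300)]:  # illustrative sizes with N > n
    samples = [smallest_singular_value(N, n, rng) for _ in range(10)]
    ref = reference_scale(N, n)
    print(f"N={N}, n={n}: mean s_min={np.mean(samples):.3f}, "
          f"sqrt(N)-sqrt(n-1)={ref:.3f}")
```

The printed means can be compared against the reference scale by eye. The abstract's statement is a lower bound up to constants holding with high probability, so the sketch only suggests the order of magnitude and does not verify the theorem.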
fields: cs.LG (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Sharp Capacity Scaling of Spectral Optimizers in Learning Associative Memory
Muon achieves higher storage capacity than SGD and matches Newton's method in one-step recovery rates for associative memory under power-law distributions, while saturating at larger critical batch sizes and showing faster initial multi-step dynamics.