To understand deep learning we need to understand kernel learning
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Implicit Regularization and Generalization in Overparameterized Neural Networks
Experiments indicate that small-batch SGD promotes flatter loss minima and better generalization in overparameterized networks, and that sparse subnetworks can retain nearly full accuracy.