Implicit Regularization in Deep Learning

Behnam Neyshabur

arxiv: 1709.01953 · v2 · pith:PKDOZCVJnew · submitted 2017-09-06 · cs.LG

classification: cs.LG

keywords: learning, deep, measures, complexity, generalization, optimization, algorithms, different

In an attempt to better understand generalization in deep learning, we study several possible explanations. We show that implicit regularization induced by the optimization method is playing a key role in generalization and success of deep learning models. Motivated by this view, we study how different complexity measures can ensure generalization and explain how optimization algorithms can implicitly regularize complexity measures. We empirically investigate the ability of these measures to explain different observed phenomena in deep learning. We further study the invariances in neural networks, suggest complexity measures and optimization algorithms that have similar invariances to those in neural networks and evaluate them on a number of learning tasks.
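
As a rough illustration of what "implicit regularization induced by the optimization method" can mean, the sketch below uses a standard textbook case rather than anything taken from the paper: plain gradient descent on an underdetermined least-squares problem, started from zero, converges to the minimum L2-norm solution even though no norm penalty appears in the objective. The function name `gradient_descent_least_squares`, the toy data, and the step size are all assumptions chosen for this example.

```python
def gradient_descent_least_squares(X, y, lr=0.05, steps=2000):
    """Plain gradient descent on 0.5 * ||Xw - y||^2, starting from w = 0.

    Hypothetical helper for illustration only; not the paper's method.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(steps):
        # Residuals r_i = <x_i, w> - y_i for every training example.
        residuals = [
            sum(xj * wj for xj, wj in zip(row, w)) - yi
            for row, yi in zip(X, y)
        ]
        # Gradient of the objective is X^T r.
        grad = [
            sum(row[j] * r for row, r in zip(X, residuals))
            for j in range(n_features)
        ]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def l2_norm(w):
    return sum(wj * wj for wj in w) ** 0.5


# One equation, two unknowns: w1 + 2*w2 = 5 has infinitely many solutions.
X = [[1.0, 2.0]]
y = [5.0]

w_gd = gradient_descent_least_squares(X, y)

# Minimum-norm interpolating solution, X^T (X X^T)^{-1} y, worked by hand.
w_min_norm = [1.0, 2.0]

# Another exact solution with a larger norm, for comparison.
w_other = [5.0, 0.0]

print("gradient descent:", w_gd, "norm:", l2_norm(w_gd))
print("other solution:  ", w_other, "norm:", l2_norm(w_other))
```

Both `w_gd` and `w_other` fit the single training point exactly; the optimizer's starting point and update rule, not the loss, select the smaller-norm one. This linear case is only an analogy for the deep-network setting the abstract discusses.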

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Stationary Robust Mean-Field Games under Model Mismatches
cs.LG 2026-06 unverdicted novelty 6.0

Develops infinite-horizon stationary robust mean-field games incorporating distributional uncertainty, proves equilibrium existence via fixed-point on contractive Bellman operator, gives convergent algorithm, and deri...

Reducing Class Bias In Data-Balanced Datasets Through Hardness-Based Resampling
cs.LG 2025-04 unverdicted novelty 6.0

Hardness-Based Resampling reduces class recall gaps in balanced datasets by up to 32% on CIFAR-10 and 16% on CIFAR-100 by prioritizing harder samples over random or frequency-based selection.

Neural Architectures as Functional Priors in Physics-Informed Control Problems
math.NA 2026-06 unverdicted novelty 5.0

Different neural architectures produce qualitatively distinct controls in PINN optimal control for RLC and Duffing systems, with Fourier versions yielding richer oscillations and smoother nets yielding more regular ef...