Reconciling modern machine-learning practice and the classical bias–variance trade-off

2019 · arXiv 1812.11118

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

cs.LG · 2022-01-06 · unverdicted · novelty 8.0

Neural networks exhibit grokking on small algorithmic datasets, achieving perfect generalization well after overfitting.

Language Models (Mostly) Know What They Know

cs.CL · 2022-07-11 · unverdicted · novelty 6.0

Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

A General Language Assistant as a Laboratory for Alignment

cs.CL · 2021-12-01 · conditional · novelty 6.0

Ranked preference modeling outperforms imitation learning for language model alignment and scales more favorably with model size.

Lecture Notes on Statistical Physics and Neural Networks

cond-mat.dis-nn · 2026-05-07 · unverdicted · novelty 2.0

Lecture notes that treat statistical physics as probability theory and connect Ising models, spin glasses, and renormalization group ideas to Hopfield networks, restricted Boltzmann machines, and large language models.

Scaling Laws for Neural Language Models

cs.LG · 2020-01-23

citing papers explorer

Showing 5 of 5 citing papers.

Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

cs.LG · 2022-01-06 · unverdicted · none · ref 1

Neural networks exhibit grokking on small algorithmic datasets, achieving perfect generalization well after overfitting.

Language Models (Mostly) Know What They Know

cs.CL · 2022-07-11 · unverdicted · none · ref 161

Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

A General Language Assistant as a Laboratory for Alignment

cs.CL · 2021-12-01 · conditional · none · ref 103

Ranked preference modeling outperforms imitation learning for language model alignment and scales more favorably with model size.

Lecture Notes on Statistical Physics and Neural Networks

cond-mat.dis-nn · 2026-05-07 · unverdicted · none · ref 48

Lecture notes that treat statistical physics as probability theory and connect Ising models, spin glasses, and renormalization group ideas to Hopfield networks, restricted Boltzmann machines, and large language models.

Scaling Laws for Neural Language Models

cs.LG · 2020-01-23 · unreviewed · ref 2
