pith. machine review for the scientific record.

arxiv: 1902.10811 · v2 · submitted 2019-02-13 · 💻 cs.CV · cs.LG · stat.ML

Recognition: unknown

Do ImageNet Classifiers Generalize to ImageNet?

Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, Vaishaal Shankar

Authors on Pith: no claims yet
classification 💻 cs.CV · cs.LG · stat.ML
keywords: test sets · imagenet · accuracy · generalize · models · original · cifar-10
original abstract

We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been the focus of intense research for almost a decade, raising the danger of overfitting to excessively re-used test sets. By closely following the original dataset creation processes, we test to what extent current classification models generalize to new data. We evaluate a broad range of models and find accuracy drops of 3% - 15% on CIFAR-10 and 11% - 14% on ImageNet. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.
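A minimal sketch of the evaluation protocol the abstract describes: score one pretrained classifier on the original test set and on a reproduced one, then compare top-1 accuracies. This is not the authors' code; the dataset paths imagenet/val and imagenetv2 are hypothetical placeholders assuming ImageFolder-style directory layouts with identical class subfolders, and a torchvision ResNet-50 stands in for the paper's broad range of models.

```python
# Illustrative sketch only; dataset paths are placeholders, not the paper's code.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing for torchvision classifiers.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def top1_accuracy(model, root):
    """Top-1 accuracy of `model` on an ImageFolder-style dataset at `root`."""
    loader = DataLoader(datasets.ImageFolder(root, preprocess),
                        batch_size=64, num_workers=4)
    correct = total = 0
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()
orig = top1_accuracy(model, "imagenet/val")  # original test set (placeholder path)
new = top1_accuracy(model, "imagenetv2")     # reproduced test set (placeholder path)
print(f"original {orig:.3f}  new {new:.3f}  drop {orig - new:.3f}")
```

Repeating this comparison over many models and plotting new-test accuracy against original-test accuracy gives the relationship the abstract points to: gains on the original test set translate to larger gains on the new one.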

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. LAION-5B: An open large-scale dataset for training next generation image-text models

    cs.CV · 2022-10 · accept · novelty 7.0

    LAION-5B is an openly released dataset of 5.85 billion CLIP-filtered image-text pairs that enables replication of foundational vision-language models.

  2. Medical Model Synthesis Architectures: A Case Study

    cs.AI · 2026-05 · unverdicted · novelty 5.0

    The MedMSA framework retrieves knowledge via language models, then builds formal probabilistic models to produce uncertainty-weighted differential diagnoses from symptoms.

  3. Debunking Grad-ECLIP: A Comprehensive Study on Its Incorrectness and Fundamental Principles for Model Interpretation

    cs.CV · 2026-05 · unverdicted · novelty 4.0

    Grad-ECLIP is argued to be an equivalent but flawed variant of attention-based interpretation, and two principles are proposed to ensure model explanations reflect the original model.