pith. machine review for the scientific record.

arXiv: 1707.08819 · v3 · submitted 2017-07-27 · cs.CV · cs.LG

Recognition: unknown

A Downsampled Variant of ImageNet as an Alternative to the CIFAR datasets

Authors on Pith: no claims yet
classification: cs.CV, cs.LG
keywords: downsampled, ImageNet, datasets, original, variants, CIFAR
Original abstract

The original ImageNet dataset is a popular large-scale benchmark for training Deep Neural Networks. Since the cost of performing experiments (e.g., algorithm design, architecture search, and hyperparameter tuning) on the original dataset might be prohibitive, we propose to consider a downsampled version of ImageNet. In contrast to the CIFAR datasets and earlier downsampled versions of ImageNet, our proposed ImageNet32$\times$32 (and its variants ImageNet64$\times$64 and ImageNet16$\times$16) contains exactly the same number of classes and images as ImageNet, with the only difference that the images are downsampled to 32$\times$32 pixels per image (64$\times$64 and 16$\times$16 pixels for the variants, respectively). Experiments on these downsampled variants are dramatically faster than on the original ImageNet and the characteristics of the downsampled datasets with respect to optimal hyperparameters appear to remain similar. The proposed datasets and scripts to reproduce our results are available at http://image-net.org/download-images and https://github.com/PatrykChrabaszcz/Imagenet32_Scripts
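For readers who want to reproduce the basic preprocessing step described in the abstract, the sketch below shows one way to shrink a full-resolution ImageNet image to the three reported resolutions. It assumes Python with Pillow and uses box resampling purely for illustration; the paper compares several downsampling filters, and the authors' actual pipeline is the Imagenet32_Scripts repository linked above, so treat this as an approximation rather than the reference implementation. The input file name is hypothetical.

    # Minimal sketch: downsample one full-resolution image to 64x64, 32x32, and 16x16.
    # Assumptions: Pillow is installed, and its BOX filter stands in for whichever
    # filter the authors' Imagenet32_Scripts pipeline actually uses.
    from PIL import Image

    def downsample(path, size):
        # Return a size-by-size RGB copy of the image stored at `path`.
        img = Image.open(path).convert("RGB")
        return img.resize((size, size), resample=Image.BOX)

    if __name__ == "__main__":
        for side in (64, 32, 16):  # the ImageNet64x64 / 32x32 / 16x16 variants
            small = downsample("example_image.JPEG", side)  # hypothetical input file
            small.save(f"example_{side}x{side}.png")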

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 11 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Building Normalizing Flows with Stochastic Interpolants

    cs.LG · 2022-09 · conditional · novelty 8.0

    Normalizing flows are constructed by learning the velocity of a stochastic interpolant via a quadratic loss derived from its probability current, yielding an efficient ODE-based alternative to diffusion models.

  2. Coreset-Induced Conditional Velocity Flow Matching

    stat.ML · 2026-05 · unverdicted · novelty 7.0

    CCVFM replaces the inner noise source in hierarchical rectified flow matching with a data-informed Gaussian mixture surrogate from a Sinkhorn coreset, yielding a closed-form conditional velocity law and competitive fe...

  3. Zero-Shot Neural Network Evaluation with Sample-Wise Activation Patterns

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    SWAP-Score evaluates neural networks without training by quantifying sample-wise activation patterns, achieving high correlation with true performance on CIFAR-10 for CNNs and GLUE for Transformers while enabling fast NAS.

  4. Scaling Laws for Autoregressive Generative Modeling

    cs.LG · 2020-10 · accept · novelty 7.0

    Autoregressive transformers follow power-law scaling laws for cross-entropy loss with nearly universal exponents relating optimal model size to compute budget across four domains.

  5. LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search

    cs.LG · 2026-04 · unverdicted · novelty 6.0

    LLMasTool improves neural architecture search by evolving code-mined hierarchical trees with diversity-guided Bayesian planning and targeted LLM assistance, reporting gains of 0.69, 1.83, and 2.68 points on CIFAR-10, ...

  6. SubFLOT: Submodel Extraction for Efficient and Personalized Federated Learning via Optimal Transport

    cs.LG · 2026-04 · unverdicted · novelty 6.0

    SubFLOT uses optimal transport to generate data-aware personalized submodels via server-side pruning and scaling-based adaptive regularization to mitigate parametric divergence in heterogeneous federated learning.

  7. Language Models (Mostly) Know What They Know

    cs.CL · 2022-07 · unverdicted · novelty 6.0

    Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

  8. A General Language Assistant as a Laboratory for Alignment

    cs.CL · 2021-12 · conditional · novelty 6.0

    Ranked preference modeling outperforms imitation learning for language model alignment and scales more favorably with model size.

  9. Taming the Long Tail: Rebalancing Adversarial Training via Adaptive Perturbation

    cs.LG · 2026-05 · unverdicted · novelty 5.0

    RobustLT adaptively adjusts perturbations in adversarial training to simultaneously improve robustness and class balance on long-tailed datasets.

  10. Deterministic Decomposition of Stochastic Generative Dynamics

    cs.LG · 2026-05 · unverdicted · novelty 5.0

    Stochastic generative dynamics admit a transport-osmotic decomposition of the deterministic field, supporting Bridge Matching for interpretable and tunable generation.

  11. Elucidating the SNR-t Bias of Diffusion Probabilistic Models

    cs.CV · 2026-04 · unverdicted · novelty 4.0

    Diffusion models exhibit an SNR-timestep mismatch during inference, which the authors mitigate with per-frequency differential correction, raising generation quality across IDDPM, ADM, DDIM, and others.