Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Abstract
Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on our PReLU networks (PReLU-nets), we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66%). To our knowledge, our result is the first to surpass human-level performance (5.1%, Russakovsky et al.) on this visual recognition challenge.
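To make the abstract's two contributions concrete, here is a minimal NumPy sketch (not the paper's code; the names prelu and he_init are illustrative): PReLU keeps a learnable negative-side slope a, which the paper initializes at 0.25, and the rectifier-aware initialization draws weights from a zero-mean Gaussian with standard deviation sqrt(2 / fan_in) so that activation variance is preserved through rectified layers.

```python
import numpy as np

def prelu(x, a):
    # PReLU: f(y) = y for y > 0 and a * y otherwise, where the slope a is
    # learned (one per channel in the paper) rather than fixed as in Leaky ReLU.
    return np.where(x > 0, x, a * x)

def he_init(fan_in, fan_out, rng=None):
    # Rectifier-aware ("He") initialization: zero-mean Gaussian with
    # std = sqrt(2 / fan_in), chosen so signal variance is preserved through
    # layers followed by rectifier nonlinearities.
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

# Hypothetical usage: one fully connected layer followed by PReLU.
x = np.random.randn(8, 256)      # a batch of 8 input vectors
W = he_init(256, 512)
a = np.full(512, 0.25)           # the paper initializes all slopes at 0.25
h = prelu(x @ W, a)              # shape (8, 512)
```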
This paper has not been read by Pith yet.
Forward citations
Cited by 12 Pith papers
- U-Net: Convolutional Networks for Biomedical Image Segmentation
  A u-shaped fully-convolutional encoder-decoder with skip connections trained with elastic-deformation augmentation produces accurate biomedical image segmentations from very small training sets.
- Determining star formation histories and age-metallicity relations with convolutional neural networks
  A CNN with attention and shared latent space recovers SFHs and metallicities from spectro-photometric data with ~0.12 dex age and ~0.03 dex metallicity dispersion while running thousands of times faster than full spec...
- Criticality and Saturation in Orthogonal Neural Networks
  Derives layer-wise recursions for finite-width tensors under orthogonal initialization that reproduce the observed large-depth stability of nonlinear networks.
- A Theory of Saddle Escape in Deep Nonlinear Networks
  Derives an exact norm-imbalance identity for deep nonlinear nets, classifying activations into four classes and yielding the escape-time law τ★ = Θ(ε^{-(r-2)}) governed by bottleneck depth r.
- A Theory of Saddle Escape in Deep Nonlinear Networks
  An exact norm-imbalance identity classifies activations into four classes and reduces the deep nonlinear training flow to a scalar ODE that predicts saddle escape time scaling as ε^{-(r-2)} for r bottle...
- Isotropic Activation Functions Enable Deindividuated Neurons and Adaptive Topologies
  Isotropic activation functions derived from reparameterisation symmetries and SVD diagonalisation enable function-preserving neuron removal and addition in dense networks, supporting up to 50% sparsification and real-...
- Progressive Growing of GANs for Improved Quality, Stability, and Variation
  Progressive growing stabilizes GAN training to produce high-resolution images of unprecedented quality and achieves a record unsupervised inception score of 8.80 on CIFAR10.
- Wide Residual Networks
  Wide residual networks achieve higher accuracy and faster training than very deep thin residual networks by increasing width and decreasing depth, setting new state-of-the-art results on CIFAR, SVHN, and ImageNet.
- LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
  The LSUN dataset of one million images per category across 30 classes is constructed via iterative human-in-the-loop deep learning labeling.
- Similarity-Based Bike Station Expansion via Hybrid Denoising Autoencoders
  A hybrid denoising autoencoder with a supervised head learns latent urban features to select bike-station expansion candidates via latent-space similarity, producing 32 consensus high-confidence zones in Trondheim.
- MONAI: An open-source framework for deep learning in healthcare
  MONAI is a community-supported PyTorch framework that extends deep learning to medical data with domain-specific architectures, transforms, and deployment tools.
- Toeplitz MLP Mixers are Low Complexity, Information-Rich Sequence Models
  Toeplitz MLP Mixers replace attention with masked Toeplitz multiplications for sub-quadratic complexity while retaining more sequence information and outperforming on copying and in-context tasks.