Empirical evaluation of rectified activations in convolutional network

Bing Xu, Naiyan Wang, Tianqi Chen, Mu Li · 2015 · cs.LG · arXiv 1505.00853

18 Pith papers cite this work. Polarity classification is still indexing.

abstract

In this paper we investigate the performance of different types of rectified activation functions in convolutional neural network: standard rectified linear unit (ReLU), leaky rectified linear unit (Leaky ReLU), parametric rectified linear unit (PReLU) and a new randomized leaky rectified linear units (RReLU). We evaluate these activation function on standard image classification task. Our experiments suggest that incorporating a non-zero slope for negative part in rectified activation units could consistently improve the results. Thus our findings are negative on the common belief that sparsity is the key of good performance in ReLU. Moreover, on small scale dataset, using deterministic negative slope or learning it are both prone to overfitting. They are not as effective as using their randomized counterpart. By using RReLU, we achieved 75.68% accuracy on CIFAR-100 test set without multiple test or ensemble.
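The four activations named in the abstract differ only in how they treat negative inputs. The sketch below is a minimal, scalar illustration of that difference, not the authors' implementation. It assumes the common multiplicative-slope convention (negative inputs are multiplied by a slope), a uniform slope distribution for RReLU during training, and the midpoint of that range at test time. The function names, the default slope of 0.01, and the bounds 1/8 and 1/3 are illustrative choices that do not come from this page.

```python
import random


def relu(x):
    """Standard ReLU: negative inputs are mapped to zero."""
    return x if x > 0 else 0.0


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: negative inputs keep a fixed, small non-zero slope.

    The default slope is an illustrative value (assumption).
    """
    return x if x > 0 else slope * x


def prelu(x, slope):
    """PReLU: same form as Leaky ReLU, but `slope` is a learned parameter.

    Here it is simply passed in; a real model would update it by gradient descent.
    """
    return x if x > 0 else slope * x


def rrelu(x, lower=1 / 8, upper=1 / 3, training=True, rng=random):
    """RReLU: the negative slope is random in training, fixed at test time.

    Assumptions: the slope is drawn uniformly from [lower, upper] during
    training and set to the midpoint of that range at test time. The
    bounds are illustrative.
    """
    if x > 0:
        return x
    if training:
        slope = rng.uniform(lower, upper)
    else:
        slope = (lower + upper) / 2
    return slope * x


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 1.5):
        print(x, relu(x), leaky_relu(x), prelu(x, 0.25),
              rrelu(x, training=False))
```

Only `relu` produces exact zeros for negative inputs. The other three keep a non-zero slope on the negative side, which is the property the abstract reports as helpful.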

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

cs.LG · 2015-11-19 · accept · novelty 8.0

DCGANs with architectural constraints learn a hierarchy of representations from object parts to scenes in both generator and discriminator across image datasets.

Score-based Greedy Search for Structure Identification of Partially Observed Linear Causal Models

cs.LG · 2025-10-05 · unverdicted · novelty 7.0

Introduces Generalized N Factor Model and LGES algorithm that identifies true causal structure including latents up to Markov equivalence class via score-based greedy search.

Locally Near Optimal Piecewise Linear Regression in High Dimensions via Difference of Max-Affine Functions

stat.ML · 2026-05-07 · unverdicted · novelty 7.0

ABGD parametrizes piecewise linear functions as difference of max-affine functions and converges linearly to an epsilon-accurate solution with O(d max(sigma/epsilon,1)^2) samples under sub-Gaussian noise, which is minimax optimal up to logs.

Searching for Activation Functions

cs.NE · 2017-10-16 · conditional · novelty 7.0

Automated search discovers Swish activation f(x) = x * sigmoid(βx) that improves top-1 ImageNet accuracy over ReLU by 0.9% on Mobile NASNet-A and 0.6% on Inception-ResNet-v2.

On Divergence Measures for Training GFlowNets

cs.LG · 2024-10-12 · unverdicted · novelty 6.0

Introduces statistically efficient estimators for Renyi-α, Tsallis-α, reverse and forward KL divergences with REINFORCE and score-matching control variates for faster GFlowNet training.

Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments

cs.CV · 2019-06-23 · unverdicted · novelty 6.0

The MIA model with GC, RGA, and BFM modules achieves state-of-the-art performance on the CUHK-PEDES dataset for description-based person re-identification.

Materialistic RIR: Material Conditioned Realistic RIR Generation

cs.CV · 2026-04-22 · unverdicted · novelty 6.0

A two-module neural model disentangles spatial layout from material properties to generate controllable and more realistic room impulse responses, reporting gains of up to 16% on acoustic metrics and 70% on material metrics plus better human ratings.

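Functional Similarity Metric for Neural Networks: Overcoming Parametric Ambiguity via Activation Region Analysis

cs.LG · 2026-04-04 · unverdicted · novelty 6.0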

A functional similarity metric for ReLU networks uses normalized activation region signatures and MinHash to overcome parametric symmetries like neuron permutation and scaling.

Activation Function Design Sustains Plasticity in Continual Learning

cs.LG · 2025-09-26 · unverdicted · novelty 5.0

Smooth-Leaky and Randomized Smooth-Leaky activations mitigate loss of plasticity in continual learning by targeting negative-branch shape and saturation behavior.

Gamma-Ray Burst Light Curve Reconstruction: A Comparative Machine and Deep Learning Analysis

astro-ph.HE · 2024-12-28 · unverdicted · novelty 5.0

MLP and Attention U-Net outperform other models in reconstructing GRB light curves on 521 events, cutting plateau parameter uncertainties by 37-41% versus the Willingale baseline while achieving low MSE.

High-throughput Onboard Hyperspectral Image Compression with Ground-based CNN Reconstruction

eess.IV · 2019-07-05 · unverdicted · novelty 5.0

Prequantization-based lossless predictive compression onboard hyperspectral images with CNN ground reconstruction recovers the entire SNR drop at 2 bpp.

Sparsity Hurts: Simple Linear Adapter Can Boost Generalized Category Discovery

cs.CV · 2026-05-05 · unverdicted · novelty 5.0

LAGCD inserts residual linear adapters into each ViT block plus a distribution alignment loss to improve generalized category discovery by increasing model flexibility while reducing bias between seen and novel classes.

Adaptive Reorganization of Neural Pathways for Continual Learning with Spiking Neural Networks

cs.NE · 2023-09-18 · unverdicted · novelty 4.0

SOR-SNN employs Self-Organizing Regulation networks to reorganize a single SNN into sparse pathways, achieving better performance, energy efficiency, memory use, backward transfer, and self-repair on continual learning tasks including CIFAR100 and ImageNet.

Discriminative Embedding Autoencoder with a Regressor Feedback for Zero-Shot Learning

cs.CV · 2019-07-18 · unverdicted · novelty 4.0

A new autoencoder model with margin-based discriminative embeddings and regressor feedback outperforms prior zero-shot learning methods on SUN, CUB, AWA1 and AWA2, with larger gains in generalized ZSL.

Two-stream Spatiotemporal Feature for Video QA Task

cs.CV · 2019-07-11 · unverdicted · novelty 4.0

A two-stream spatiotemporal feature extractor with squeeze-and-excitation and attention-based context matching improves text-only video QA on TVQA but shows limitations with visual features.

On Reducing Negative Jacobian Determinant of the Deformation Predicted by Deep Registration Networks

cs.CV · 2019-06-28 · unverdicted · novelty 4.0

Two training mechanisms for unsupervised deep registration networks reduce the number of locations with negative Jacobian determinants in predicted deformations.

Modern CNNs for IoT Based Farms

cs.CY · 2019-07-15 · unverdicted · novelty 2.0

A survey of state-of-the-art CNN architectures for agricultural IoT applications that proposes a tailored classification taxonomy and reviews existing research to guide architecture selection.

Deep learning in ultrasound imaging

eess.SP · 2019-07-05 · unverdicted · novelty 2.0

A review outlining deep learning strategies for adaptive beamforming, spectral Doppler, compressive color Doppler encodings, and structured signal recovery in ultrasound.

citing papers explorer

Showing 18 of 18 citing papers.

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks cs.LG · 2015-11-19 · accept · none · ref 20
DCGANs with architectural constraints learn a hierarchy of representations from object parts to scenes in both generator and discriminator across image datasets.
Score-based Greedy Search for Structure Identification of Partially Observed Linear Causal Models cs.LG · 2025-10-05 · unverdicted · none · ref 13 · internal anchor
Introduces Generalized N Factor Model and LGES algorithm that identifies true causal structure including latents up to Markov equivalence class via score-based greedy search.
Locally Near Optimal Piecewise Linear Regression in High Dimensions via Difference of Max-Affine Functions stat.ML · 2026-05-07 · unverdicted · none · ref 184
ABGD parametrizes piecewise linear functions as difference of max-affine functions and converges linearly to an epsilon-accurate solution with O(d max(sigma/epsilon,1)^2) samples under sub-Gaussian noise, which is minimax optimal up to logs.
Searching for Activation Functions cs.NE · 2017-10-16 · conditional · none · ref 19
Automated search discovers Swish activation f(x) = x * sigmoid(βx) that improves top-1 ImageNet accuracy over ReLU by 0.9% on Mobile NASNet-A and 0.6% on Inception-ResNet-v2.
On Divergence Measures for Training GFlowNets cs.LG · 2024-10-12 · unverdicted · none · ref 98 · internal anchor
Introduces statistically efficient estimators for Renyi-α, Tsallis-α, reverse and forward KL divergences with REINFORCE and score-matching control variates for faster GFlowNet training.
Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments cs.CV · 2019-06-23 · unverdicted · none · ref 40 · internal anchor
The MIA model with GC, RGA, and BFM modules achieves state-of-the-art performance on the CUHK-PEDES dataset for description-based person re-identification.
Materialistic RIR: Material Conditioned Realistic RIR Generation cs.CV · 2026-04-22 · unverdicted · none · ref 79
A two-module neural model disentangles spatial layout from material properties to generate controllable and more realistic room impulse responses, reporting gains of up to 16% on acoustic metrics and 70% on material metrics plus better human ratings.
Functional Similarity Metric for Neural Networks: Overcoming Parametric Ambiguity via Activation Region Analysis cs.LG · 2026-04-04 · unverdicted · none · ref 38
A functional similarity metric for ReLU networks uses normalized activation region signatures and MinHash to overcome parametric symmetries like neuron permutation and scaling.
Activation Function Design Sustains Plasticity in Continual Learning cs.LG · 2025-09-26 · unverdicted · none · ref 26 · internal anchor
Smooth-Leaky and Randomized Smooth-Leaky activations mitigate loss of plasticity in continual learning by targeting negative-branch shape and saturation behavior.
Gamma-Ray Burst Light Curve Reconstruction: A Comparative Machine and Deep Learning Analysis astro-ph.HE · 2024-12-28 · unverdicted · none · ref 101 · internal anchor
MLP and Attention U-Net outperform other models in reconstructing GRB light curves on 521 events, cutting plateau parameter uncertainties by 37-41% versus the Willingale baseline while achieving low MSE.
High-throughput Onboard Hyperspectral Image Compression with Ground-based CNN Reconstruction eess.IV · 2019-07-05 · unverdicted · none · ref 36 · internal anchor
Prequantization-based lossless predictive compression onboard hyperspectral images with CNN ground reconstruction recovers the entire SNR drop at 2 bpp.
Sparsity Hurts: Simple Linear Adapter Can Boost Generalized Category Discovery cs.CV · 2026-05-05 · unverdicted · none · ref 36
LAGCD inserts residual linear adapters into each ViT block plus a distribution alignment loss to improve generalized category discovery by increasing model flexibility while reducing bias between seen and novel classes.
Adaptive Reorganization of Neural Pathways for Continual Learning with Spiking Neural Networks cs.NE · 2023-09-18 · unverdicted · none · ref 67 · internal anchor
SOR-SNN employs Self-Organizing Regulation networks to reorganize a single SNN into sparse pathways, achieving better performance, energy efficiency, memory use, backward transfer, and self-repair on continual learning tasks including CIFAR100 and ImageNet.
Discriminative Embedding Autoencoder with a Regressor Feedback for Zero-Shot Learning cs.CV · 2019-07-18 · unverdicted · none · ref 30 · internal anchor
A new autoencoder model with margin-based discriminative embeddings and regressor feedback outperforms prior zero-shot learning methods on SUN, CUB, AWA1 and AWA2, with larger gains in generalized ZSL.
Two-stream Spatiotemporal Feature for Video QA Task cs.CV · 2019-07-11 · unverdicted · none · ref 30 · internal anchor
A two-stream spatiotemporal feature extractor with squeeze-and-excitation and attention-based context matching improves text-only video QA on TVQA but shows limitations with visual features.
On Reducing Negative Jacobian Determinant of the Deformation Predicted by Deep Registration Networks cs.CV · 2019-06-28 · unverdicted · none · ref 13 · internal anchor
Two training mechanisms for unsupervised deep registration networks reduce the number of locations with negative Jacobian determinants in predicted deformations.
Modern CNNs for IoT Based Farms cs.CY · 2019-07-15 · unverdicted · none · ref 1 · internal anchor
A survey of state-of-the-art CNN architectures for agricultural IoT applications that proposes a tailored classification taxonomy and reviews existing research to guide architecture selection.
Deep learning in ultrasound imaging eess.SP · 2019-07-05 · unverdicted · none · ref 100 · internal anchor
A review outlining deep learning strategies for adaptive beamforming, spectral Doppler, compressive color Doppler encodings, and structured signal recovery in ultrasound.
