pith · machine review for the scientific record

arXiv: 1412.6830 · v3 · submitted 2014-12-21 · 💻 cs.NE · cs.CV · cs.LG · stat.ML

Recognition: unknown

Learning Activation Functions to Improve Deep Neural Networks

Authors on Pith: no claims yet
classification 💻 cs.NE · cs.CV · cs.LG · stat.ML
keywords activation · function · neural · deep · improve · linear · networks · neuron
Original abstract

Artificial neural networks typically have a fixed, non-linear activation function at each neuron. We have designed a novel form of piecewise linear activation function that is learned independently for each neuron using gradient descent. With this adaptive activation function, we are able to improve upon deep neural network architectures composed of static rectified linear units, achieving state-of-the-art performance on CIFAR-10 (7.51%), CIFAR-100 (30.83%), and a benchmark from high-energy physics involving Higgs boson decay modes.
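
The adaptive activation the abstract describes is a piecewise linear unit of the form h(x) = max(0, x) + Σ_s a_s · max(0, −x + b_s), with the slopes a_s and hinge positions b_s learned per neuron by gradient descent. Below is a minimal PyTorch sketch of that idea; the class name, the default of two hinges, and the zero initialization are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class AdaptivePiecewiseLinear(nn.Module):
    """Per-neuron learnable piecewise linear activation (illustrative sketch).

    h(x) = max(0, x) + sum_s a_s * max(0, -x + b_s), with a_s and b_s
    trained by gradient descent alongside the network weights.
    """

    def __init__(self, num_neurons: int, num_hinges: int = 2):
        super().__init__()
        # One (a_s, b_s) pair per hinge and per neuron.
        self.a = nn.Parameter(torch.zeros(num_hinges, num_neurons))
        self.b = nn.Parameter(torch.zeros(num_hinges, num_neurons))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, num_neurons); start from the fixed ReLU part.
        out = torch.clamp(x, min=0)
        for a_s, b_s in zip(self.a, self.b):
            out = out + a_s * torch.clamp(-x + b_s, min=0)
        return out

# Illustrative usage: swap a static ReLU for the learned activation.
mlp = nn.Sequential(nn.Linear(784, 256), AdaptivePiecewiseLinear(256), nn.Linear(256, 10))
```

With the a_s initialized at zero, each unit starts out as a plain ReLU, so training begins from the static rectified linear baseline mentioned in the abstract.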

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Searching for Activation Functions

    cs.NE 2017-10 conditional novelty 7.0

    Automated search discovers the Swish activation f(x) = x · sigmoid(βx), which improves top-1 ImageNet accuracy over ReLU by 0.9% on Mobile NASNet-A and 0.6% on Inception-ResNet-v2 (a short sketch follows after this list).

  2. Competing nonlinearities, criticality, and order-to-chaos transition in deep networks

    cond-mat.dis-nn 2026-05 unverdicted novelty 6.0

    A statistical mixture of Tanh and Swish activations with critical mixing fraction p_c induces a continuous phase transition to scale-invariant signal propagation in deep networks while preserving smoothness.
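
For reference, the Swish function quoted in citation 1 fits in a couple of lines. This is a minimal sketch assuming the form f(x) = x · sigmoid(βx) given above; the function name `swish` and the fixed default β = 1.0 are illustrative choices, not that paper's implementation (which also considers a trainable β).

```python
import torch

def swish(x: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # Swish: f(x) = x * sigmoid(beta * x); with beta = 1 this coincides with SiLU.
    return x * torch.sigmoid(beta * x)
```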