Norm-preserving Orthogonal Permutation Linear Unit Activation Functions (OPLU)
We propose a novel activation function that implements piecewise orthogonal nonlinear mappings based on permutations. It is straightforward to implement, computationally efficient, and has low memory requirements. We tested it on two toy problems for feedforward and recurrent networks, where it shows performance similar to tanh and ReLU. The OPLU activation function preserves the norm of the backpropagated gradients, so it is potentially well suited to training deep, extra-deep, and recurrent neural networks.
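To make the norm-preservation claim concrete, here is a minimal sketch in plain Python. It assumes one particular permutation-based scheme that the abstract does not spell out: units are grouped into consecutive pairs and each pair is reordered so that the larger value comes first. The names `oplu`, `oplu_backward`, and `l2_norm` are hypothetical helpers introduced for illustration, not the authors' code.

```python
import math

def oplu(x):
    """Pairwise max/min activation (illustrative sketch).

    Assumption: units are grouped as (x[0], x[1]), (x[2], x[3]), ...
    and each pair is reordered so the larger value comes first.
    """
    if len(x) % 2 != 0:
        raise ValueError("oplu expects an even number of units")
    out = []
    for i in range(0, len(x), 2):
        a, b = x[i], x[i + 1]
        out.extend([max(a, b), min(a, b)])
    return out

def oplu_backward(x, grad_out):
    """Route gradients back through the permutation chosen in the forward pass."""
    grad_in = [0.0] * len(x)
    for i in range(0, len(x), 2):
        if x[i] >= x[i + 1]:
            # Pair was left in place: gradients pass straight through.
            grad_in[i], grad_in[i + 1] = grad_out[i], grad_out[i + 1]
        else:
            # Pair was swapped: gradients are swapped back.
            grad_in[i], grad_in[i + 1] = grad_out[i + 1], grad_out[i]
    return grad_in

def l2_norm(v):
    return math.sqrt(sum(t * t for t in v))

x = [0.5, -1.0, -2.0, 3.0]   # pre-activations
g = [0.1, 0.2, 0.3, 0.4]     # gradient arriving from the next layer
y = oplu(x)
gx = oplu_backward(x, g)
```

Under this assumption, both the forward and backward passes only reorder entries, so the Euclidean norms of `y` and `gx` match those of `x` and `g` up to floating-point rounding. That is the sense in which a permutation-based unit leaves gradient norms unchanged.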
Forward citations
Cited by 1 Pith paper
- Preserving Plasticity in Continual Learning via Dynamical Isometry
Dynamical isometry (Jacobian singular values near 1) preserves plasticity in continual learning; an isometry-promoting regularizer and decoupled AdamO optimizer match or beat prior methods on supervised and RL benchmarks.