Norm-preserving Orthogonal Permutation Linear Unit Activation Functions (OPLU)
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
We propose a novel activation function that implements piecewise orthogonal non-linear mappings based on permutations. It is straightforward to implement, computationally efficient, and has low memory requirements. We tested it on two toy problems for feedforward and recurrent networks, where it shows performance similar to tanh and ReLU. The OPLU activation function preserves the norm of the backpropagated gradients, so it is potentially well suited to training deep, extra deep, and recurrent neural networks.
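The norm-preservation claim in the abstract can be illustrated with a minimal sketch. It assumes the activation acts on adjacent pairs of units and emits (max, min) for each pair; the abstract itself only says "based on permutations", so the pairing scheme and the names `oplu`, `oplu_backward`, and `l2_norm` are illustrative assumptions, not the authors' code.

```python
import math

def oplu(x):
    """Assumed form: for each adjacent pair (a, b), emit (max, min).

    The output is a permutation of the input, so its L2 norm is unchanged.
    """
    if len(x) % 2 != 0:
        raise ValueError("this sketch expects an even number of units")
    out = []
    for i in range(0, len(x), 2):
        a, b = x[i], x[i + 1]
        out.extend((a, b) if a >= b else (b, a))
    return out

def oplu_backward(x, grad_out):
    """Route upstream gradients back through the same per-pair permutation."""
    grad_in = [0.0] * len(x)
    for i in range(0, len(x), 2):
        if x[i] >= x[i + 1]:
            # Pair was left in place: gradients pass straight through.
            grad_in[i], grad_in[i + 1] = grad_out[i], grad_out[i + 1]
        else:
            # Pair was swapped: gradients are swapped back.
            grad_in[i], grad_in[i + 1] = grad_out[i + 1], grad_out[i]
    return grad_in

def l2_norm(v):
    return math.sqrt(sum(t * t for t in v))

x = [0.5, -1.0, -2.0, 3.0]   # pre-activations (arbitrary example values)
g = [0.1, 0.2, 0.3, 0.4]     # upstream gradient (arbitrary example values)
y = oplu(x)
gx = oplu_backward(x, g)
```

Because both the forward and backward passes only reorder entries, `l2_norm(y)` matches `l2_norm(x)` and `l2_norm(gx)` matches `l2_norm(g)` up to floating-point rounding. Under the assumed pairwise form, that is the property the abstract refers to.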
fields
cs.LG: 1
years
2026: 1
verdicts
UNVERDICTED: 1
representative citing papers
Preserving Plasticity in Continual Learning via Dynamical Isometry
Dynamical isometry (Jacobian singular values near 1) preserves plasticity in continual learning; an isometry-promoting regularizer and decoupled AdamO optimizer match or beat prior methods on supervised and RL benchmarks.