
arXiv: 1906.06637 · v1 · pith:7QGH23A4new · submitted 2019-06-16 · cs.LG · math.OC · stat.ML

A Closer Look at Double Backpropagation

classification: cs.LG · math.OC · stat.ML
keywords: backpropagation, derivatives, description, double, functions, general, inputs, loss

In recent years, an increasing number of neural network models have included derivatives with respect to inputs in their loss functions, resulting in so-called double backpropagation for first-order optimization. However, so far no general description of the involved derivatives exists. Here, we cover a wide array of special cases in a very general Hilbert space framework, which allows us to provide optimized backpropagation rules for many real-world scenarios. This includes reducing the calculations for Frobenius-norm penalties on Jacobians by roughly a third for locally linear activation functions. Furthermore, we provide a description of the discontinuous loss surface of ReLU networks in both the inputs and the parameters, and demonstrate why the discontinuities do not pose a big problem in practice.
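To make the term "double backpropagation" concrete, here is a minimal sketch in plain Python. It is not the paper's Hilbert space formulation or its optimized rules. It assumes a hypothetical one-hidden-layer ReLU network f(x) = w2 · relu(W1 x + b1) with scalar output, so the Jacobian with respect to the input is a single gradient vector g, and the penalty is its squared norm. The function names and the network shape are illustrative choices. Because ReLU is locally linear, its second derivative vanishes away from the kinks, so the activation mask is treated as a constant and the penalty has zero gradient with respect to the bias.

```python
def input_gradient(W1, b1, w2, x):
    """First backward pass: gradient of f(x) = w2 . relu(W1 x + b1) w.r.t. x."""
    hidden, dim = len(b1), len(x)
    z = [sum(W1[i][j] * x[j] for j in range(dim)) + b1[i] for i in range(hidden)]
    mask = [1.0 if v > 0 else 0.0 for v in z]      # derivative of ReLU at z
    u = [w2[i] * mask[i] for i in range(hidden)]   # signal backpropagated to z
    g = [sum(W1[i][j] * u[i] for i in range(hidden)) for j in range(dim)]
    return g, u, mask


def penalty(W1, b1, w2, x):
    """Squared norm of the input gradient (the Jacobian penalty for scalar f)."""
    g, _, _ = input_gradient(W1, b1, w2, x)
    return sum(v * v for v in g)


def penalty_grads(W1, b1, w2, x):
    """Second pass: gradient of the penalty w.r.t. the parameters.

    Assumes no pre-activation is exactly zero, so the ReLU mask is locally
    constant and contributes no second-derivative term.
    """
    hidden, dim = len(b1), len(x)
    g, u, mask = input_gradient(W1, b1, w2, x)
    dW1 = [[2.0 * u[i] * g[j] for j in range(dim)] for i in range(hidden)]
    W1g = [sum(W1[i][j] * g[j] for j in range(dim)) for i in range(hidden)]
    dw2 = [2.0 * mask[i] * W1g[i] for i in range(hidden)]
    db1 = [0.0] * hidden                           # penalty does not depend on b1 locally
    return dW1, db1, dw2


# Illustrative values only (not taken from the paper).
W1 = [[1.0, -2.0], [0.5, 1.5], [-1.0, 0.3]]
b1 = [0.1, -0.2, 0.05]
w2 = [0.7, -1.2, 0.4]
x = [0.5, -0.3]
dW1, db1, dw2 = penalty_grads(W1, b1, w2, x)
```

In a training loop, these gradients would be added, with some weighting, to the gradients of the ordinary data loss. An automatic differentiation library would do the same thing by differentiating through the first backward pass.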



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Layer-wise Derivative Controlled Networks

    cs.LG 2026-05 unverdicted novelty 4.0

    ChainzRule with DREG regularization claims 15.5x fewer parameters than standard models, 23.1% lower peak gradient volatility on MNIST, and 70.17% accuracy on Yelp Full ordinal regression.