
arXiv: 1512.03965 · v4 · pith:GOP36NZVnew · submitted 2015-12-12 · cs.LG · cs.NE · stat.ML

The Power of Depth for Feedforward Neural Networks

classification: cs.LG, cs.NE, stat.ML
keywords: feedforward, networks, neural, depth, functions, layer, result, width

We show that there is a simple (approximately radial) function on $\mathbb{R}^d$, expressible by a small 3-layer feedforward neural network, which cannot be approximated by any 2-layer network to better than a certain constant accuracy unless its width is exponential in the dimension. The result holds for virtually all known activation functions, including rectified linear units, sigmoids and thresholds, and formally demonstrates that depth, even if increased by only 1, can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different.
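
To make the "expressible by a small 3-layer network" half of the statement concrete, here is a minimal sketch of why a radial function is easy for two hidden layers. It assumes that "3-layer" means two hidden layers plus a linear output. The first hidden layer approximates each $x_i^2$ with ReLU units, so a linear combination of its outputs gives $\|x\|^2$; the second hidden layer approximates a univariate function of that value. This is an illustration only, not the paper's construction: the target `g`, the box radius `R`, the knot counts, and all function names are hypothetical choices.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_interpolant(f, lo, hi, n_knots):
    """Weights of a one-hidden-layer ReLU net that linearly interpolates f on [lo, hi]."""
    knots = np.linspace(lo, hi, n_knots)
    vals = f(knots)
    slopes = np.diff(vals) / np.diff(knots)
    coeffs = np.diff(slopes, prepend=0.0)  # change of slope at each knot
    return knots[:-1], coeffs, vals[0]

def apply_interpolant(params, t):
    """Evaluate bias + sum_k coeffs[k] * relu(t - knots[k]) elementwise on t."""
    knots, coeffs, bias = params
    t = np.asarray(t, dtype=float)
    return bias + relu(t[..., None] - knots) @ coeffs

def radial_net(x, square_params, outer_params):
    """Two-hidden-layer ReLU net approximating g(||x||^2)."""
    x = np.asarray(x, dtype=float)
    # Hidden layer 1: per-coordinate ReLU units approximating x_i**2,
    # combined linearly into an estimate of the squared norm.
    sq_norm = apply_interpolant(square_params, x).sum(axis=-1)
    # Hidden layer 2: ReLU units approximating g at the estimated squared norm.
    return apply_interpolant(outer_params, sq_norm)

# Hypothetical setup: dimension, input box [-R, R]^d, and radial profile g.
d, R = 5, 2.0
g = lambda s: np.exp(-s)

square_params = relu_interpolant(np.square, -R, R, 200)
outer_params = relu_interpolant(g, 0.0, d * R**2, 400)

rng = np.random.default_rng(0)
x = rng.uniform(-R, R, size=(1000, d))
max_err = np.max(np.abs(radial_net(x, square_params, outer_params)
                        - g(np.sum(x**2, axis=-1))))
print(f"max abs error on samples: {max_err:.2e}")
```

In this sketch the number of hidden units is `d * 199 + 399`, which grows only linearly with the dimension. The sketch says nothing about the lower bound for 2-layer networks, which is the substantive part of the paper's result.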

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Measure-Theoretic Analysis of Reasoning: Structural Generalization and Approximation Limits

    cs.LG 2026-05 unverdicted novelty 5.0

    Applies optimal transport to bound OOD generalization error in Transformers via Lipschitz continuity and TC^0 circuit depth lower bounds for Dyck-k backtracking, supported by evaluations on 54 configurations.