arxiv: 1610.04161 · v2 · pith:HVDMKQWV · submitted 2016-10-13 · cs.LG · cs.NE

Why Deep Neural Networks for Function Approximation?

Shiyu Liang, R. Srikant

classification: cs.LG, cs.NE

keywords: networks, varepsilon, deep, functions, neural, neurons, approximation, function

Abstract

Recently there has been much interest in understanding why deep neural networks are preferred to shallow networks. We show that, for a large class of piecewise smooth functions, the number of neurons needed by a shallow network to approximate a function is exponentially larger than the corresponding number of neurons needed by a deep network for a given degree of function approximation. First, we consider univariate functions on a bounded interval and require a neural network to achieve an approximation error of $\varepsilon$ uniformly over the interval. We show that shallow networks (i.e., networks whose depth does not depend on $\varepsilon$) require $\Omega(\text{poly}(1/\varepsilon))$ neurons, while deep networks (i.e., networks whose depth grows with $1/\varepsilon$) require $\mathcal{O}(\text{polylog}(1/\varepsilon))$ neurons. We then extend these results to certain classes of important multivariate functions. Our results are derived for neural networks that use a combination of rectified linear units (ReLUs) and binary step units, two of the most popular types of activation functions. Our analysis builds on a simple observation: the multiplication of two bits can be represented by a ReLU.
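The closing observation of the abstract can be checked directly: for bits $a, b \in \{0, 1\}$, the product $ab$ equals $\max(0, a + b - 1)$, which is a single ReLU. The Python sketch below verifies that identity and then illustrates, under assumptions, how it could be combined with binary step units to approximate $x^2$ on $[0, 1]$. The function names (`bit_product`, `binary_expansion`, `approx_square`) and the greedy bit-extraction scheme are illustrative choices made here, not details taken from the abstract, and the paper's actual construction may differ.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def step(z):
    """Binary step unit: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def bit_product(a, b):
    """Product of two bits written as one ReLU: a*b = relu(a + b - 1)."""
    return relu(a + b - 1)


def binary_expansion(x, n):
    """First n binary digits of x in [0, 1), one step unit per digit.

    Hypothetical greedy scheme: each digit is a step unit applied to
    what is left of x after subtracting the earlier digits.
    """
    bits = []
    remainder = x
    for i in range(1, n + 1):
        bit = step(remainder - 2.0 ** -i)
        bits.append(bit)
        remainder -= bit * 2.0 ** -i
    return bits


def approx_square(x, n):
    """Approximate x**2 by expanding (sum_i b_i / 2**i)**2 into bit products."""
    bits = binary_expansion(x, n)
    return sum(
        bit_product(bi, bj) * 2.0 ** -(i + j)
        for i, bi in enumerate(bits, start=1)
        for j, bj in enumerate(bits, start=1)
    )


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, bit_product(a, b))
    print(binary_expansion(0.625, 3))
    print(approx_square(0.3, 10), 0.3 ** 2)
```

In this sketch the truncated expansion differs from $x$ by at most $2^{-n}$, so the squared value is off by at most $2 \cdot 2^{-n}$. Reaching error $\varepsilon$ therefore takes on the order of $\log(1/\varepsilon)$ extracted bits, which is consistent with the polylogarithmic neuron count the abstract states for deep networks.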

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

MIDUS: Memory-Infused Depth Up-Scaling
cs.LG 2025-12 unverdicted novelty 7.0

MIDUS replaces duplicated FFN branches in depth up-scaling with head-wise memory layers using product-key retrieval and HIVE to deliver lightweight, head-conditioned residual capacity.

Implicit Neural Field-Based Process Planning for Multi-Axis Manufacturing: Direct Control over Collision Avoidance and Toolpath Geometry
cs.RO 2025-11 unverdicted novelty 6.0

Implicit neural fields enable joint optimization of manufacturing layers and toolpaths with explicit collision avoidance in a single differentiable pipeline for multi-axis processes.

Universal Representation of Generalized Convex Functions and their Gradients
math.OC 2025-08 unverdicted novelty 6.0

A new differentiable layer with convex parameter space universally approximates generalized convex functions and their gradients, enabling single-level reformulations of bilevel problems in optimal transport and multi...