Deep neural network approximation theory for high-dimensional functions

Arnulf Jentzen; Benno Kuckuck; Patrick Cheridito; Pierfrancesco Beneventano; Robin Graeber

arXiv: 2112.14523 · v2 · submitted 2021-12-29 · math.NA · cs.NA

Pith reviewed 2026-05-24 12:34 UTC · model grok-4.3

keywords: deep neural networks · approximation theory · high-dimensional functions · curse of dimensionality · expressive power · function composition · locally Lipschitz functions · maximum and product functions

The pith

Deep neural networks can approximate high-dimensional functions composed of locally Lipschitz functions, maxima, and products with a number of parameters that grows at most polynomially in the input dimension d and in the reciprocal 1/ε of the approximation error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework to analyze the approximation power of deep neural networks for high-dimensional functions. It introduces approximation spaces consisting of sequences of functions that can be built from locally Lipschitz continuous functions, maxima, and products through finite compositions. The authors prove that these spaces are closed under the relevant operations and that DNNs can approximate functions in these spaces with a number of parameters that grows at most polynomially in the input dimension and the reciprocal of the approximation error. This establishes that DNNs have sufficient expressive power to overcome the curse of dimensionality for this class of functions on compact sets.

Core claim

DNNs have sufficient expressive power to approximate, without the curse of dimensionality, certain sequences of functions which can be constructed by means of a finite number of compositions using locally Lipschitz continuous functions, maxima, and products. The number of parameters necessary to represent the approximating DNNs grows at most polynomially in 1/ε and in the input dimension d.
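The claim can be written in symbols roughly as follows. This is a sketch only, and the notation is an assumption made for illustration rather than the paper's own: $f_d$ is the $d$-dimensional member of the function sequence, $K_d \subset \mathbb{R}^d$ a compact set, $\Phi_{d,\varepsilon}$ an approximating DNN, $\mathcal{R}(\Phi)$ its realization function, $\mathcal{P}(\Phi)$ its number of parameters, and $C, p, q$ constants that do not depend on $d$ or $\varepsilon$. The restriction to $\varepsilon \in (0,1]$ is likewise an assumption.

```latex
\forall\, d \in \mathbb{N},\ \varepsilon \in (0,1]\ \ \exists\, \Phi_{d,\varepsilon} :
\qquad
\sup_{x \in K_d} \bigl| f_d(x) - (\mathcal{R}(\Phi_{d,\varepsilon}))(x) \bigr| \le \varepsilon
\quad \text{and} \quad
\mathcal{P}(\Phi_{d,\varepsilon}) \le C \, d^{p} \, \varepsilon^{-q} .
```

A bound of the form $C\,\varepsilon^{-d}$, in which the dimension appears in the exponent, would be the kind of growth the phrase "curse of dimensionality" refers to.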

What carries the argument

Approximation spaces of function sequences that are closed under finite compositions with locally Lipschitz continuous functions, maxima, and products, allowing the combination of DNN approximation bounds for the individual operations.
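To illustrate what a DNN bound for one of the basic operations can look like, the sketch below represents the maximum of d numbers exactly with ReLU units arranged in a pairwise tree. It assumes the ReLU activation, which the abstract does not specify, and it is a standard textbook identity rather than the paper's own construction; the names `max2` and `max_tree` are hypothetical.

```python
def relu(t):
    """Rectified linear unit."""
    return max(t, 0.0)


def max2(a, b):
    # Exact identity: max(a, b) = relu(a - b) + b, where b itself is
    # written with ReLU units as relu(b) - relu(-b).
    return relu(a - b) + relu(b) - relu(-b)


def max_tree(values):
    """Return (maximum, number of max2 gates) using a pairwise tree.

    The tree has about log2(d) levels and always uses d - 1 gates,
    so the size of the network grows linearly in d.
    """
    layer = [float(v) for v in values]
    gates = 0
    while len(layer) > 1:
        nxt = []
        for i in range(0, len(layer) - 1, 2):
            nxt.append(max2(layer[i], layer[i + 1]))
            gates += 1
        if len(layer) % 2 == 1:
            # An unpaired entry is passed on to the next level unchanged.
            nxt.append(layer[-1])
        layer = nxt
    return layer[0], gates
```

Because each `max2` gate uses a fixed number of ReLU units, the total size here is linear in d, which is the kind of polynomial bound the closure argument then combines with bounds for the other operations.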

If this is right

The parameter count for DNN approximations of such functions scales polynomially rather than exponentially with dimension.
Approximations hold on compact sets for any prescribed error ε > 0.
The result combines existing bounds for basic operations to cover their compositions.
Functions in these spaces can be approximated efficiently by DNNs without the curse of dimensionality.
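The gap between polynomial and exponential scaling can be made concrete with a small arithmetic sketch. The constants C, p, and q below are hypothetical placeholders and are not values from the paper; the exponential count is the number of points in a uniform grid on the unit cube, used here only as a familiar stand-in for curse-of-dimensionality growth.

```python
def curse_count(d, inv_eps):
    # Number of grid points when each of the d axes is split into
    # inv_eps cells: the dimension d sits in the exponent.
    return inv_eps ** d


def polynomial_count(d, inv_eps, C=1, p=2, q=2):
    # Hypothetical polynomial bound C * d^p * (1/eps)^q.
    return C * d ** p * inv_eps ** q


for d in (2, 10, 100):
    # inv_eps = 10 corresponds to an error tolerance eps = 0.1.
    print(d, curse_count(d, 10), polynomial_count(d, 10))
```

With these placeholder constants the polynomial count is larger for d = 2, but at d = 100 it is 10^6 against 10^100 for the grid count.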

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This suggests a method to certify efficient approximability by checking if a given function can be expressed through such compositions.
It may be possible to enlarge the class by including other operations that admit similar polynomial bounds.
High-dimensional problems in applications could be addressed if their solutions fall into these approximation spaces.

Load-bearing premise

The functions belong to the closure of the basic operations under composition and the parameter bounds for the basic operations extend to the composed functions without introducing exponential factors in the number of compositions.
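A standard triangle-inequality estimate shows why this premise matters. It is an illustration under stated assumptions, not the paper's lemma: suppose $g$ is Lipschitz with constant $L$ on a set containing both $f(x)$ and $\Phi_f(x)$, the network $\Phi_f$ approximates $f$ within $\varepsilon_1$, and the network $\Phi_g$ approximates $g$ within $\varepsilon_2$ on that set. Here $\Phi_f$ and $\Phi_g$ are shorthand for the functions the networks compute.

```latex
\begin{aligned}
\bigl| g(f(x)) - \Phi_g(\Phi_f(x)) \bigr|
&\le \bigl| g(f(x)) - g(\Phi_f(x)) \bigr|
   + \bigl| g(\Phi_f(x)) - \Phi_g(\Phi_f(x)) \bigr| \\
&\le L \, \bigl| f(x) - \Phi_f(x) \bigr| + \varepsilon_2 \\
&\le L \, \varepsilon_1 + \varepsilon_2 .
\end{aligned}
```

Each further composition multiplies the accumulated error by another Lipschitz constant. For a fixed number of compositions the product of these constants stays polynomial in $d$ whenever each constant does, but it can grow exponentially in the number of compositions, which is why the construction depth has to be fixed.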

What would settle it

Constructing a sequence of functions using only the allowed operations for which the minimal number of DNN parameters needed to achieve error ε grows exponentially with the dimension d.

Figures

Figures reproduced from arXiv: 2112.14523 by Arnulf Jentzen, Benno Kuckuck, Patrick Cheridito, Pierfrancesco Beneventano, Robin Graeber.

Original abstract

The purpose of this article is to develop a machinery to study the capacity of deep neural networks (DNNs) to approximate high-dimensional functions. In particular, we show that DNNs have the expressive power to overcome the curse of dimensionality in the approximation of a large class of functions. More precisely, we prove that these functions can be approximated by DNNs on compact sets such that the number of parameters necessary to represent the approximating DNNs grows at most polynomially in the reciprocal $1/\varepsilon$ of the prescribed approximation error $\varepsilon>0$ and in the input dimension $d\in\mathbb N$. To this end, we introduce certain approximation spaces, consisting of sequences of functions that can be efficiently approximated by DNNs. We then establish closure properties which we combine with known and new bounds on the number of parameters necessary to approximate locally Lipschitz continuous functions, maximum functions, and product functions by DNNs. The main result of this article demonstrates that DNNs have sufficient expressive power to approximate, without the curse of dimensionality, certain sequences of functions which can be constructed by means of a finite number of compositions using locally Lipschitz continuous functions, maxima, and products.
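For the product function, a construction that is widely used in the ReLU approximation literature (often attributed to Yarotsky) approximates the square by composing a "hat" function with itself and then recovers the product by polarization. The sketch below follows that idea on the unit square. Whether the paper uses this exact construction is not stated in the abstract, and the function names are hypothetical.

```python
def relu(t):
    """Rectified linear unit."""
    return max(t, 0.0)


def hat(x):
    # Piecewise linear hat on [0, 1]: 2x for x < 1/2, 2 - 2x for x >= 1/2,
    # written with three ReLU units.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def approx_square(x, m):
    """Approximate x**2 on [0, 1] using m composed hat functions.

    The result is the piecewise linear interpolant of x**2 on a uniform
    grid with 2**m cells, so the error is at most 2**(-2*m - 2).
    """
    g = x
    total = x
    for s in range(1, m + 1):
        g = hat(g)               # s-fold composition of the hat function
        total -= g / 4.0 ** s
    return total


def approx_product(x, y, m):
    # Polarization: x*y = 2*((x + y)/2)**2 - x**2/2 - y**2/2 for x, y in [0, 1].
    # The three squaring errors add up to at most 3 * 2**(-2*m - 2).
    return (2.0 * approx_square((x + y) / 2.0, m)
            - 0.5 * approx_square(x, m)
            - 0.5 * approx_square(y, m))
```

The number of ReLU units grows linearly in m while the error decays like 4^(-m), so the size of this network is logarithmic in 1/ε on the unit square.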

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper defines approximation spaces closed under locally Lipschitz maps, max, and products, then tracks parameter counts to get polynomial DNN bounds in d and 1/ε for those constructed functions.

The central point is that the authors introduce approximation spaces for sequences of functions built by finite compositions of locally Lipschitz continuous functions, maxima, and products. They prove these spaces are closed under the operations and combine that with known DNN bounds on the basic pieces to conclude that the total parameter count stays polynomial in both dimension d and reciprocal error 1/ε, with the polynomial degree allowed to depend on the fixed construction depth. This is the main new contribution: the closure properties that preserve the polynomial scaling. Prior results on approximating the individual operations are cited and extended only as needed.

The stress-test note confirms that the tracking of constants through the closures does not introduce hidden exponential factors in d or 1/ε, which matches what the abstract claims. The argument is therefore internally consistent on its own terms. The result is limited to functions that sit inside these specific spaces, so it does not claim to cover arbitrary high-dimensional targets; that restriction is explicit and not overstated. The degree of the polynomial growing with depth is also stated, which keeps the claim accurate rather than sweeping.

This work is aimed at researchers in numerical analysis and neural network approximation theory who care about explicit high-dimensional rates. It is a precise, incremental advance that builds directly on existing bounds without circularity or post-hoc fitting. A serious editor should send it to peer review so the full proofs and any edge cases in the closure arguments can be checked in detail.

Referee Report

0 major / 3 minor

Summary. The paper introduces approximation spaces consisting of sequences of functions that can be constructed via a finite number of compositions of locally Lipschitz continuous functions, maxima, and products. It proves closure properties of these spaces under the listed operations and combines them with explicit bounds (new and previously published) on the number of DNN parameters required to approximate the basic operations, establishing that the resulting DNN approximants have parameter counts that scale at most polynomially in both the input dimension d and the reciprocal error 1/ε (with the polynomial degree permitted to depend on the fixed construction depth).

Significance. If the central claims hold, the work supplies a concrete, parameter-counting framework that identifies a broad class of high-dimensional functions approximable by DNNs without the curse of dimensionality. The explicit tracking of parameter counts through the closure operations, together with the combination of new and existing bounds on the elementary operations, constitutes a useful addition to the approximation theory of neural networks.

minor comments (3)

The abstract states the main theorem but does not indicate where the explicit parameter bounds for the basic operations (locally Lipschitz, max, product) are proved or referenced; a pointer to the relevant lemmas or prior works in the introduction would improve readability.
Notation for the approximation spaces (e.g., how the sequences are indexed and how the closure is formally defined) is introduced only in the body; a short definitional paragraph or diagram in §1 would help readers track the construction depth.
The dependence of the polynomial degree on the (fixed) construction depth is stated in the abstract but should be restated explicitly when the main theorem is formulated, to avoid any ambiguity about uniformity in depth.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of the manuscript, the clear summary of its contributions, and the recommendation for minor revision. No specific major comments appear in the provided report, so we have no individual points requiring rebuttal or clarification at this stage. We will proceed with preparing a revised version incorporating any minor editorial suggestions that may arise during the revision process.

Circularity Check

0 steps flagged

No significant circularity

The derivation introduces approximation spaces via explicit closure under locally Lipschitz functions, maxima, and products, then tracks parameter counts through these operations to obtain polynomial bounds in 1/ε and d. Bounds on the base operations are stated as either previously known or newly derived within the paper, with the closure lemmas providing an independent counting argument that does not reduce any final bound to a fitted quantity or to a self-referential definition. The central claim therefore rests on the explicit construction and parameter tracking rather than on any circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are visible. The central claim rests on the existence of the approximation spaces and the validity of the cited bounds for basic operations.

pith-pipeline@v0.9.0 · 5751 in / 1083 out tokens · 19919 ms · 2026-05-24T12:34:05.874356+00:00 · methodology


Bellman, R. Dynamic Programming . Princeton Landmarks in Mathematics. Princeton University Press, Princeton, NJ, 2010. Reprint of the 1 957 edition

work page 2010

[10] [10]

High-dimensional approximation spaces of artiﬁcial neural networ ks and appli- cations to partial diﬀerential equations

Beneventano, P., Cheridito, P., Jentzen, A., and von Wurstem berger, P. High-dimensional approximation spaces of artiﬁcial neural networ ks and appli- cations to partial diﬀerential equations. arXiv:2012.04326 (2020), 32 pages

work page arXiv 2012

[11] [11]

Berner, J., Grohs, P., and Jentzen, A. Analysis of the generalization error: empirical risk minimization over deep artiﬁcial neural networks over comes the curse of dimensionality in the numerical approximation of Black-Scholes par tial diﬀeren- tial equations. SIAM J. Math. Data Sci. 2 , 3 (2020), 631–657

work page 2020

[12] [12]

K., and Li, L

Blum, E. K., and Li, L. K. Approximation theory and feedforward networks. Neural Netw. 4 , 4 (1991), 511–515

work page 1991

[13] [13]

Optimal ap- proximation with sparsely connected deep neural networks

B¨olcskei, H., Grohs, P., Kutyniok, G., and Petersen, P. Optimal ap- proximation with sparsely connected deep neural networks. SIAM J. Math. Data Sci. 1 , 1 (2019), 8–45

work page 2019

[14] [14]

Hinging hyperplanes for regression, classiﬁcation, and function ap - proximation

Breiman, L. Hinging hyperplanes for regression, classiﬁcation, and function ap - proximation. IEEE Trans. Inform. Theory 39 , 3 (1993), 999–1013

work page 1993

[15] [15]

Error bounds for approximation with neural networks

Burger, M., and Neubauer, A. Error bounds for approximation with neural networks. J. Approx. Theory 112 , 2 (2001), 235–250

work page 2001

[16] [16]

Candes, E. J. Ridgelets: Theory and applications . ProQuest LLC, Ann Arbor, MI, 1998. Ph.D. Thesis, Stanford University

work page 1998

[17] [17]

Neural network approx- imation and estimation of classiﬁers with classiﬁcation boundary in a Ba rron class

Caragea, A., Petersen, P., and Voigtlaender, F. Neural network approx- imation and estimation of classiﬁers with classiﬁcation boundary in a Ba rron class. arXiv:2011.09363 (2020), 39 pages. 74

work page arXiv 2011

[18] [18]

Construction of neural nets using the radon transform

Carroll, and Dickinson . Construction of neural nets using the radon transform. In International 1989 Joint Conference on Neural Networks (1989), vol. 1, pp. 607– 611

work page 1989

[19] [19]

Approximation capability to functions of several vari- ables, nonlinear functionals, and operators by radial basis functio n neural networks

Chen, T., and Chen, H. Approximation capability to functions of several vari- ables, nonlinear functionals, and operators by radial basis functio n neural networks. IEEE Trans. on Neural Networks 6 , 4 (1995), 904–910

work page 1995

[20] [20]

Eﬃcient approximation of high-dimensional functions with neural networks

Cheridito, P., Jentzen, A., and Rossmannek, F. Eﬃcient approximation of high-dimensional functions with neural networks. IEEE Trans. Neural Netw. Learn. Syst. (2021), 15 pages. Early access version available online

work page 2021

[21] [21]

K., Li, X., and Mhaskar, H

Chui, C. K., Li, X., and Mhaskar, H. N. Neural networks for localized approximation. Math. Comp. 63 , 208 (1994), 607–623

work page 1994

[22] [22]

K., Lin, S.-B., and Zhou, D.-X

Chui, C. K., Lin, S.-B., and Zhou, D.-X. Deep neural networks for rotation- invariance approximation and learning. Anal. Appl. (Singap.) 17 , 5 (2019), 737–772

work page 2019

[23] [23]

On the expressive power of deep learning: A tensor analysis

Cohen, N., Sharir, O., and Shashua, A. On the expressive power of deep learning: A tensor analysis. In 29th Annual Conference on Learning Theory (23–26 Jun 2016), V. Feldman, A. Rakhlin, and O. Shamir, Eds., vol. 49 of Proceedings of Machine Learning Research, PMLR, pp. 698–728

work page 2016

[24] [24]

Approximation by superpositions of a sigmoidal function

Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Systems 2 , 4 (1989), 303–314

work page 1989

[25] [25]

Depth separation for neural networks

Daniely, A. Depth separation for neural networks. In Proceedings of the 2017 Conference on Learning Theory (07–10 Jul 2017), S. Kale and O. Shamir, Eds., vol. 65 of Proceedings of Machine Learning Research , PMLR, pp. 690–696

work page 2017

[26] [26]

A., Oskolkov, K

DeVore, R. A., Oskolkov, K. I., and Petrushev, P. P. Approximation by feed-forward neural networks. Ann. Numer. Math. 4 , 1–4 (1997), 261–287. The heritage of P. L. Chebyshev: a Festschrift in honor of the 70th bir thday of T. J. Rivlin

work page 1997

[27] [27]

J., Gurvits, L., Darken, C., and Sontag, E

Donahue, M. J., Gurvits, L., Darken, C., and Sontag, E. Rates of convex approximation in non-Hilbert spaces. Constr. Approx. 13 , 2 (1997), 187–220

work page 1997

[28] [28]

Deep learning-based numerical methods for high-dimensional parabolic partial diﬀerential equations and backw ard stochastic diﬀerential equations

E, W., Han, J., and Jentzen, A. Deep learning-based numerical methods for high-dimensional parabolic partial diﬀerential equations and backw ard stochastic diﬀerential equations. Commun. Math. Stat. 5 , 4 (2017), 349–380

work page 2017

[29] [29]

Algorithms for solving high dimensional PDEs: from nonlinear monte carlo to machine learning

E, W., Han, J., and Jentzen, A. Algorithms for solving high dimensional PDEs: from nonlinear monte carlo to machine learning. Nonlinearity 35 , 1 (2021), 278–310. 75

work page 2021

[30] [30]

Exponential convergence of the deep neural network approximation for analytic functions

E, W., and W ang, Q. Exponential convergence of the deep neural network approximation for analytic functions. Science China Mathematics 61 , 10 (2018), 1733–1740

work page 2018

[31] [31]

DNN expression rate analysis of high-dimensional PDEs: Application to option pricing

Elbr¨achter, D., Grohs, P., Jentzen, A., and Schwab, C. DNN expression rate analysis of high-dimensional PDEs: Application to option pricing. Constr. Approx. (2021), 69 pages. Early access version available online

work page 2021

[32] [32]

Deep neural network approximation theory

Elbr¨achter, D., Perekrestenko, D., Grohs, P., and B ¨olcskei, H. Deep neural network approximation theory. IEEE Trans. Inform. Theory 67 , 5 (2021), 2581–2623

work page 2021

[33] [33]

The power of depth for feedforward neural net- works

Eldan, R., and Shamir, O. The power of depth for feedforward neural net- works. In 29th Annual Conference on Learning Theory (23–26 Jun 2016), V. Feld- man, A. Rakhlin, and O. Shamir, Eds., vol. 49 of Proceedings of Machine Learning Research, PMLR, pp. 907–940

work page 2016

[34] [34]

Ellacott, S. W. Aspects of the numerical analysis of neural networks. Acta Numer. 3 (1994), 145–202

work page 1994

[35] [35]

On the approximate realization of continuous mappings by neural networks

Funahashi, K.-I. On the approximate realization of continuous mappings by neural networks. Neural Netw. 2 , 3 (1989), 183–192

work page 1989

[36] [36]

There exists a neural network that does not make avoid- able mistakes

Gallant, and White . There exists a neural network that does not make avoid- able mistakes. In IEEE 1988 International Conference on Neural Networks (1988), vol. 1, pp. 657–664

work page 1988

[37] [37]

Uniform error estimates for artiﬁcial neural network approximations for heat equations

Gonon, L., Grohs, P., Jentzen, A., Kofler, D., and ˇSiˇska, D. Uniform error estimates for artiﬁcial neural network approximations for heat equations. IMA J. Numer. Anal. (2021), drab027. Early access version available online

work page 2021

[38] [38]

Deep ReLU network expression rates for option prices in high-dimensional, exponential L´ evy models

Gonon, L., and Schwab, C. Deep ReLU network expression rates for option prices in high-dimensional, exponential L´ evy models. Tech. Rep. 20 20-52, Seminar for Applied Mathematics, ETH Z¨ urich, Switzerland, 2020

work page 2020

[39] [39]

Ap- proximation spaces of deep neural networks

Gribonval, R., Kutyniok, G., Nielsen, M., and Voigtlaender, F. Ap- proximation spaces of deep neural networks. arXiv:1905.01208 (2019), 63 pages

work page arXiv 1905

[40] [40]

Deep neural network approximation for high- dimensional elliptic PDEs with boundary conditions

Grohs, P., and Herrmann, L. Deep neural network approximation for high- dimensional elliptic PDEs with boundary conditions. arXiv:2007.05384 (2020), 22 pages

work page arXiv 2007

[41] [41]

A proof that artiﬁcial neural networks overcome the curse of dim ensionality 76 in the numerical approximation of Black–Scholes partial diﬀerential equations

Grohs, P., Hornung, F., Jentzen, A., and Von Wurstemberger, P. A proof that artiﬁcial neural networks overcome the curse of dim ensionality 76 in the numerical approximation of Black–Scholes partial diﬀerential equations. arXiv:1809.02362 (2018), 124 pages. To appear in Mem. Amer. Math. Soc

work page arXiv 2018

[42] [42]

Space-time error estimates for deep neural network approximations for diﬀe rential equations

Grohs, P., Hornung, F., Jentzen, A., and Zimmermann, P. Space-time error estimates for deep neural network approximations for diﬀe rential equations. arXiv:1908.03833 (2019), 86 pages. Revision requested from Adv. Comput. Math

work page arXiv 1908

[43] [43]

Lower bounds for artiﬁcial neural network approximations: A proof tha t shallow neural networks fail to overcome the curse of dimensionality

Grohs, P., Ibragimov, S., Jentzen, A., and Koppensteiner, S. Lower bounds for artiﬁcial neural network approximations: A proof tha t shallow neural networks fail to overcome the curse of dimensionality. arXiv:2103.04488 (2021), 53 pages. Revision requested from J. Complexity

work page arXiv 2021

[44] [44]

Deep neural network approxima- tions for Monte Carlo algorithms

Grohs, P., Jentzen, A., and Salimova, D. Deep neural network approxima- tions for Monte Carlo algorithms. arXiv:1908.10828 (2019), 45 pages. To appear in Partial Diﬀer. Equ. Appl

work page arXiv 1908

[45] [45]

Proof of the theory-to-practice gap in deep learning via sampling complexity bounds for neural network approxim ation spaces

Grohs, P., and Voigtlaender, F. Proof of the theory-to-practice gap in deep learning via sampling complexity bounds for neural network approxim ation spaces. arXiv:2104.02746 (2021), 42 pages

work page arXiv 2021

[46] [46]

Error bounds for approximations with deep ReLU neural networks in $W^{s,p}$ norms

G¨uhring, I., Kutyniok, G., and Petersen, P. Error bounds for approxima- tions with deep ReLU neural networks in Ws,p norms. arXiv:1902.07896 (2019), 42 pages

work page internal anchor Pith review Pith/arXiv arXiv 1902

[47] [47]

Theory of Deep Learning

G¨uhring, I., Raslan, M., and Kutyniok, G. Expressivity of Deep Neural Networks. arXiv:2007.04759 (2020), 37 pages. To appear as a chapter in the book “Theory of Deep Learning” by Cambridge University Press

work page arXiv 2007

[48] [48]

J., and Ismailov, V

Guliyev, N. J., and Ismailov, V. E. Approximation capability of two hidden layer feedforward neural networks with ﬁxed weights. Neurocomputing 316 (2018), 262–269

work page 2018

[49] [49]

J., and Ismailov, V

Guliyev, N. J., and Ismailov, V. E. On the approximation by single hidden layer feedforward neural networks with ﬁxed weights. Neural Netw. 98 (2018), 296–304

work page 2018

[50] [50]

Solving high-dimensional partial diﬀerential equations using deep learning

Han, J., Jentzen, A., and E, W. Solving high-dimensional partial diﬀerential equations using deep learning. Proc. Natl. Acad. Sci. USA 115 , 34 (2018), 8505– 8510

work page 2018

[51] [51]

Universal function approximation by deep neural nets with bounded width and relu activations

Hanin, B. Universal function approximation by deep neural nets with bounde d width and ReLU activations. arXiv:1708.02691 (2017), 9 pages

work page arXiv 2017

[52] [52]

The randomized information complexity of elliptic PDE

Heinrich, S. The randomized information complexity of elliptic PDE. J. Com- plexity 22 , 2 (2006), 220–249. 77

work page 2006

[53] [53]

Monte Carlo complexity of parametric inte- gration

Heinrich, S., and Sindambiwe, E. Monte Carlo complexity of parametric inte- gration. J. Complexity 15 , 3 (1999), 317–341

work page 1999

[54] [54]

Approximation capabilities of multilayer feedforward networks

Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Netw. 4 , 2 (1991), 251–257

work page 1991

[55] [55]

Some new results on neural network approximation

Hornik, K. Some new results on neural network approximation. Neural Netw. 6 , 8 (1993), 1069–1072

work page 1993

[56] [56]

Multilayer feedforward net- works are universal approximators

Hornik, K., Stinchcombe, M., and White, H. Multilayer feedforward net- works are universal approximators. Neural Netw. 2 , 5 (1989), 359–366

work page 1989

[57] [57]

Universal approximation of an unknown mapping and its derivatives using multilayer feedforward ne tworks

Hornik, K., Stinchcombe, M., and White, H. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward ne tworks. Neural Netw. 3 , 5 (1990), 551–560

work page 1990

[58] [58]

Space-time deep neu- ral network approximations for high-dimensional partial diﬀerent ial equations

Hornung, F., Jentzen, A., and Salimova, D. Space-time deep neu- ral network approximations for high-dimensional partial diﬀerent ial equations. arXiv:2006.02199 (2020), 52 pages

work page arXiv 2006

[59] [59]

Hutzenthaler, M., Jentzen, A., Kruse, T., and Nguyen, T. A. A proof that rectiﬁed deep neural networks overcome the curse of dimen sionality in the numerical approximation of semilinear heat equations. Partial Diﬀer. Equ. Appl. 1 , 2 (2020), Paper No. 10, 34 pages

work page 2020

[60] [60]

Capabilities of three-layered perceptrons

Irie, and Miyake . Capabilities of three-layered perceptrons. In IEEE 1988 In- ternational Conference on Neural Networks (1988), vol. 1, pp. 641–648

work page 1988

[61] [61]

Strong overall error analysis for the training of artiﬁcial neural networks via random initializations

Jentzen, A., and Riekert, A. Strong overall error analysis for the training of artiﬁcial neural networks via random initializations. arXiv:2012.08443 (2020), 40 pages. Revision requested from Commun. Math. Stat

work page arXiv 2012

[62] [62]

Jentzen, A., Salimova, D., and Welti, T. A proof that deep artiﬁcial neural networks overcome the curse of dimensionality in the numerical app roximation of Kolmogorov partial diﬀerential equations with constant diﬀusion an d nonlinear drift coeﬃcients. Commun. Math. Sci. 19 , 5 (2021), 1167–1205

work page 2021

[63] [63]

Jones, L. K. A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neura l network training. Ann. Statist. 20 , 1 (1992), 608–613

work page 1992

[64] [64]

C., K ˚urkov´a, V., and Sanguineti, M

Kainen, P. C., K ˚urkov´a, V., and Sanguineti, M. Complexity of Gaussian- radial-basis networks approximating smooth functions. J. Complexity 25 , 1 (2009), 63–74. 78

work page 2009

[65] [65]

C., K ˚urkov´a, V., and Sanguineti, M

Kainen, P. C., K ˚urkov´a, V., and Sanguineti, M. Dependence of computa- tional models on input dimension: tractability of approximation and op timization tasks. IEEE Trans. Inform. Theory 58 , 2 (2012), 1203–1214

work page 2012

[66] [66]

Universal Approximation with Deep Narrow Net- works

Kidger, P., and Lyons, T. Universal Approximation with Deep Narrow Net- works. In Proceedings of Thirty Third Conference on Learning Theory (09–12 Jul 2020), J. Abernethy and S. Agarwal, Eds., vol. 125 of Proceedings of Machine Learn- ing Research, PMLR, pp. 2306–2327

work page 2020

[67] [67]

M., and Barron, A

Klusowski, J. M., and Barron, A. R. Approximation by combinations of ReLU and squared ReLU ridge functions with ℓ1 and ℓ0 controls. IEEE Trans. Inform. Theory 64 , 12 (2018), 7649–7656

work page 2018

[68] [68]

Comparison of worst case errors in linear and neural network approximation

K˚urkov´a, V., and Sanguineti, M. Comparison of worst case errors in linear and neural network approximation. IEEE Trans. Inform. Theory 48 , 1 (2002), 264–275

work page 2002

[69] [69]

Geometric upper bounds on rates of variable- basis approximation

K˚urkov´a, V., and Sanguineti, M. Geometric upper bounds on rates of variable- basis approximation. IEEE Trans. Inform. Theory 54 , 12 (2008), 5681–5688

work page 2008

[70] [70]

C., and Kreinovich, V

K˚urkov´a, V., Kainen, P. C., and Kreinovich, V. Estimates of the Number of Hidden Units and Variation with Respect to Half-Spaces. Neural Netw. 10 , 6 (1997), 1061–1068

work page 1997

[71] [71]

On the geometric convergence of neural approximations

Lavretsky, E. On the geometric convergence of neural approximations. IEEE Trans. on Neural Networks 13 , 2 (2002), 274–282

work page 2002

[72] [72]

On the ability of neural nets to express distributions

Lee, H., Ge, R., Ma, T., Risteski, A., and Arora, S. On the ability of neural nets to express distributions. In Proceedings of the 2017 Conference on Learning Theory (07–10 Jul 2017), S. Kale and O. Shamir, Eds., vol. 65 of Proceedings of Machine Learning Research, PMLR, pp. 1271–1296

work page 2017

[73] [73]

Y., Pinkus, A., and Schocken, S

Leshno, M., Lin, V. Y., Pinkus, A., and Schocken, S. Multilayer feed- forward networks with a nonpolynomial activation function can app roximate any function. Neural Netw. 6 , 6 (1993), 861–867

work page 1993

[74] [74]

Better approximations of high dimensional smooth functions by deep neural networks with rectiﬁed power units

Li, B., Tang, S., and Yu, H. Better approximations of high dimensional smooth functions by deep neural networks with rectiﬁed power units. Commun. Comput. Phys. 27 , 2 (2020), 379–411

work page 2020

[75] [75]

A note on the expressive power of deep rectiﬁed linear unit networks in high-dimensional spaces

Liang, C., and Wu, C. A note on the expressive power of deep rectiﬁed linear unit networks in high-dimensional spaces. Math. Methods Appl. Sci. 42 , 9 (2019), 3400–3404. 79

work page 2019

[76] [76]

Deep network approximation for smooth functions

Lu, J., Shen, Z., Yang, H., and Zhang, S. Deep network approximation for smooth functions. arXiv2001.03040 (2020), 46 pages

work page arXiv 2020

[77] [77]

Lower bounds for approximation by MLP neural networks

Maiorov, V., and Pinkus, A. Lower bounds for approximation by MLP neural networks. Neurocomputing 25, 1 (1999), 81–91

work page 1999

[78] [78]

E., and Meir, R

Maiorov, V. E., and Meir, R. On the near optimality of the stochastic approx- imation of smooth functions by neural networks. Adv. Comput. Math. 13 , 1 (2000), 79–103

work page 2000

[79] [79]

Random approximants and neural networks

Makovoz, Y. Random approximants and neural networks. J. Approx. Theory 85 , 1 (1996), 98–109

work page 1996

[80] [80]

Uniform approximation by neural networks

Makovoz, Y. Uniform approximation by neural networks. J. Approx. Theory 95 , 2 (1998), 215–228

work page 1998