
arxiv: 2112.14523 · v2 · submitted 2021-12-29 · 🧮 math.NA · cs.NA

Deep neural network approximation theory for high-dimensional functions

Pith reviewed 2026-05-24 12:34 UTC · model grok-4.3

classification 🧮 math.NA cs.NA
keywords deep neural networks · approximation theory · high-dimensional functions · curse of dimensionality · expressive power · function composition · locally Lipschitz functions · maximum and product functions

The pith

Deep neural networks can approximate high-dimensional functions composed of locally Lipschitz functions, maxima, and products using a number of parameters that grows at most polynomially in the input dimension and in the reciprocal of the approximation error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework to analyze the approximation power of deep neural networks for high-dimensional functions. It introduces approximation spaces consisting of sequences of functions that can be built from locally Lipschitz continuous functions, maxima, and products through finite compositions. The authors prove that these spaces are closed under the relevant operations and that DNNs can approximate functions in these spaces with a number of parameters that grows at most polynomially in the input dimension and the reciprocal of the approximation error. This establishes that DNNs have sufficient expressive power to overcome the curse of dimensionality for this class of functions on compact sets.
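To make one of the basic operations concrete, here is a minimal sketch of how a ReLU network can approximate a product on [0, 1]^2. It uses a well-known sawtooth construction for x^2 combined with a polarization identity; this is an assumption chosen for illustration and is not claimed to be the construction or the bound used in the paper. The names `hat`, `square_approx`, and `product_approx` are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    # Tent map on [0, 1], written with three ReLU units.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)

def square_approx(x, m):
    # Piecewise-linear interpolant of x**2 on [0, 1] with 2**m pieces,
    # obtained by subtracting m scaled compositions of the tent map from x.
    g = np.asarray(x, dtype=float)
    out = g.copy()
    for s in range(1, m + 1):
        g = hat(g)
        out = out - g / 4.0**s
    return out

def product_approx(x, y, m):
    # Polarization identity, valid for x, y in [0, 1]:
    #   x*y = 2*((x+y)/2)**2 - x**2/2 - y**2/2
    return (2.0 * square_approx((x + y) / 2.0, m)
            - 0.5 * square_approx(x, m)
            - 0.5 * square_approx(y, m))

# Illustrative check on a grid: the error stays below 3 * 4**-(m+1).
xs = np.linspace(0.0, 1.0, 101)
X, Y = np.meshgrid(xs, xs)
m = 6
max_err = np.max(np.abs(product_approx(X, Y, m) - X * Y))
print(max_err, 3 * 4.0**-(m + 1))
```

In this sketch the number of ReLU units grows linearly in `m` while the error shrinks like 4^(-m), so the size needed for accuracy ε grows only logarithmically in 1/ε.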

Core claim

DNNs have sufficient expressive power to approximate, without the curse of dimensionality, certain sequences of functions which can be constructed by means of a finite number of compositions using locally Lipschitz continuous functions, maxima, and products. The number of parameters necessary to represent the approximating DNNs grows at most polynomially in 1/ε and in the input dimension d.
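One way to write this claim as a formula is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $f_d$ is the $d$-dimensional member of the function sequence, $K_d$ a compact set, $\Phi_{d,\varepsilon}$ an approximating network with realization $\mathcal{R}(\Phi_{d,\varepsilon})$ and parameter count $\mathcal{P}(\Phi_{d,\varepsilon})$, and $C, p, q$ are constants independent of $d$ and $\varepsilon$. The use of the supremum norm is likewise an assumption.

```latex
\[
  \sup_{x \in K_d} \bigl| f_d(x) - \mathcal{R}(\Phi_{d,\varepsilon})(x) \bigr| \le \varepsilon
  \quad\text{and}\quad
  \mathcal{P}(\Phi_{d,\varepsilon}) \le C \, d^{\,p} \, \varepsilon^{-q}
  \qquad \text{for all } d \in \mathbb{N},\ \varepsilon \in (0,1].
\]
```

The point of the second inequality is that neither exponent depends on $d$, in contrast to bounds of the form $\varepsilon^{-d}$.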

What carries the argument

Approximation spaces of function sequences that are closed under finite compositions with locally Lipschitz continuous functions, maxima, and products, allowing the combination of DNN approximation bounds for the individual operations.
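The elementary estimate behind combining bounds under composition can be sketched as follows. This is a generic triangle-inequality argument under stated assumptions, not a reproduction of the paper's closure lemmas: $g$ is approximated by $G$ with error $\varepsilon_1$ on a set $K$, and $f$ is $L$-Lipschitz on a set containing both $g(K)$ and $G(K)$ and is approximated there by $F$ with error $\varepsilon_2$.

```latex
\begin{align*}
  \bigl| f(g(x)) - F(G(x)) \bigr|
  &\le \bigl| f(g(x)) - f(G(x)) \bigr| + \bigl| f(G(x)) - F(G(x)) \bigr| \\
  &\le L \, \bigl| g(x) - G(x) \bigr| + \varepsilon_2 \\
  &\le L \, \varepsilon_1 + \varepsilon_2
  \qquad \text{for all } x \in K.
\end{align*}
```

For a locally Lipschitz $f$, the constant $L$ depends on the region that contains $g(K)$ and $G(K)$, so keeping the overall bound polynomial requires controlling how that constant grows.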

If this is right

  • The parameter count for DNN approximations of such functions scales polynomially rather than exponentially with dimension.
  • Approximations hold on compact sets for any prescribed error ε > 0.
  • The result combines existing bounds for basic operations to cover their compositions.
  • Functions in these spaces can be approximated efficiently by DNNs without the curse of dimensionality.
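As an illustration of why maxima fit into a polynomial parameter budget, the sketch below uses the identity max(a, b) = b + ReLU(a - b), which holds exactly over the reals, and reduces a d-dimensional maximum by a pairwise tree. The helper names are hypothetical, and counting pairwise "gadgets" is a simplification of a full parameter count, not the paper's accounting.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def max2_relu(a, b):
    # Identity over the reals: max(a, b) = b + ReLU(a - b).
    return b + relu(a - b)

def max_tree(x):
    """Maximum of the entries of x via a pairwise tree.

    Returns (value, gadgets), where gadgets counts the uses of max2_relu.
    """
    vals = [float(v) for v in np.asarray(x, dtype=float).ravel()]
    gadgets = 0
    while len(vals) > 1:
        nxt = []
        for i in range(0, len(vals) - 1, 2):
            nxt.append(max2_relu(vals[i], vals[i + 1]))
            gadgets += 1
        if len(vals) % 2 == 1:
            nxt.append(vals[-1])  # odd element passes through unchanged
        vals = nxt
    return vals[0], gadgets

rng = np.random.default_rng(0)
x = rng.normal(size=37)
value, gadgets = max_tree(x)
print(value, gadgets)
```

Each pairwise step removes one value, so a vector of length d needs d - 1 such steps arranged in about log2(d) rounds; the count grows linearly, not exponentially, in d.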

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This suggests a method to certify efficient approximability by checking if a given function can be expressed through such compositions.
  • It may be possible to enlarge the class by including other operations that admit similar polynomial bounds.
  • High-dimensional problems in applications could be addressed if their solutions fall into these approximation spaces.

Load-bearing premise

The functions belong to the closure of the basic operations under composition, and the parameter bounds for the basic operations extend to the composed functions without introducing factors that are exponential in the number of compositions.

What would settle it

Constructing a sequence of functions using only the allowed operations for which the minimal number of DNN parameters needed to achieve error ε grows exponentially with the dimension d.
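To show what exponential versus polynomial growth means numerically, the sketch below compares two counts. The first is the number of cells a naive piecewise-constant scheme uses on [0, 1]^d to reach error ε = 1/`inv_eps` for a function that is 1-Lipschitz in the maximum norm (cubes of side 2ε). The second is a hypothetical polynomial budget C·d^p·ε^(-q); the constants C, p, q are placeholders and are not values from the paper. The naive count is an upper bound for one particular scheme, not a lower bound for DNNs.

```python
def grid_cells(d, inv_eps):
    # Cubes of side 2*eps tile [0, 1]^d; eps = 1 / inv_eps.
    per_axis = (inv_eps + 1) // 2  # ceil(inv_eps / 2) for positive integers
    return per_axis ** d

def poly_budget(d, inv_eps, C=1, p=2, q=2):
    # Hypothetical polynomial bound C * d**p * eps**(-q).
    return C * d**p * inv_eps**q

for d in (2, 10, 50):
    print(d, grid_cells(d, 10), poly_budget(d, 10))
```

A counterexample of the kind described above would have to show that every approximating DNN, not just one naive scheme, needs a parameter count on the exponential side of this comparison.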

Figures

Figures reproduced from arXiv: 2112.14523 by Arnulf Jentzen, Benno Kuckuck, Patrick Cheridito, Pierfrancesco Beneventano, Robin Graeber.

Figure 1: Graphical illustration of an example neural network which h…

Original abstract

The purpose of this article is to develop a machinery to study the capacity of deep neural networks (DNNs) to approximate high-dimensional functions. In particular, we show that DNNs have the expressive power to overcome the curse of dimensionality in the approximation of a large class of functions. More precisely, we prove that these functions can be approximated by DNNs on compact sets such that the number of parameters necessary to represent the approximating DNNs grows at most polynomially in the reciprocal $1/\varepsilon$ of the prescribed approximation error $\varepsilon>0$ and in the input dimension $d\in\mathbb N$. To this end, we introduce certain approximation spaces, consisting of sequences of functions that can be efficiently approximated by DNNs. We then establish closure properties which we combine with known and new bounds on the number of parameters necessary to approximate locally Lipschitz continuous functions, maximum functions, and product functions by DNNs. The main result of this article demonstrates that DNNs have sufficient expressive power to approximate, without the curse of dimensionality, certain sequences of functions which can be constructed by means of a finite number of compositions using locally Lipschitz continuous functions, maxima, and products.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper introduces approximation spaces consisting of sequences of functions that can be constructed via a finite number of compositions of locally Lipschitz continuous functions, maxima, and products. It proves closure properties of these spaces under the listed operations and combines them with explicit bounds (new and previously published) on the number of DNN parameters required to approximate the basic operations, establishing that the resulting DNN approximants have parameter counts that scale at most polynomially in both the input dimension d and the reciprocal error 1/ε (with the polynomial degree permitted to depend on the fixed construction depth).

Significance. If the central claims hold, the work supplies a concrete, parameter-counting framework that identifies a broad class of high-dimensional functions approximable by DNNs without the curse of dimensionality. The explicit tracking of parameter counts through the closure operations, together with the combination of new and existing bounds on the elementary operations, constitutes a useful addition to the approximation theory of neural networks.

minor comments (3)
  1. The abstract states the main theorem but does not indicate where the explicit parameter bounds for the basic operations (locally Lipschitz, max, product) are proved or referenced; a pointer to the relevant lemmas or prior works in the introduction would improve readability.
  2. Notation for the approximation spaces (e.g., how the sequences are indexed and how the closure is formally defined) is introduced only in the body; a short definitional paragraph or diagram in §1 would help readers track the construction depth.
  3. The dependence of the polynomial degree on the (fixed) construction depth is at most implicit in the abstract's reference to a finite number of compositions; it should be stated explicitly when the main theorem is formulated, to avoid any ambiguity about uniformity in depth.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of the manuscript, the clear summary of its contributions, and the recommendation for minor revision. No specific major comments appear in the provided report, so we have no individual points requiring rebuttal or clarification at this stage. We will proceed with preparing a revised version incorporating any minor editorial suggestions that may arise during the revision process.

Circularity Check

0 steps flagged

No significant circularity


The derivation introduces approximation spaces via explicit closure under locally Lipschitz functions, maxima, and products, then tracks parameter counts through these operations to obtain polynomial bounds in 1/ε and d. Bounds on the base operations are stated as either previously known or newly derived within the paper, with the closure lemmas providing an independent counting argument that does not reduce any final bound to a fitted quantity or to a self-referential definition. The central claim therefore rests on the explicit construction and parameter tracking rather than on any circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are visible. The central claim rests on the existence of the approximation spaces and the validity of the cited bounds for basic operations.


