Adaptive Randomized Neural Networks with Locally Activation Function: Theory and Algorithm for Solving PDEs
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-10 17:56 UTC · model grok-4.3
The pith
Randomized neural networks achieve optimal approximation when the hidden-parameter sampling domain is sized to match the target function's smoothness and the number of neurons.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that for networks of the form ∑_i W_i σ(A_i, b_i) with uniform sampling of (A_i, b_i) from a prescribed bounded domain, optimal approximation rates require the domain size to scale with the smoothness of the target function and the network width. They then combine these networks with a partition of unity whose subdomains are refined adaptively by a posteriori error indicators, producing the adaptive PIRaNN scheme that solves PDEs whose solutions have limited local regularity without introducing additional consistency errors.
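To make the ansatz concrete, here is a minimal Python sketch that reads σ(A_i, b_i) as the usual affine-then-activation form σ(A_i x + b_i); the tanh activation, the one-dimensional target, the least-squares solve, and the names (fit_rann, R) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fit_rann(x, y, n_neurons=200, R=5.0, seed=0):
    """Fit u(x) = sum_i W_i * tanh(A_i * x + b_i) with frozen hidden
    parameters: (A_i, b_i) drawn uniformly from [-R, R] and never
    trained; only the outer weights W are solved by least squares."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(-R, R, size=n_neurons)
    b = rng.uniform(-R, R, size=n_neurons)
    Phi = np.tanh(np.outer(x, A) + b)  # feature matrix, shape (len(x), n_neurons)
    W, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return A, b, W

# Smooth 1-D target; R is the sampling-domain size that the theorem
# ties to the target's smoothness and to n_neurons.
x = np.linspace(-1, 1, 400)
y = np.sin(4 * np.pi * x)
A, b, W = fit_rann(x, y)
print("max error:", np.abs(np.tanh(np.outer(x, A) + b) @ W - y).max())
```

Only W is trained; the frozen (A_i, b_i) and the single knob R are exactly what the theorem's domain-size condition constrains.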
What carries the argument
The approximation theorem that relates the required size of the uniform sampling domain for hidden parameters in randomized neural networks to the smoothness of the target function and the number of neurons, together with a posteriori error-driven partition-of-unity refinement.
If this is right
- The adaptive PIRaNN method captures localized low-regularity features in PDE solutions by refining the partition of unity according to a posteriori indicators (see the sketch after this list).
- The method maintains consistency because the refinement strategy does not add new approximation errors beyond those already controlled by the randomized network.
- Numerical benchmarks confirm both the theoretical dependence of domain size on smoothness and the practical performance of the adaptive scheme on standard test problems.
- The approach extends the use of randomized networks from globally smooth to locally irregular PDE solutions while keeping the number of neurons moderate.
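The PoU blending behind these points can be sketched as follows, again under illustrative assumptions (bump-function weights, five fixed overlapping 1-D patches, per-patch least-squares fits); the paper's actual patch functions, indicator-driven refinement, and physics-informed loss are not reproduced here.

```python
import numpy as np

def pou_weights(x, centers, width):
    """Compactly supported bumps normalized to sum to one; one standard
    PoU construction (the paper's patch functions may differ)."""
    d = (x[:, None] - centers[None, :]) / width
    inside = np.abs(d) < 1
    bumps = np.where(inside, np.exp(-1.0 / np.maximum(1 - d**2, 1e-12)), 0.0)
    return bumps / bumps.sum(axis=1, keepdims=True)

def fit_local_rann(xs, ys, R, n_neurons=50, seed=0):
    """Per-patch RaNN: frozen uniform (A_i, b_i) in [-R, R], LS for W."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(-R, R, n_neurons)
    b = rng.uniform(-R, R, n_neurons)
    W, *_ = np.linalg.lstsq(np.tanh(np.outer(xs, A) + b), ys, rcond=None)
    return lambda t: np.tanh(np.outer(t, A) + b) @ W

x = np.linspace(0, 1, 500)
y = np.abs(x - 0.5) ** 1.5                  # kink: limited local regularity
centers = np.linspace(0, 1, 5)              # overlapping patches
phi = pou_weights(x, centers, width=0.35)

# Global approximant u = sum_j phi_j * u_j, each u_j fitted on its patch.
u = np.zeros_like(x)
for j in range(phi.shape[1]):
    mask = phi[:, j] > 0
    u_j = fit_local_rann(x[mask], y[mask], R=8.0, seed=j)
    u[mask] += phi[mask, j] * u_j(x[mask])
print("max error:", np.abs(u - y).max())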
Where Pith is reading between the lines
- The same domain-size tuning principle could be tested on other randomized approximation schemes outside neural networks to see whether it yields similar rate improvements.
- Applying the adaptive partition-of-unity idea to time-dependent or high-dimensional PDEs would test whether the error-driven refinement remains computationally efficient as dimension grows.
- If the load-bearing assumption holds, one could replace the uniform sampling step with other simple distributions and still obtain the same link between domain size and smoothness.
Load-bearing premise
Uniform sampling of hidden-layer parameters from one fixed bounded domain plus error-driven partition-of-unity refinement is enough to resolve localized low-regularity features without creating new consistency errors.
What would settle it
A numerical test that measures whether the observed approximation or PDE-solution error stops improving at the predicted optimal rate once the sampling domain size is deliberately mismatched to the smoothness and neuron count, or once the adaptive refinement is removed.
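A hedged sketch of that test in the pure approximation setting: fix the target and the network width, sweep the sampling-domain size R, and check whether the error bottoms out near a matched R and degrades when R is deliberately too small or too large. The setup below (tanh features, least squares, the name rann_sup_error) is assumed for illustration.

```python
import numpy as np

def rann_sup_error(R, x, y, n_neurons=200, trials=5):
    """Sup-norm fit error of a RaNN with hidden parameters drawn
    uniformly from [-R, R], averaged over independent random draws."""
    errs = []
    for s in range(trials):
        rng = np.random.default_rng(s)
        A = rng.uniform(-R, R, n_neurons)
        b = rng.uniform(-R, R, n_neurons)
        Phi = np.tanh(np.outer(x, A) + b)
        W, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        errs.append(np.abs(Phi @ W - y).max())
    return float(np.mean(errs))

x = np.linspace(-1, 1, 400)
y = np.sin(8 * np.pi * x)          # fixed target smoothness, fixed width
for R in (0.5, 2.0, 8.0, 32.0, 128.0):
    print(f"R = {R:6.1f}   mean sup error = {rann_sup_error(R, x, y):.3e}")
```

If the theorem's prediction holds, the printed errors should trace a U-shape in R rather than improving monotonically.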
read the original abstract
This paper establishes an approximation theorem for randomized neural networks (RaNNs) whose hidden-layer parameters are uniformly sampled from a prescribed bounded domain. Our analysis shows that, for RaNNs of the form $\mathop{\sum}_i W_i \sigma(A_i, b_i)$, the size of the sampling domain required to achieve optimal approximation is intrinsically linked to the smoothness of the target function and the number of neurons. Motivated by this theoretical insight, we integrate a partition of unity (PoU) with RaNNs to develop an adaptive physics-informed randomized neural network (PIRaNN) method for solving partial differential equations with limited local regularity. The proposed adaptive strategy refines the PoU based on a posteriori error indicators, enabling the network to efficiently capture localized solution features. Numerical experiments validate the theoretical results and demonstrate the strong approximation capabilities of RaNNs, confirming the effectiveness of the adaptive PIRaNN method on a range of benchmark problems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript establishes an approximation theorem for randomized neural networks (RaNNs) with hidden-layer parameters uniformly sampled from a bounded domain, showing that the required sampling-domain size is linked to the target function's smoothness and the number of neurons. It then proposes an adaptive physics-informed RaNN (PIRaNN) method that integrates a partition of unity (PoU), refines patches via a posteriori error indicators, and solves PDEs with localized low regularity; numerical experiments on benchmark problems are used to support the claims.
Significance. If the approximation theorem holds and the adaptive PoU construction is shown to preserve optimal rates, the work would supply a theoretically motivated adaptive framework for neural solvers of PDEs with singularities or reduced regularity, with the domain-size/smoothness link offering practical guidance for parameter choice. The numerical validation on benchmarks is a positive indicator, but the absence of quantitative rate comparisons limits the assessed impact.
major comments (2)
- [§3] §3 (approximation theorem): the result is stated for a single global RaNN with fixed bounded sampling domain; the subsequent adaptive PIRaNN construction assigns independent RaNNs to a posteriori-refined patches with possibly different local sampling domains, yet no error-propagation argument is supplied showing that the sum of local approximation errors remains controlled by the same smoothness-dependent constants.
- [§4.2] §4.2 (adaptive PIRaNN algorithm): the claim that the method captures localized low-regularity features 'without introducing new consistency errors' rests on the assumption that PoU weighting and per-patch adaptive sampling commute with the randomization argument; a concrete global error bound or proof sketch verifying that each local RaNN satisfies the theorem hypotheses at every refinement step is required.
minor comments (2)
- [Abstract] Abstract: the statement that 'numerical experiments validate the theoretical results' is vague; a single sentence summarizing observed convergence rates or error magnitudes relative to theory would strengthen the claim.
- [Notation] Notation: the RaNN form ∑_i W_i σ(A_i, b_i) is introduced without an immediate reminder of the precise definitions of the random matrices A_i and vectors b_i; adding a short parenthetical or reference to the earlier definition would improve readability (one presumed reading is sketched below).
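For readers without the earlier definition at hand, one presumed reading of the notation, stated here as our assumption rather than the paper's text:

```latex
% Presumed reading (our assumption): \sigma(A_i, b_i) abbreviates the
% standard affine-then-activation form, with only the W_i trainable.
u_N(x) = \sum_{i=1}^{N} W_i \, \sigma(A_i x + b_i),
\qquad (A_i, b_i) \sim \mathrm{Unif}\!\left([-R, R]^{d} \times [-R, R]\right).
```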
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments. We address the major comments point by point below, agreeing that the connection between the global approximation theorem and the adaptive construction requires explicit justification. We will strengthen the manuscript accordingly.
read point-by-point responses
- Referee: [§3] §3 (approximation theorem): the result is stated for a single global RaNN with fixed bounded sampling domain; the subsequent adaptive PIRaNN construction assigns independent RaNNs to a posteriori-refined patches with possibly different local sampling domains, yet no error-propagation argument is supplied showing that the sum of local approximation errors remains controlled by the same smoothness-dependent constants.
Authors: The referee is correct that Theorem 3.1 is stated for a single global RaNN. The adaptive PIRaNN employs a partition of unity to localize the approximation, with each patch using its own RaNN whose sampling domain is sized according to the local regularity. Because the PoU functions are smooth, non-negative, and sum to one, the global L2 error is bounded by a sum of the local errors (with a multiplicative constant depending only on the PoU). We will insert a new proposition after Theorem 3.1 that makes this propagation explicit, showing that the smoothness-dependent constants from the theorem carry over to each local approximant when the sampling domain is chosen adaptively per patch. This addition will confirm that the global error remains controlled without inflation. revision: yes
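A sketch of the propagation step such a proposition would make explicit, in our notation (ω_j the support of φ_j, u_j the local RaNN on patch j); this is the standard PoU argument, not the authors' proof:

```latex
% Uses \sum_j \varphi_j \equiv 1 for the first equality; C_{PoU} depends
% only on the patch overlap and the sup-norms of the \varphi_j.
\Bigl\| u - \sum_j \varphi_j u_j \Bigr\|_{L^2(\Omega)}
  = \Bigl\| \sum_j \varphi_j (u - u_j) \Bigr\|_{L^2(\Omega)}
  \le \sum_j \bigl\| \varphi_j (u - u_j) \bigr\|_{L^2(\omega_j)}
  \le C_{\mathrm{PoU}} \sum_j \| u - u_j \|_{L^2(\omega_j)}.
```

Applying the approximation theorem patchwise, with each sampling domain sized to the local regularity, would then bound every local term on the right.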
- Referee: [§4.2] §4.2 (adaptive PIRaNN algorithm): the claim that the method captures localized low-regularity features 'without introducing new consistency errors' rests on the assumption that PoU weighting and per-patch adaptive sampling commute with the randomization argument; a concrete global error bound or proof sketch verifying that each local RaNN satisfies the theorem hypotheses at every refinement step is required.
Authors: We acknowledge that the current text does not supply a self-contained verification that the local randomization hypotheses remain satisfied after each refinement. The PoU weights are independent of the random parameters and the adaptive choice of sampling domain is made from a posteriori indicators that estimate local smoothness; thus the local problems continue to meet the hypotheses of the theorem. We will add a short proof sketch in §4.2 that (i) confirms each local RaNN at every step satisfies the uniform-sampling assumption with a domain sized to the local regularity, and (ii) assembles the local bounds into a global a-priori error estimate that contains no extra consistency terms arising from the PoU or the adaptation process. This will rigorously justify the claim. revision: yes
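For concreteness, a generic solve, estimate, mark, refine loop of the kind the response describes, with Dörfler marking [10] as one standard marking rule; every function argument below is a placeholder, and the paper's actual indicators and refinement rule may differ.

```python
import numpy as np

def adaptive_pirann(solve_local, estimate_error, refine_patch, patches,
                    tol=1e-6, theta=0.5, max_iter=20):
    """Generic solve -> estimate -> mark -> refine loop.  All four
    function arguments are placeholders in this sketch; Dorfler marking
    with fraction theta [10] is one standard rule, not necessarily the
    paper's."""
    local_fits = [solve_local(p) for p in patches]        # per-patch RaNN fit
    for _ in range(max_iter):
        eta = np.array([estimate_error(p, f)              # a posteriori
                        for p, f in zip(patches, local_fits)])
        if np.sqrt((eta ** 2).sum()) < tol:
            break
        # Mark the smallest set of patches carrying a theta-fraction of
        # the total squared indicator.
        order = np.argsort(eta)[::-1]
        cum = np.cumsum(eta[order] ** 2)
        marked = order[: np.searchsorted(cum, theta * cum[-1]) + 1]
        for j in sorted(marked, reverse=True):            # splice safely
            patches[j:j + 1] = refine_patch(patches[j])   # split patch
        local_fits = [solve_local(p) for p in patches]    # re-solve
    return patches, local_fits
```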
Circularity Check
No circularity detected; approximation theorem and adaptive construction remain independent of self-referential inputs.
full rationale
The paper first states an approximation theorem for RaNNs that links the required sampling-domain diameter to the target function's smoothness and the number of neurons; this is presented as a derived result from analysis of the form ∑ W_i σ(A_i, b_i) with uniform sampling from a bounded domain. The subsequent adaptive PIRaNN construction with a posteriori PoU refinement is motivated by that theorem but does not redefine any quantity in terms of itself, fit a parameter and relabel it a prediction, or rely on a load-bearing self-citation whose content is unverified. No equation reduces the claimed global error bound to a fitted constant or to the adaptive choice itself by construction. The derivation chain is therefore self-contained and does not exhibit any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Uniform sampling of hidden-layer parameters from a bounded domain yields approximation rates governed by the smoothness of the target and the number of neurons.
- domain assumption A posteriori error indicators computed from the current network solution can be used to refine the partition of unity without destroying consistency.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear): "the size of the sampling domain required to achieve optimal approximation is intrinsically linked to the smoothness of the target function and the number of neurons... M = O(N^{p/2[(p-1)(d+1)+pη]})"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction (unclear): "integrate a partition of unity (PoU) with RaNNs... adaptive strategy refines the PoU based on a posteriori error indicators"
Reference graph
Works this paper leans on
- [1] R. A. Adams and J. J. Fournier, Sobolev Spaces, vol. 140, Elsevier, 2003.
- [2] A. R. Barron, Universal approximation bounds for superpositions of a sigmoidal function, IEEE Transactions on Information Theory, 39 (1993), pp. 930–945.
- [3]
- [4] J. M. Cascon, C. Kreuzer, R. H. Nochetto, and K. G. Siebert, Quasi-optimal convergence rate for an adaptive finite element method, SIAM Journal on Numerical Analysis, 46 (2008), pp. 2524–2550.
- [5] J. Chen, X. Chi, W. E, and Z. Yang, Bridging traditional and machine learning-based algorithms for solving PDEs: the random feature method, J Mach Learn, 1 (2022), pp. 268–298.
- [6] S. M. Cox and P. C. Matthews, Exponential time differencing for stiff systems, Journal of Computational Physics, 176 (2002), pp. 430–455.
- [7] T. De Ryck, S. Lanthaler, and S. Mishra, On the approximation of functions by tanh neural networks, Neural Networks, 143 (2021), pp. 732–750.
- [8] T. De Ryck, S. Mishra, Y. Shang, and F. Wang, Approximation theory and applications of randomized neural networks for solving high-dimensional PDEs, arXiv preprint arXiv:2501.12145, (2025).
- [9] S. Dong and Z. Li, Local extreme learning machines and domain decomposition for solving linear and nonlinear partial differential equations, Computer Methods in Applied Mechanics and Engineering, 387 (2021), p. 114129.
- [10] W. Dörfler, A convergent adaptive algorithm for Poisson's equation, SIAM Journal on Numerical Analysis, 33 (1996), pp. 1106–1124.
- [11] T. A. Driscoll, N. Hale, and L. N. Trefethen, Chebfun Guide, 2014.
- [12] V. Dwivedi and B. Srinivasan, Physics informed extreme learning machine (PIELM)–a rapid method for the numerical solution of partial differential equations, Neurocomputing, 391 (2020), pp. 96–118.
- [13] S. Ellacott, Aspects of the numerical analysis of neural networks, Acta Numerica, 3 (1994), pp. 145–202.
- [14] L. C. Evans, Partial Differential Equations, vol. 19, American Mathematical Society, 2022.
- [15] G. B. Folland, Real Analysis: Modern Techniques and Their Applications, John Wiley & Sons, 1999.
- [16] L. Gonon, Random feature neural networks learn Black-Scholes type PDEs without curse of dimensionality, Journal of Machine Learning Research, 24 (2023), pp. 1–51.
- [17] I. Gühring and M. Raslan, Approximation rates for neural networks with encodable weights in smoothness spaces, Neural Networks, 134 (2021), pp. 107–130.
- [18] W.-F. Hu, T.-S. Lin, and M.-C. Lai, A discontinuity capturing shallow neural network for elliptic interface problems, Journal of Computational Physics, 469 (2022), p. 111576.
- [19] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, Extreme learning machine: theory and applications, Neurocomputing, 70 (2006), pp. 489–501.
- [20] A. D. Jagtap, K. Kawaguchi, and G. E. Karniadakis, Adaptive activation functions accelerate convergence in deep and physics-informed neural networks, Journal of Computational Physics, 404 (2020), p. 109136.
- [21] O. A. Karakashian and F. Pascal, Convergence of adaptive discontinuous Galerkin approximations of second-order elliptic problems, SIAM Journal on Numerical Analysis, 45 (2007), pp. 641–665.
- [22]
- [23]
- [24] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar, Fourier neural operator for parametric partial differential equations, arXiv preprint arXiv:2010.08895, (2020).
- [25]
- [26] J. Lu, Z. Shen, H. Yang, and S. Zhang, Deep network approximation for smooth functions, SIAM Journal on Mathematical Analysis, 53 (2021), pp. 5465–5506.
- [27] L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis, Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators, Nature Machine Intelligence, 3 (2021), pp. 218–229.
- [28] A. Neufeld and P. Schmocker, Universal approximation property of random neural networks, arXiv preprint arXiv:2312.08410, (2023).
- [29] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics, 378 (2019), pp. 686–707.
- [30] P. Rathore, W. Lei, Z. Frangella, L. Lu, and M. Udell, Challenges in training PINNs: A loss landscape perspective, arXiv preprint arXiv:2402.01868, (2024).
- [31] J. W. Siegel and J. Xu, Approximation rates for neural networks with general activation functions, Neural Networks, 128 (2020), pp. 313–321.
- [32] J. W. Siegel and J. Xu, High-order approximation rates for shallow neural networks with cosine and ReLU^k activation functions, Applied and Computational Harmonic Analysis, 58 (2022), pp. 1–26.
- [33] J. W. Siegel and J. Xu, Sharp bounds on the approximation rates, metric entropy, and n-widths of shallow neural networks, Foundations of Computational Mathematics, 24 (2024), pp. 481–537.
- [34] S. Wang, A. K. Bhartari, B. Li, and P. Perdikaris, Gradient alignment in physics-informed neural networks: A second-order optimization perspective, arXiv preprint arXiv:2502.00604, (2025).
- [35] S. Wang, Y. Teng, and P. Perdikaris, Understanding and mitigating gradient flow pathologies in physics-informed neural networks, SIAM Journal on Scientific Computing, 43 (2021), pp. A3055–A3081.
- [36] S. Wang, X. Yu, and P. Perdikaris, When and why PINNs fail to train: A neural tangent kernel perspective, Journal of Computational Physics, 449 (2022), p. 110768.
- [37]
- [38] W. E, C. Ma, and L. Wu, The Barron space and the flow-induced function spaces for neural network models, Constructive Approximation, 55 (2022), pp. 369–406.
- [39] J. Xu, The finite neuron method and convergence analysis, arXiv preprint arXiv:2010.01458, (2020).
- [40]
discussion (0)