pith. machine review for the scientific record.

arxiv: 2605.00308 · v1 · submitted 2026-05-01 · 🧮 math.NA · cs.NA

Recognition: unknown

Adaptive anisotropic composite quadratures for residual minimisation in neural PDE approximations

Authors on Pith · no claims yet

Pith reviewed 2026-05-09 19:32 UTC · model grok-4.3

classification 🧮 math.NA · cs.NA
keywords residual minimization · neural networks · PDE approximation · adaptive quadrature · error estimates · numerical integration · Strang lemma · composite quadrature

The pith

An adaptive quadrature method controls integration errors to improve neural network solutions of PDEs obtained by residual minimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper first builds an abstract error framework that isolates the approximation, quadrature and optimization contributions to the total error in residual-minimizing neural PDE solvers. From this framework it derives a nonlinear Strang-type estimate that shows how quadrature inaccuracies propagate into the final neural approximation. Motivated by the estimate, the authors introduce an anisotropic adaptive composite quadrature that refines the integration partition by bisection until the relative error against a richer reference rule falls below a threshold, together with a refresh-based training loop that rebuilds the quadrature only when an online indicator exceeds a tolerance. Experiments on several benchmark problems demonstrate that the approach brings training losses closer to reference losses, uses fewer quadrature points than fixed rules, and yields higher final accuracy. A reader should care because uncontrolled quadrature error is a hidden bottleneck that otherwise limits the reliability of neural PDE methods.

Core claim

By separating approximation, quadrature and optimization errors in an abstract framework and deriving a nonlinear Strang-type estimate, the authors justify an anisotropic adaptive composite quadrature that controls relative quadrature error through richer reference rules and bisection refinement; this is paired with a refresh-based training procedure that rebuilds the quadrature only when an online error indicator exceeds a threshold. Numerical tests on benchmark PDEs show that the resulting training loss stays closer to the reference loss, quadrature points are used more efficiently, and final approximation accuracy exceeds that of non-adaptive strategies.
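In the spirit of the classical Strang lemma, the decomposition behind this claim can be written schematically. The display below is an illustrative shape only, not the paper's theorem: the norms, constants and the precise nonlinear dependence are fixed in its §3.

```latex
% Schematic decomposition, not the paper's exact statement. Here u is the PDE
% solution, L the exact residual loss, L_h its quadrature discretisation, and
% theta_h the parameters returned by training on L_h.
\| u - u_{\theta_h} \| \;\lesssim\;
  \underbrace{\inf_{\theta}\,\| u - u_{\theta} \|}_{\text{approximation}}
  \;+\;
  \underbrace{\sup_{\theta}\,\bigl|\mathcal{L}(\theta)-\mathcal{L}_h(\theta)\bigr|^{1/2}}_{\text{quadrature}}
  \;+\;
  \underbrace{\bigl(\mathcal{L}_h(\theta_h)-\inf_{\theta}\mathcal{L}_h(\theta)\bigr)^{1/2}}_{\text{optimisation}}
```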

What carries the argument

The anisotropic adaptive composite quadrature strategy, which uses bisection-based refinement against richer reference quadratures to keep relative quadrature error of the residual loss below a threshold, combined with a refresh-based training loop triggered by an online error indicator.
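To make the machinery concrete, here is a minimal 2D sketch of bisection-based composite quadrature checked against a richer reference rule. It is an illustration under stated assumptions, not the authors' implementation: the paper integrates residual losses with its own anisotropy indicator, whereas this sketch integrates a plain function and picks the bisection axis with a crude directional heuristic.

```python
# Sketch of an anisotropic adaptive composite quadrature in 2D.
# Hypothetical and simplified: function names, the Gauss-Legendre rules and
# the axis-selection heuristic are illustrative assumptions, not the paper's.
import numpy as np

def gauss_2d(f, cell, n):
    """Tensor-product Gauss-Legendre rule with n points per axis on a cell."""
    (x0, x1), (y0, y1) = cell
    t, w = np.polynomial.legendre.leggauss(n)          # nodes/weights on [-1, 1]
    xs = 0.5 * (x1 - x0) * t + 0.5 * (x1 + x0)
    ys = 0.5 * (y1 - y0) * t + 0.5 * (y1 + y0)
    W = np.outer(w, w) * 0.25 * (x1 - x0) * (y1 - y0)  # scaled weight matrix
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    return float(np.sum(W * f(X, Y)))

def bisect(cell, axis):
    (x0, x1), (y0, y1) = cell
    if axis == 0:
        xm = 0.5 * (x0 + x1)
        return [((x0, xm), (y0, y1)), ((xm, x1), (y0, y1))]
    ym = 0.5 * (y0 + y1)
    return [((x0, x1), (y0, ym)), ((x0, x1), (ym, y1))]

def adaptive_quadrature(f, domain, tol=1e-8, n=4, n_ref=8, max_cells=4096):
    """Refine cells by bisection until each primal integral agrees with a
    richer reference rule to within a relative tolerance."""
    active, done = [domain], []
    while active and len(active) + len(done) < max_cells:
        cell = active.pop()
        primal = gauss_2d(f, cell, n)
        ref = gauss_2d(f, cell, n_ref)
        if abs(ref - primal) <= tol * max(abs(ref), 1e-300):
            done.append(cell)
            continue
        # Anisotropy heuristic: bisect the axis along which splitting changes
        # the reference value most (a stand-in for a directional indicator).
        gains = []
        for axis in (0, 1):
            split_ref = sum(gauss_2d(f, h, n_ref) for h in bisect(cell, axis))
            gains.append(abs(split_ref - ref))
        active.extend(bisect(cell, int(np.argmax(gains))))
    done.extend(active)  # cells left over when the budget is exhausted
    return sum(gauss_2d(f, c, n) for c in done), done

# Example: a sharp layer in x only, which rewards anisotropic refinement.
f = lambda x, y: np.tanh(50.0 * (x - 0.5)) + 0.0 * y
value, cells = adaptive_quadrature(f, ((0.0, 1.0), (0.0, 1.0)))
print(f"integral ≈ {value:.10f} using {len(cells)} cells")
```

The refinement loop mirrors the described control: a cell is accepted only once its primal rule agrees with the richer reference rule to within a relative tolerance, so quadrature points accumulate exactly where the integrand is hard.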

If this is right

  • The gap between the training loss and a reference loss computed with very fine quadrature narrows.
  • Quadrature points are allocated more efficiently than with fixed non-adaptive rules.
  • Final neural approximations achieve higher accuracy than those obtained with non-adaptive quadrature strategies.
  • Computational cost is balanced by rebuilding the quadrature only when the online indicator signals that error has grown too large (a sketch of this refresh loop follows the list).
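A minimal sketch of that refresh logic, in the same illustrative Python as above. All names are hypothetical (`build_quadrature` stands for the adaptive construction sketched earlier, `step` for one optimiser update), and the paper's actual online indicator, optimiser and schedule differ.

```python
# Sketch of a refresh-based training loop: the quadrature is rebuilt only
# when an online indicator exceeds a tolerance. All callables are assumed
# to be supplied by the user; this is an illustration, not the paper's code.
def train_with_refresh(params, residual_sq, build_quadrature, step,
                       epochs=10_000, refresh_tol=1e-2, check_every=100):
    points, weights = build_quadrature(params)          # initial adaptive rule
    for epoch in range(epochs):
        params = step(params, points, weights)          # one optimiser update
        if epoch % check_every != 0:
            continue
        # Online indicator: relative disagreement between the current rule and
        # a richer reference rule on the present residual (one simple choice).
        ref_pts, ref_wts = build_quadrature(params, richer=True)
        loss_primal = sum(w * residual_sq(params, p) for p, w in zip(points, weights))
        loss_ref = sum(w * residual_sq(params, p) for p, w in zip(ref_pts, ref_wts))
        indicator = abs(loss_ref - loss_primal) / max(abs(loss_ref), 1e-30)
        if indicator > refresh_tol:                     # quadrature is stale:
            points, weights = build_quadrature(params)  # rebuild it
    return params
```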

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same refresh-based logic could be applied to other loss functions that rely on numerical integration, such as energy minimization or variational formulations.
  • The abstract error framework offers a template for analyzing Monte Carlo or low-discrepancy sampling errors in neural PDE solvers.
  • Because the refinement is anisotropic, the method may naturally extend to problems with strong directional features such as boundary layers or anisotropic coefficients.

Load-bearing premise

The abstract error framework and derived nonlinear Strang-type estimate correctly quantify how quadrature inaccuracies affect the final neural approximation, and the online error indicator reliably detects when quadrature error becomes significant.

What would settle it

A benchmark PDE test in which the adaptive method leaves a large persistent gap between training loss and a reference loss computed on a much finer quadrature, or in which the online indicator fails to trigger refinement in subdomains where the residual is known to be large.

Figures

Figures reproduced from arXiv: 2605.00308 by Kishore Nori, Santiago Badia.

Figure 1
Figure 1: Adaptive quadrature solution, final adaptive partition and absolute point-wise errors for the function approximation problem (18) using the AQ algorithm. (a) Training and reference loss histories together with the cell-wise cumulative integration error of the primal quadrature. view at source ↗
Figure 2
Figure 2: Adaptive training diagnostics for the function approximation in (18). view at source ↗
Figure 3
Figure 3: Loss, error and verification histories for the function approximation in (18). view at source ↗
Figure 4
Figure 4: Comparison of adaptive, uniform and MC quadrature for the 1D advection-diffusion problem (19) with ϵ = 0.005. view at source ↗
Figure 5
Figure 5: Loss and error histories for the 1D advection-diffusion problem (19) with ϵ = 0.001. view at source ↗
Figure 6
Figure 6: Adaptive quadrature solution, final adaptive partition and absolute point-wise errors for the viscous Burgers’ problem using the AQ algorithm. view at source ↗
Figure 7
Figure 7: Loss and error histories for the viscous Burgers’ problem. (a) NN approximation with final AQ partition (b) Absolute point-wise errors. view at source ↗
Figure 8
Figure 8: Adaptive quadrature solution, final adaptive partition and absolute point-wise errors for the (1+1)D Korteweg-De Vries (KdV) problem (22) using the AQ algorithm. view at source ↗
Figure 9
Figure 9: Loss and error histories for the (1+1)D Korteweg-De Vries (KdV) problem. view at source ↗
Figure 10
Figure 10: Adaptive quadrature solution, final adaptive partition and absolute point-wise errors for the Cahn-Hilliard problem (23) using the AQ algorithm. view at source ↗
Figure 11
Figure 11: Loss and error histories for the Cahn-Hilliard problem (23). view at source ↗
Figure 12
Figure 12: Adaptive quadrature solution, final adaptive partition and absolute point-wise errors for the 2D arc wavefront Poisson problem using the AQ algorithm. view at source ↗
Figure 13
Figure 13: Loss and error histories for the 2D arc wavefront diffusion problem. (a) NN approximation with final AQ partition (b) Absolute point-wise errors. view at source ↗
Figure 14
Figure 14: Adaptive quadrature solution, final adaptive partition and absolute point-wise errors for the L-shaped Poisson problem using the AQ algorithm. view at source ↗
Figure 15
Figure 15: Loss and error histories for the L-shaped Poisson problem. view at source ↗
Figure 16
Figure 16: Adaptive-quadrature solution and final mesh for the convection–diffusion–reaction problem (29) on the non-trivial domain using the AQ algorithm. view at source ↗
Figure 17
Figure 17: Loss and error histories for the convection–diffusion–reaction problem (29) on the non-trivial domain. view at source ↗
Figure 18
Figure 18: Adaptive-quadrature training diagnostics for the 2D Navier-Stokes lid-driven cavity benchmark problem. (a) NN approximation with final AQ partition (b) Streamlines of the NN approximation (c) Transverse horizontal velocity comparison. view at source ↗
Figure 19
Figure 19: Fluid-velocity results for the 2D Navier-Stokes lid-driven cavity and transverse velocity comparison against the benchmark data of [19]. view at source ↗
Figure 20
Figure 20: Training and reference loss history comparison for the 2D Stokes (Moffatt) problem across all considered quadrature strategies. view at source ↗
Figure 21
Figure 21: Streamlines for 2D Stokes (Moffatt) when trained with adaptive quadrature and QMC (LHC). view at source ↗
Figure 22
Figure 22: Loss and error histories for the nonlinear convection–diffusion–reaction problem in (32). view at source ↗
Figure 23
Figure 23: Loss and error histories for the (2+1)D Parabolic Poisson problem. view at source ↗
Figure 24
Figure 24: Loss and error histories for the 3D convection-diffusion problem. view at source ↗
read the original abstract

We study the role of numerical quadrature in residual-minimisation methods for neural network approximation of partial differential equations. We first present an abstract error framework that separates approximation, quadrature and optimisation errors, and derive a nonlinear Strang-type estimate quantifying how inaccuracies in the discrete loss affect the final approximation. Motivated by this analysis, we propose an anisotropic adaptive composite quadrature strategy that controls the relative quadrature error of the residual loss using richer reference quadratures and bisection-based refinement. We then introduce a refresh-based training methodology that rebuilds the quadrature only when an online error indicator exceeds a prescribed threshold, balancing accuracy and computational cost. Numerical experiments on a range of benchmark problems show that the proposed approach narrows the gap between training and reference losses, uses quadrature points more efficiently and delivers strong approximation accuracy relative to non-adaptive quadrature strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper presents an abstract error framework separating approximation, quadrature, and optimization errors in residual-minimization methods for neural network approximations of PDEs. It derives a nonlinear Strang-type estimate to quantify the effect of quadrature inaccuracies on the final approximation. Motivated by this, the authors propose an anisotropic adaptive composite quadrature strategy using richer reference quadratures and bisection-based refinement to control relative quadrature error, along with a refresh-based training methodology that rebuilds the quadrature only when an online error indicator exceeds a threshold. Numerical experiments on benchmark problems claim that the approach narrows the gap between training and reference losses, uses quadrature points more efficiently, and achieves strong approximation accuracy relative to non-adaptive strategies.

Significance. If the error framework and Strang-type estimate hold under the conditions of the method, and the adaptive strategy reliably detects and controls quadrature error, this could be a meaningful contribution to improving the reliability and efficiency of neural PDE solvers. The refresh mechanism offers a practical way to balance cost and accuracy, and the experimental results on benchmarks suggest potential for broader adoption if the improvements prove robust beyond the tested cases.

major comments (2)
  1. [§3] §3 (nonlinear Strang-type estimate): The central claim that the estimate quantifies how quadrature inaccuracies affect the neural approximation depends on assumptions (e.g., Lipschitz continuity or local convexity of the loss) that are not explicitly verified for the non-convex residual losses typical in PINN training. The derivation should state these assumptions clearly and include a check or counterexample showing when they hold or fail, as this underpins the motivation for the adaptive quadrature.
  2. [§5] §5 (numerical experiments): The reported improvements in narrowing the train-reference loss gap and efficient quadrature use rely on the online error indicator correlating with actual approximation degradation. The manuscript should provide more detail on how the reference quadratures are constructed, the number of independent runs, and statistical measures of improvement to confirm the gains are not problem-specific or due to particular hyperparameter choices.
minor comments (3)
  1. [§4] The notation for the anisotropic composite quadrature and the error indicator in §4 could be clarified, perhaps with a pseudocode algorithm or diagram showing the bisection refinement process.
  2. [Introduction] A few sentences in the introduction could better distinguish the proposed method from prior adaptive quadrature techniques in finite elements or other neural PDE papers to highlight novelty.
  3. [§5] Figure captions for the benchmark results should explicitly state the quadrature point counts and loss values for both adaptive and non-adaptive cases to aid direct comparison.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive report and positive assessment of the potential contribution. We address each major comment below and will incorporate the indicated revisions in the next version of the manuscript.

read point-by-point responses
  1. Referee: [§3] §3 (nonlinear Strang-type estimate): The central claim that the estimate quantifies how quadrature inaccuracies affect the neural approximation depends on assumptions (e.g., Lipschitz continuity or local convexity of the loss) that are not explicitly verified for the non-convex residual losses typical in PINN training. The derivation should state these assumptions clearly and include a check or counterexample showing when they hold or fail, as this underpins the motivation for the adaptive quadrature.

    Authors: We agree that the assumptions should be stated more explicitly. The nonlinear Strang-type estimate is derived in an abstract setting under the hypothesis that the loss is locally Lipschitz continuous with respect to quadrature perturbations; this holds for smooth residuals but need not hold globally for non-convex PINN losses. We will revise §3 to list the assumptions explicitly and add a short discussion of their local validity near approximate minima, where the loss is closer to convex (see the inequality sketched after these responses). A full counterexample lies outside the scope of the abstract framework, but we will note the conditions under which the bound may become loose. revision: yes

  2. Referee: [§5] §5 (numerical experiments): The reported improvements in narrowing the train-reference loss gap and efficient quadrature use rely on the online error indicator correlating with actual approximation degradation. The manuscript should provide more detail on how the reference quadratures are constructed, the number of independent runs, and statistical measures of improvement to confirm the gains are not problem-specific or due to particular hyperparameter choices.

    Authors: We will expand the experimental section to describe the reference quadratures in detail (high-order adaptive tensor-product Gauss-Legendre rules with bisection refinement), report all metrics as averages over 10 independent runs with distinct random seeds, and include means together with standard deviations for the training-reference loss gaps and approximation errors. These additions will demonstrate that the observed improvements are robust across initializations and not tied to specific hyperparameter choices. revision: yes
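For readers who want the shape of the argument behind the first response, the standard perturbation inequality shows how a uniform bound on the quadrature-induced loss error transfers to minimisers; this is a textbook step, not the paper's nonlinear estimate, which also handles the nonlinear dependence of the solution on the loss.

```latex
% If the discrete loss L_h stays within delta of the exact loss L on the
% region explored by training, and theta_h minimises L_h, then for any theta
%   L(theta_h) <= L_h(theta_h) + delta <= L_h(theta) + delta <= L(theta) + 2*delta.
\sup_{\theta}\bigl|\mathcal{L}_h(\theta)-\mathcal{L}(\theta)\bigr| \le \delta
\;\;\Longrightarrow\;\;
\mathcal{L}(\theta_h) \;\le\; \inf_{\theta}\mathcal{L}(\theta) + 2\delta .
```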

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper first presents an abstract error framework that separates approximation, quadrature, and optimisation errors, then derives a nonlinear Strang-type estimate from it. This analysis is positioned as independent motivation for the proposed adaptive composite quadrature and refresh-based training. The numerical experiments validate the approach on external benchmark problems rather than using fitted outcomes to define or predict the method itself. No self-citations are invoked as load-bearing uniqueness theorems, no ansatz is smuggled via prior work, and no known result is merely renamed. The derivation remains self-contained against the stated benchmarks and error analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on an abstract error framework and standard numerical-analysis assumptions about quadrature convergence; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Abstract error framework that separates approximation, quadrature, and optimisation errors
    Invoked to derive the nonlinear Strang-type estimate quantifying quadrature error impact.

pith-pipeline@v0.9.0 · 5434 in / 1224 out tokens · 49164 ms · 2026-05-09T19:32:12.114559+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

56 extracted references · 48 canonical work pages · 1 internal anchor

  1. [1]

    Interpolation error estimates in W1,p for degenerate Q1 isoparametric elements

    G. Acosta and G. Monzón. “Interpolation error estimates in W1,p for degenerate Q1 isoparametric elements”. In: Numerische Mathematik, 104.2 (2006), pp. 129–150. doi: 10.1007/s00211-006-0018-1

  2. [2]

    Learning in PINNs: Phase transition, diffusion equilibrium, and generalization

    S. J. Anagnostopoulos, J. D. Toscano, N. Stergiopulos, and G. E. Karniadakis. “Learning in PINNs: Phase transition, diffusion equilibrium, and generalization”. In: Neural Networks, 193, 107983 (2026). doi: 10.1016/j.neunet.2025.107983

  3. [3]

    Numerical Experience with a Class of Self-Scaling Quasi-Newton Algorithms

    M. Al-Baali. “Numerical Experience with a Class of Self-Scaling Quasi-Newton Algorithms”. In: Journal of Optimization Theory and Applications (1998), pp. 533–553. doi: 10.1023/A:1022608410710

  4. [4]

    Adaptive finite element interpolated neural networks

    S. Badia, W. Li, and A. F. Martín. “Adaptive finite element interpolated neural networks”. In: Computer Methods in Applied Mechanics and Engineering, 437, 117806 (2025). doi: 10.1016/j.cma.2025.117806

  5. [5]

    Compatible finite element interpolated neural networks

    S. Badia, W. Li, and A. F. Martín. “Compatible finite element interpolated neural networks”. In: Computer Methods in Applied Mechanics and Engineering, 439, 117889 (2025). doi: 10.1016/j.cma.2025.117889

  6. [6]

    Gridap: An extensible Finite Element toolbox in Julia

    S. Badia and F. Verdugo. “Gridap: An extensible Finite Element toolbox in Julia”. In: Journal of Open Source Software, 5.52, 2520 (2020). doi: 10.21105/joss.02520

  7. [7]

    Variational Physics Informed Neural Networks: the Role of Quadratures and Test Functions

    S. Berrone, C. Canuto, and M. Pintore. “Variational Physics Informed Neural Networks: the Role of Quadratures and Test Functions”. In: Journal of Scientific Computing, 92.3, 100 (2022). doi: 10.1007/s10915-022-01950-4

  8. [8]

    Effective Extensible Programming: Unleashing Julia on GPUs

    T. Besard, C. Foket, and B. De Sutter. “Effective Extensible Programming: Unleashing Julia on GPUs”. In: IEEE Transactions on Parallel and Distributed Systems (2018). doi: 10.1109/TPDS.2018.2872064. arXiv: 1712.03112 [cs.PL]

  9. [9]

    Julia: A Fresh Approach to Numerical Computing

    J. Bezanson, A. Edelman, S. Karpinski, and V. B. Shah. “Julia: A Fresh Approach to Numerical Computing”. In: SIAM Review, 59.1 (2017), pp. 65–98. doi: 10.1137/141000671

  10. [10]

    Deep least-squares methods: An unsupervised learning-based numerical method for solving elliptic PDEs

    Z. Cai, J. Chen, M. Liu, and X. Liu. “Deep least-squares methods: An unsupervised learning-based numerical method for solving elliptic PDEs”. In: Journal of Computational Physics, 420, 109707 (2020). doi: 10.1016/j.jcp.2020.109707

  11. [11]

    Makie.jl: Flexible high-performance data visualization for Julia

    S. Danisch and J. Krumbiegel. “Makie.jl: Flexible high-performance data visualization for Julia”. In: Journal of Open Source Software, 6.65 (2021), p. 3349. doi: 10.21105/joss.03349

  12. [12]

    On the approximation of functions by tanh neural networks

    T. De Ryck, S. Lanthaler, and S. Mishra. “On the approximation of functions by tanh neural networks”. In: Neural Networks, 143 (2021), pp. 732–750. doi: 10.1016/j.neunet.2021.08.015

  13. [13]

    Neural-network-based approximations for solving partial differential equations

    M. Dissanayake and N. Phan-Thien. “Neural-network-based approximations for solving partial differential equations”. In: Communications in Numerical Methods in Engineering, 10.3 (1994), pp. 195–201. doi: 10.1002/cnm.1640100303

  14. [14]

    Finite Elements II

    A. Ern and J.-L. Guermond. Finite Elements II. 1st ed. Springer Cham, 2021. doi: 10.1007/978-3-030-56923-5

  15. [15]

    Computational Math with Neural Networks is Hard

    M. Feischl and F. Zehetgruber. “Computational Math with Neural Networks is Hard”. In: arXiv pre-print repository (2025). Referred pre-print version [v1]. arXiv: 2505.17751 [math.NA]. url: https://arxiv.org/abs/2505.17751

  16. [16]

    Remarks on algorithm 006: An adaptive algorithm for numerical integration over an N-dimensional rectangular region

    A. Genz and A. Malik. “Remarks on algorithm 006: An adaptive algorithm for numerical integration over an N-dimensional rectangular region”. In: Journal of Computational and Applied Mathematics, 6.4 (1980), pp. 295–302. doi: 10.1016/0771-050X(80)90039-X

  17. [17]

    An adaptive numerical integration algorithm for simplices

    A. Genz. “An adaptive numerical integration algorithm for simplices”. In: Computing in the 90’s. Ed. by N. A. Sherwani, E. de Doncker, and J. A. Kapenga. New York, NY: Springer New York, 1991, pp. 279–285. url: https://doi.org/10.1007/BFb0038504

  18. [18]

    An adaptive numerical cubature algorithm for simplices

    A. Genz and R. Cools. “An adaptive numerical cubature algorithm for simplices”. In: ACM Trans. Math. Softw., 29.3 (2003), pp. 297–308. doi: 10.1145/838250.838254

  19. [19]

    High-Re solutions for incompressible flow using the Navier-Stokes equations and a multigrid method

    U. Ghia, K. Ghia, and C. Shin. “High-Re solutions for incompressible flow using the Navier-Stokes equations and a multigrid method”. In: Journal of Computational Physics, 48.3 (1982), pp. 387–411. doi: 10.1016/0021-9991(82)90058-4

  20. [20]

    Understanding the difficulty of training deep feedforward neural networks

    X. Glorot and Y. Bengio. “Understanding the difficulty of training deep feedforward neural networks”. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. Ed. by Y. W. Teh and M. Titterington. Vol. 9. Proceedings of Machine Learning Research. Chia Laguna Resort, Sardinia, Italy: PMLR, 2010, pp. 249–256. url: https://procee...

  21. [21]

    A Finite Element Technique for Solving First-Order PDEs in Lp

    J. L. Guermond. “A Finite Element Technique for Solving First-Order PDEs in Lp”. In: SIAM Journal on Numerical Analysis, 42.2 (2004), pp. 714–737. doi: 10.1137/S0036142902417054

  22. [22]

    Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent

    W. W. Hager and H. Zhang. “Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent”. In: ACM Trans. Math. Softw., 32.1 (2006), pp. 113–137. doi: 10.1145/1132973.1132979

  23. [23]

    Don’t Unroll Adjoint: Differentiating SSA-Form Programs

    M. Innes. “Don’t Unroll Adjoint: Differentiating SSA-Form Programs”. In: arXiv pre-print repository (2019). Referred pre-print version [v4]. url: https://doi.org/10.48550/arXiv.1810.07951

  24. [24]

    Flux: Elegant machine learning with Julia

    M. Innes. “Flux: Elegant machine learning with Julia”. In: Journal of Open Source Software, 3.25, 602 (2018). doi: 10.21105/joss.00602

  25. [25]

    The HCubature.jl package for multi-dimensional adaptive integration in Julia

    S. G. Johnson. The HCubature.jl package for multi-dimensional adaptive integration in Julia. https://github.com/JuliaMath/HCubature.jl. 2017

  26. [26]

    Optimizing the optimizer for physics-informed neural networks and Kolmogorov-Arnold networks

    E. Kiyani, K. Shukla, J. F. Urbán, J. Darbon, and G. E. Karniadakis. “Optimizing the optimizer for physics-informed neural networks and Kolmogorov-Arnold networks”. In: Computer Methods in Applied Mechanics and Engineering, 446, 118308 (2025). doi: 10.1016/j.cma.2025.118308

  27. [27]

    Characterizing possible failure modes in physics-informed neural networks

    A. S. Krishnapriyan, A. Gholami, S. Zhe, R. M. Kirby, and M. W. Mahoney. “Characterizing possible failure modes in physics-informed neural networks”. In: arXiv pre-print repository (2021). Referred pre-print version [v2]. url: https://doi.org/10.48550/arXiv.2109.01050

  28. [28]

    Artificial neural networks for solving ordinary and partial differential equations

    I. Lagaris, A. Likas, and D. Fotiadis. “Artificial neural networks for solving ordinary and partial differential equations”. In: IEEE Transactions on Neural Networks, 9.5 (1998), pp. 987–1000. doi: 10.1109/72.712178

  29. [29]

    Fourier Neural Operator for Parametric Partial Differential Equations

    Z. Li et al. “Fourier Neural Operator for Parametric Partial Differential Equations”. In: arXiv pre-print repository (2021). Referred pre-print version [v3]. url: https://arxiv.org/abs/2010.08895

  30. [30]

    Adaptive two-layer ReLU neural network: I. Best least-squares approximation

    M. Liu, Z. Cai, and J. Chen. “Adaptive two-layer ReLU neural network: I. Best least-squares approximation”. In: Computers & Mathematics with Applications, 113 (2022), pp. 34–44. doi: 10.1016/j.camwa.2022.03.005

  31. [31]

    Deep Ritz method with adaptive quadrature for linear elasticity

    M. Liu, Z. Cai, and K. Ramani. “Deep Ritz method with adaptive quadrature for linear elasticity”. In: Computer Methods in Applied Mechanics and Engineering, 415, 116229 (2023). doi: 10.1016/j.cma.2023.116229

  32. [32]

    Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators

    L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis. “Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators”. In: Nature Machine Intelligence, 3.3 (2021), pp. 218–229

  33. [33]

    Adaptive quadratures for nonlinear approximation of low-dimensional PDEs using smooth neural networks

    A. Magueresse and S. Badia. “Adaptive quadratures for nonlinear approximation of low-dimensional PDEs using smooth neural networks”. In: Computers & Mathematics with Applications, 162 (2024), pp. 1–21. doi: 10.1016/j.camwa.2024.02.041

  34. [34]

    NIST Adaptive Mesh Refinement (AMR) Benchmark Problems

    W. F. Mitchell. NIST Adaptive Mesh Refinement (AMR) Benchmark Problems. Last updated March 6, 2013

  35. [35]

    NIST Adaptive Mesh Refinement (AMR) Benchmark Problems (web page)

    National Institute of Standards and Technology (NIST). 2013. url: https://math.nist.gov/amr-benchmark/index.html (visited on 12/22/2025)

  36. [36]

    Optim: A mathematical optimization package for Julia

    P. K. Mogensen and A. N. Riseth. “Optim: A mathematical optimization package for Julia”. In: Journal of Open Source Software, 3.24, 615 (2018). doi: 10.21105/joss.00615

  37. [37]

    Quasi-Monte Carlo Integration

    W. J. Morokoff and R. E. Caflisch. “Quasi-Monte Carlo Integration”. In: Journal of Computational Physics, 122.2 (1995), pp. 218–230. doi: 10.1006/jcph.1995.1209

  38. [38]

    Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

    M. Raissi, P. Perdikaris, and G. Karniadakis. “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations”. In: Journal of Computational Physics, 378 (2019), pp. 686–707. doi: 10.1016/j.jcp.2018.10.045

  39. [39]

    On quadrature rules for solving Partial Differential Equations using Neural Networks

    J. A. Rivera, J. M. Taylor, Á. J. Omella, and D. Pardo. “On quadrature rules for solving Partial Differential Equations using Neural Networks”. In: Computer Methods in Applied Mechanics and Engineering, 393, 114710 (2022). doi: 10.1016/j.cma.2022.114710

  40. [40]

    Robust Variational Physics-Informed Neural Networks

    S. Rojas, P. Maczuga, J. Muñoz-Matute, D. Pardo, and M. Paszyński. “Robust Variational Physics-Informed Neural Networks”. In: Computer Methods in Applied Mechanics and Engineering, 425, 116904 (2024). doi: 10.1016/j.cma.2024.116904

  41. [41]

    PAGANI: a parallel adaptive GPU algorithm for numerical integration

    I. Sakiotis et al. “PAGANI: a parallel adaptive GPU algorithm for numerical integration”. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. SC ’21. St. Louis, Missouri: Association for Computing Machinery, 2021. doi: 10.1145/3458817.3476198

  42. [42]

    DGM: A deep learning algorithm for solving partial differential equations

    J. Sirignano and K. Spiliopoulos. “DGM: A deep learning algorithm for solving partial differential equations”. In: Journal of Computational Physics, 375 (2018), pp. 1339–1364. doi: 10.1016/j.jcp.2018.08.029

  43. [43]

    Adaptive Multidimensional Quadrature on Multi-GPU Systems

    M. Tonarelli, S. Riva, P. Benedusi, F. Ferrandi, and R. Krause. “Adaptive Multidimensional Quadrature on Multi-GPU Systems”. In: arXiv pre-print repository (2025). Referred pre-print version [v1]. arXiv: 2511.01573 [cs.DC]. url: https://arxiv.org/abs/2511.01573

  44. [44]

    A Variational Framework for Residual-Based Adaptivity in Neural PDE Solvers and Operator Learning

    J. D. Toscano, D. T. Chen, V. Oommen, J. Darbon, and G. E. Karniadakis. “A Variational Framework for Residual-Based Adaptivity in Neural PDE Solvers and Operator Learning”. In: arXiv pre-print repository (2025). Referred pre-print version [v2]. arXiv: 2509.14198 [cs.LG]. url: https://arxiv.org/abs/2509.14198

  45. [45]

    Unveiling the optimization process of physics informed neural networks: How accurate and competitive can PINNs be?

    J. F. Urbán, P. Stefanou, and J. A. Pons. “Unveiling the optimization process of physics informed neural networks: How accurate and competitive can PINNs be?” In: Journal of Computational Physics, 523, 113656 (2025). doi: 10.1016/j.jcp.2024.113656

  46. [46]

    An adaptive algorithm for numerical integration over an n-dimensional cube

    P. van Dooren and L. de Ridder. “An adaptive algorithm for numerical integration over an n-dimensional cube”. In: Journal of Computational and Applied Mathematics, 2.3 (1976), pp. 207–217. doi: 10.1016/0771-050X(76)90005-X

  47. [47]

    The software design of Gridap: A Finite Element package based on the Julia JIT compiler

    F. Verdugo and S. Badia. “The software design of Gridap: A Finite Element package based on the Julia JIT compiler”. In: Computer Physics Communications, 276, 108341 (2022). doi: 10.1016/j.cpc.2022.108341

  48. [48]

    Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective

    S. Wang, A. K. Bhartari, B. Li, and P. Perdikaris. “Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective”. In: The Thirty-ninth Annual Conference on Neural Information Processing Systems. 2025. url: https://openreview.net/forum?id=iweeVl1RHU

  49. [49]

    PirateNets: physics-informed deep learning with residual adaptive networks

    S. Wang, B. Li, Y. Chen, and P. Perdikaris. “PirateNets: physics-informed deep learning with residual adaptive networks”. In: J. Mach. Learn. Res., 25.1, 402 (2024). url: https://dl.acm.org/doi/10.5555/3722577.3722979

  50. [50]

    An Expert’s Guide to Training Physics-informed Neural Networks

    S. Wang, S. Sankaran, H. Wang, and P. Perdikaris. “An Expert’s Guide to Training Physics-informed Neural Networks”. In: arXiv pre-print repository (2023). Referred pre-print version [v1]. arXiv: 2308.08468 [cs.LG]. url: https://arxiv.org/abs/2308.08468

  51. [51]

    Understanding and Mitigating Gradient Flow Pathologies in Physics-Informed Neural Networks

    S. Wang, Y. Teng, and P. Perdikaris. “Understanding and Mitigating Gradient Flow Pathologies in Physics-Informed Neural Networks”. In: SIAM Journal on Scientific Computing, 43.5 (2021), A3055–A3081. doi: 10.1137/20M1318043

  52. [52]

    When and why PINNs fail to train: A neural tangent kernel perspective

    S. Wang, X. Yu, and P. Perdikaris. “When and why PINNs fail to train: A neural tangent kernel perspective”. In: Journal of Computational Physics, 449, 110768 (2022). doi: 10.1016/j.jcp.2021.110768

  53. [53]

    Solution multiplicity and effects of data and eddy viscosity on Navier-Stokes solutions inferred by physics-informed neural networks

    Z. Wang, X. Meng, X. Jiang, H. Xiang, and G. E. Karniadakis. “Solution multiplicity and effects of data and eddy viscosity on Navier-Stokes solutions inferred by physics-informed neural networks”. In: arXiv pre-print repository (2023). Referred pre-print version [v1]. arXiv: 2309.06010 [physics.flu-dyn]. url: https://arxiv.org/abs/2309.06010

  54. [54]

    Solving Allen-Cahn and Cahn-Hilliard Equations Using the Adaptive Physics Informed Neural Networks

    C. L. Wight and J. Zhao. “Solving Allen-Cahn and Cahn-Hilliard Equations Using the Adaptive Physics Informed Neural Networks”. In: Communications in Computational Physics, 29.3 (2021), pp. 930–954. doi: 10.4208/cicp.OA-2020-0086

  55. [55]

    On the identification of symmetric quadrature rules for finite element methods

    F. Witherden and P. Vincent. “On the identification of symmetric quadrature rules for finite element methods”. In: Computers & Mathematics with Applications, 69.10 (2015), pp. 1232–1241. doi: 10.1016/j.camwa.2015.03.017

  56. [56]

    A numerical algorithm for the construction of efficient quadrature rules in two and higher dimensions

    H. Xiao and Z. Gimbutas. “A numerical algorithm for the construction of efficient quadrature rules in two and higher dimensions”. In: Computers & Mathematics with Applications, 59.2 (2010), pp. 663–676. doi: 10.1016/j.camwa.2009.10.027