pith. machine review for the scientific record.

arxiv: 2604.28180 · v1 · submitted 2026-04-30 · 💻 cs.LG

Recognition: unknown

An adaptive wavelet-based PINN for problems with localized high-magnitude source


Pith reviewed 2026-05-07 07:06 UTC · model grok-4.3

classification 💻 cs.LG
keywords: adaptive wavelet PINN · physics-informed neural networks · localized source terms · loss imbalance · Gaussian process limit · neural tangent kernel · PDE solving · multiscale phenomena

The pith

AW-PINN dynamically adjusts wavelet bases to solve PDEs with localized high-magnitude sources despite loss imbalances up to 10^10:1

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard physics-informed neural networks struggle with differential equations containing sharp localized high-magnitude sources because the loss function becomes severely imbalanced, with one small region dominating training by factors as large as 10^10 to 1. AW-PINN counters this through a two-stage process: a short pre-training phase with fixed wavelet bases to select suitable families, followed by adaptive refinement of scales and translations driven by both residual and supervised losses. The framework computes required derivatives without automatic differentiation and avoids populating high-resolution bases over the entire domain, keeping memory use manageable. If the approach works as described, it enables reliable solutions for physical problems such as transient heat conduction with point sources, localized Poisson equations, oscillatory flows, and Maxwell equations with point charges. The paper also shows that under certain assumptions the method admits a Gaussian process limit with an associated neural tangent kernel structure.
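
To make the scale of that imbalance concrete, consider a generic composite PINN loss, the standard residual-plus-boundary form (assumed here for illustration; the paper's exact decomposition is not reproduced in this summary):

    % Generic composite PINN loss for N[u] = f on the domain, B[u] = g on the boundary.
    \mathcal{L}(\theta)
      = \underbrace{\frac{1}{N_r}\sum_{i=1}^{N_r} \bigl|\mathcal{N}[u_\theta](x_i) - f(x_i)\bigr|^2}_{\text{residual loss}}
      + \underbrace{\frac{1}{N_b}\sum_{j=1}^{N_b} \bigl|\mathcal{B}[u_\theta](x_j) - g(x_j)\bigr|^2}_{\text{boundary loss}}

If the source f scales like 1/ε over a region of width ε while the boundary data stay O(1), the squared residual term can exceed the boundary term by roughly 1/ε², so ε ≈ 10^-5 already produces a ratio on the order of the quoted 10^10:1.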

Core claim

The paper establishes that an adaptive wavelet-based PINN (AW-PINN) solves differential equations with localized high-magnitude source terms by dynamically adjusting wavelet basis functions according to residual and supervised loss values. The process starts with brief pre-training on fixed bases to identify relevant wavelet families, then refines scales and translations adaptively without filling the full domain at high resolution. Derivatives enter the loss without automatic differentiation, which speeds training. Under stated assumptions the network admits a Gaussian process limit whose neural tangent kernel structure is derived explicitly. Tests on transient heat conduction, highly localized Poisson problems, oscillatory flow equations, and Maxwell equations with a point charge source report consistent outperformance over existing methods in its class.
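
The selection and refinement rules are not spelled out in the material summarized here, so the following is a minimal runnable sketch of the two-stage idea on a 1D localized Poisson problem, with a Gaussian atom standing in for the paper's mother wavelet and a largest-residual refinement rule that is purely hypothetical:

    # Two-stage sketch: fixed dyadic pre-training, then residual-driven
    # refinement. The atom, the refinement rule, and the least-squares
    # fit are illustrative assumptions, not the authors' implementation.
    import numpy as np

    def atom(x, s, t):
        # Gaussian atom with scale s and translation t (stand-in for the
        # paper's mother wavelet in its Equation (4)).
        return np.exp(-0.5 * (s * (x - t)) ** 2)

    def atom_dxx(x, s, t):
        # Closed-form second derivative: no automatic differentiation.
        z = s * (x - t)
        return (s ** 2) * (z ** 2 - 1.0) * np.exp(-0.5 * z ** 2)

    eps = 1e-2
    f = lambda x: np.exp(-0.5 * ((x - 0.5) / eps) ** 2) / eps  # localized source
    xs = np.linspace(0.0, 1.0, 400)                            # collocation grid

    def fit(scales, translations):
        # Least-squares fit of -u'' = f over the current basis (boundary
        # terms omitted for brevity); basis derivatives are analytic.
        A = np.stack([-atom_dxx(xs, s, t)
                      for s, t in zip(scales, translations)], axis=1)
        c, *_ = np.linalg.lstsq(A, f(xs), rcond=None)
        return c, A @ c - f(xs)

    # Stage 1: short pre-training on a fixed dyadic lattice of scales and
    # translations, standing in for fixed-basis family selection.
    scales = [2.0 ** j for j in range(6) for _ in range(9)]
    translations = [t for _ in range(6) for t in np.linspace(0.0, 1.0, 9)]
    c, residual = fit(scales, translations)

    # Stage 2: adaptive refinement -- add finer atoms only where the
    # residual is largest, rather than filling the whole domain.
    for _ in range(5):
        x_star = float(xs[np.argmax(np.abs(residual))])  # worst point
        scales.append(2.0 * max(scales))                 # hypothetical rule
        translations.append(x_star)
        c, residual = fit(scales, translations)
        print(f"atoms={len(scales)}  max|residual|={np.abs(residual).max():.3e}")

    # Reconstruct the prediction from the adapted basis.
    u_pred = sum(ci * atom(xs, s, t)
                 for ci, s, t in zip(c, scales, translations))

Because the basis derivatives are closed-form, the residual is assembled without automatic differentiation, the same mechanism the paper credits for its training speedup.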

What carries the argument

Dynamic adjustment of wavelet basis functions driven by residual and supervised losses in a two-stage pre-training plus adaptive refinement process, without automatic differentiation for derivatives

If this is right

  • Enables accurate simulation of physical systems with extreme source localization in thermal processing, electromagnetics, impact mechanics, and fluid dynamics
  • Accelerates training by removing automatic differentiation from the loss computation
  • Keeps memory costs low by refining bases only where residuals indicate need rather than across the whole domain
  • Supplies a Gaussian process equivalence that may allow kernel-based analysis of training dynamics in similar hybrid networks
  • Extends the range of solvable PDEs to cases where loss imbalance previously prevented convergence

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same two-stage adaptation idea could be applied to other adaptive bases such as radial basis functions for different classes of PDEs
  • If the Gaussian process limit proves robust, it could support uncertainty quantification for wavelet-augmented PINNs in engineering applications
  • Combining the adaptive wavelet step with importance sampling or curriculum learning might further improve performance on time-dependent or high-dimensional problems
  • The memory-efficiency claim could be tested directly by measuring peak GPU usage on three-dimensional versions of the reported examples, as sketched after this list
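
A minimal version of that measurement, assuming a PyTorch implementation; train_aw_pinn is a hypothetical stand-in for whichever training routine is under test:

    # Peak GPU memory during a training run, via PyTorch's allocator
    # counters. `train_aw_pinn` is hypothetical; any callable works.
    import torch

    def peak_gpu_memory_mb(train_fn, *args, **kwargs):
        torch.cuda.reset_peak_memory_stats()   # clear the high-water mark
        train_fn(*args, **kwargs)              # run the training routine
        return torch.cuda.max_memory_allocated() / 2**20

    # e.g. peak_gpu_memory_mb(train_aw_pinn, problem="poisson_3d")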

Load-bearing premise

The dynamic adjustment of wavelet basis functions based on residual and supervised loss effectively captures high-scale localized features without excessive memory cost, and the assumptions needed for the Gaussian process limit hold for the tested PDEs

What would settle it

On one of the reported PDEs with a 10^10:1 loss imbalance, such as the localized Poisson problem, if AW-PINN shows no gain in accuracy or speed over standard PINNs or requires memory comparable to a uniform high-resolution grid, the claimed practical advantage is falsified
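
The accuracy half of that test would presumably be scored with the relative L2 error the figures report; the standard definition, assumed here, is a one-liner:

    # Relative L2 error over an evaluation grid (standard definition).
    import numpy as np

    def relative_l2_error(u_pred, u_exact):
        return np.linalg.norm(u_pred - u_exact) / np.linalg.norm(u_exact)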

Figures

Figures reproduced from arXiv: 2604.28180 by Himanshu Pandey, Ratikanta Behera.

Figure 3.1: Schematic architecture of AW-PINN. Parameters for the purple wavelet units and the final linear layer are initialized using I_adapt and c_adapt, respectively; ψ denotes the wavelet activation function, with ψ_{i,m} = ψ(w_{i,m} x_m + b_{i,m}), and ⊗ represents element-wise multiplication. Here ψ(·) is the same mother wavelet used in Equation (4) of the paper.
Figure 4.2: Left to right: the heat source function, the AW-PINN prediction, and the corresponding absolute point-wise error for problem 4.1 with ε = 0.1.
Figure 4.3: Left: average log-loss for each loss term in problem 4.1 with ε = 0.12 using AW-PINN; shaded regions denote the standard deviation across 10 independent runs, and the vertical dashed line marks the completion of pre-training with W-PINN after 1000 Adam iterations. Right: log relative L2-error of different methods for problem 4.1 with ε = 0.12; the vertical dashed line indicates the end of pre-training.
Figure 4.4: Top, left to right: the source function, exact solution, and AW-PINN prediction for problem 4.2 with ε = 0.02. Bottom, left to right: absolute point-wise error of W-PINN, MMPINN, and AW-PINN, respectively.
Figure 4.5: Plot of x-scale (left) and y-scale (right) adaptation for problem 4.2. Solid lines denote dyadic scales; dashed lines represent scales after adaptation.
Figure 4.6: Top, left to right: the source function, the AW-PINN prediction, and the corresponding absolute point-wise error for problem 4.3. Bottom: cross-section comparison of the prediction with the exact solution at various x-domain snapshots.
Figure 4.7: The electromagnetic field prediction using AW-PINN for the TEz Maxwell equations (problem 4.4) in the top panel, with the respective point-wise absolute errors in the bottom panel.
original abstract

In recent years, physics-informed neural networks (PINNs) have gained significant attention for solving differential equations, although they suffer from two fundamental limitations, namely, spectral bias inherent in neural networks and loss imbalance arising from multiscale phenomena. This paper proposes an adaptive wavelet-based PINN (AW-PINN) to address the extreme loss imbalance characteristic of problems with localized high-magnitude source terms. Such problems frequently arise in various physical applications, such as thermal processing, electro-magnetics, impact mechanics, and fluid dynamics involving localized forcing. The proposed framework dynamically adjusts the wavelet basis function based on residual and supervised loss. This adaptive nature makes AW-PINN handle problems with high-scale features effectively without being memory-intensive. Additionally, AW-PINN does not rely on automatic differentiation to obtain derivatives involved in the loss function, which accelerates the training process. The method operates in two stages, an initial short pre-training phase with fixed bases to select physically relevant wavelet families, followed by an adaptive refinement that adapts scales and translations without populating high-resolution bases across entire domains. Theoretically, we show that under certain assumptions, AW-PINN admits a Gaussian process limit and derive its associated NTK structure. We evaluate AW-PINN on several challenging PDEs featuring localized high-magnitude source terms with extreme loss imbalances having ratios up to $10^{10}:1$. Across these PDEs, including transient heat conduction, highly localized Poisson problems, oscillatory flow equations, and Maxwell equations with a point charge source, AW-PINN consistently outperforms existing methods in its class.

Editorial analysis

A structured set of objections, weighed in public.

Referee report, simulated authors' rebuttal, circularity check, and an axiom and free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes an adaptive wavelet-based PINN (AW-PINN) to solve PDEs featuring localized high-magnitude source terms that induce extreme loss imbalances (up to 10^{10}:1). The method uses a two-stage procedure: a short pre-training phase with fixed wavelet bases to select relevant families, followed by dynamic adaptation of scales and translations driven by residual and supervised losses. It avoids automatic differentiation for computing derivatives in the loss and claims to handle high-scale localized features without excessive memory cost. Theoretically, under stated assumptions, AW-PINN is shown to admit a Gaussian process limit with an associated NTK structure. Experiments on transient heat conduction, highly localized Poisson problems, oscillatory flows, and Maxwell equations with point sources report consistent outperformance over existing PINN variants.

Significance. If the empirical gains are reproducible and the theoretical analysis is made consistent with the adaptive procedure, the work would meaningfully extend PINN capabilities to multiscale problems with sharp, localized features common in thermal, electromagnetic, and fluid applications. The two-stage wavelet adaptation offers a practical route to memory efficiency and the avoidance of AD provides a computational speedup. A valid GP/NTK characterization could supply insight into training dynamics, though the adaptive component raises questions about whether standard infinite-width analyses apply directly.
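
For reference, the kernel at issue is the empirical neural tangent kernel, the inner product of parameter gradients; the standard infinite-width result makes it deterministic at initialization and fixed during training, which is precisely the property a loss-dependent adaptive basis puts in question:

    % Empirical NTK of a network f(x; theta). The classical limit assumes
    % a fixed architecture whose width tends to infinity before training.
    \Theta(x, x') = \bigl\langle \nabla_\theta f(x; \theta),\, \nabla_\theta f(x'; \theta) \bigr\rangle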

major comments (2)
  1. [Theoretical analysis] Theoretical section (GP limit and NTK derivation): Standard NTK analysis requires a fixed architecture whose width tends to infinity before training, yielding a data-independent kernel. The manuscript describes an adaptive stage that selects scales and translations on the basis of the current residual and supervised losses after the fixed-basis pre-training. For the reported PDEs with loss ratios up to 10^{10}:1, this adaptation is essential to the claimed performance. The derivation therefore cannot be invoked to explain or guarantee behavior of the final adaptive model unless the paper either (a) restricts the GP/NTK claim to the pre-training phase only or (b) extends the analysis to data-dependent, time-varying bases. The current statement of assumptions does not address this mismatch.
  2. [Experiments] Experimental evaluation: The abstract asserts consistent outperformance across multiple PDEs, yet the provided summary supplies no quantitative metrics, error bars, ablation studies on the adaptation rules, or implementation details (e.g., exact criteria for scale/translation selection, memory scaling with adaptation). Without these, it is impossible to verify whether the reported gains are robust or sensitive to post-hoc choices in the adaptation heuristics.
minor comments (1)
  1. [Method] The description of how the initial wavelet family is chosen and how adaptation rules are implemented should be expanded with pseudocode or explicit equations to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments highlight important points regarding the scope of the theoretical analysis and the need for greater experimental detail. We address each major comment below and will revise the manuscript accordingly to improve clarity and rigor.

point-by-point responses
  1. Referee: [Theoretical analysis] Theoretical section (GP limit and NTK derivation): Standard NTK analysis requires a fixed architecture whose width tends to infinity before training, yielding a data-independent kernel. The manuscript describes an adaptive stage that selects scales and translations on the basis of the current residual and supervised losses after the fixed-basis pre-training. For the reported PDEs with loss ratios up to 10^{10}:1, this adaptation is essential to the claimed performance. The derivation therefore cannot be invoked to explain or guarantee behavior of the final adaptive model unless the paper either (a) restricts the GP/NTK claim to the pre-training phase only or (b) extends the analysis to data-dependent, time-varying bases. The current statement of assumptions does not address this mismatch.

    Authors: We agree that standard NTK/GP limits assume a fixed architecture. Our derivation is performed under the assumption of a fixed wavelet basis during the infinite-width limit, which aligns with the pre-training phase. The adaptive stage is a practical, finite-width heuristic driven by loss-based selection of scales and translations. In the revision, we will explicitly restrict the GP limit and associated NTK claims to the pre-training phase only, and we will add a discussion clarifying that the adaptive procedure lies outside the current theoretical framework. This resolves the mismatch without requiring an extension of the analysis to time-varying bases. revision: yes

  2. Referee: [Experiments] Experimental evaluation: The abstract asserts consistent outperformance across multiple PDEs, yet the provided summary supplies no quantitative metrics, error bars, ablation studies on the adaptation rules, or implementation details (e.g., exact criteria for scale/translation selection, memory scaling with adaptation). Without these, it is impossible to verify whether the reported gains are robust or sensitive to post-hoc choices in the adaptation heuristics.

    Authors: The manuscript reports numerical comparisons on the listed PDEs, but we acknowledge that additional quantitative details and reproducibility information are required. In the revised version we will add tables of relative L2 errors and training times with standard deviations computed over multiple independent runs, ablation studies on the loss-threshold criteria and wavelet-selection rules, and explicit pseudocode for the adaptation procedure. We will also include scaling plots of memory usage versus number of adapted wavelets and discuss sensitivity to the adaptation hyperparameters. revision: yes

Circularity Check

0 steps flagged

No significant circularity; NTK result presented as separate theoretical claim under assumptions

full rationale

The paper describes a two-stage procedure (fixed-basis pre-training followed by adaptive wavelet refinement) and separately states that 'under certain assumptions, AW-PINN admits a Gaussian process limit and derive its associated NTK structure.' No equation is shown to reduce by construction to a fitted quantity, no self-citation is invoked as the sole justification for a uniqueness theorem, and the adaptive mechanism is not redefined in terms of the NTK itself. The derivation chain therefore remains self-contained against external benchmarks; any mismatch between fixed-basis NTK assumptions and the final adaptive model is a question of applicability rather than circularity.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on the effectiveness of loss-driven wavelet adaptation and the validity of the assumptions for the Gaussian process limit. The abstract does not quantify specific hyperparameters, but the adaptive mechanism and family selection during pre-training function as free parameters whose values are determined during training.

free parameters (2)
  • Initial wavelet family selection
    Chosen during the short pre-training phase based on residual and supervised loss; exact families and selection criteria are not specified in the abstract.
  • Adaptation rules for scales and translations
    Dynamically adjusted without populating high-resolution bases everywhere; the precise update mechanism and thresholds are not detailed.
axioms (1)
  • domain assumption: Under certain assumptions, AW-PINN admits a Gaussian process limit and associated NTK structure
    Stated directly in the abstract as the basis for the theoretical analysis.

pith-pipeline@v0.9.0 · 5579 in / 1572 out tokens · 113502 ms · 2026-05-07T07:06:33.615277+00:00 · methodology


Reference graph

Works this paper leans on

29 extracted references · 21 canonical work pages · 1 internal anchor

  1. [1]

    A. G. Baydin, B. A. Pearlmutter, A. A. Radul, J. M. Siskind, Automatic differentiation in machine learning: a survey, J. Mach. Learn. Res. 18 (153) (2018) 1–43. URL http://jmlr.org/papers/v18/17-468.html

  2. [2]

    A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, S. Chintala, PyTorch: An imperative style, high-performance deep learning library, in: Advances in Neural Information Processing Syste...

  3. [3]

    M. Raissi, P. Perdikaris, G. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys. 378 (2019) 686–707. doi:10.1016/j.jcp.2018.10.045

  4. [4]

    Z. Hu, K. Shukla, G. E. Karniadakis, K. Kawaguchi, Tackling the curse of dimensionality with physics-informed neural networks, Neural Networks 176 (2024) 106369. doi:10.1016/j.neunet.2024.106369

  5. [5]

    F. Sahli Costabal, S. Pezzuto, P. Perdikaris, Δ-PINNs: Physics-informed neural networks on complex geometries, Eng. Appl. Artif. Intell. 127 (2024) 107324. doi:10.1016/j.engappai.2023.107324

  6. [6]

    L. Yuan, Y.-Q. Ni, X.-Y. Deng, S. Hao, A-PINN: Auxiliary physics informed neural networks for forward and inverse problems of nonlinear integro-differential equations, J. Comput. Phys. 462 (2022) 111260. doi:10.1016/j.jcp.2022.111260

  7. [7]

    D. Zhang, L. Guo, G. E. Karniadakis, Learning in modal space: Solving time-dependent stochastic PDEs using physics-informed neural networks, SIAM J. Sci. Comput. 42 (2) (2020) A639–A665. doi:10.1137/19M1260141

  8. [8]

    Tushar, S. Chakraborty, Deep physics corrector: A physics enhanced deep learning architecture for solving stochastic differential equations, J. Comput. Phys. 479 (2023) 112004. doi:10.1016/j.jcp.2023.112004

  9. [9]

    G. Pang, L. Lu, G. E. Karniadakis, fPINNs: Fractional physics-informed neural networks, SIAM J. Sci. Comput. 41 (4) (2019) A2603–A2626. doi:10.1137/18M1229845

  10. [10]

    S. S. M., P. Kumar, V. Govindaraj, A novel optimization-based physics-informed neural network scheme for solving fractional differential equations, Engineering with Computers 40 (2) (2024) 855–. doi:10.1007/s00366-023-01830-x

  12. [12]

    S. Cuomo, V. S. D. Cola, F. Giampaolo, G. Rozza, M. Raissi, F. Piccialli, Scientific machine learning through physics-informed neural networks: Where we are and what's next, J. Sci. Comput. 92 (3) (2022) 88. doi:10.1007/s10915-022-01939-z

  13. [13]

    W. Zhang, W. Suo, J. Song, W. Cao, Physics informed neural networks (PINNs) as intelligent computing technique for solving partial differential equations: Limitation and future prospects, arXiv preprint arXiv:2411.18240 (2024). doi:10.48550/arXiv.2411.18240

  14. [14]

    C. Zhao, F. Zhang, W. Lou, X. Wang, J. Yang, A comprehensive review of advances in physics-informed neural networks and their applications in complex fluid dynamics, Phys. Fluids 36 (10) (2024) 101301. doi:10.1063/5.0226562

  15. [15]

    N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, A. Courville, On the spectral bias of neural networks, in: K. Chaudhuri, R. Salakhutdinov (Eds.), Proceedings of the 36th International Conference on Machine Learning, Vol. 97 of Proceedings of Machine Learning Research, PMLR, 2019, pp. 5301–5310. URL https://proceedings.ml...

  16. [16]

    Z.-Q. J. Xu, Y. Zhang, T. Luo, Overview frequency principle/spectral bias in deep learning, Commun. Appl. Math. Comput. 7 (3) (2025) 827–864. doi:10.1007/s42967-024-00398-7

  17. [17]

    A. Jacot, F. Gabriel, C. Hongler, Neural tangent kernel: Convergence and generalization in neural networks, in: Advances in Neural Information Processing Systems, Vol. 31, Curran Associates, Inc., 2018. URL https://proceedings.neurips.cc/paper_files/paper/2018/file/5a4be1fa34e62bb8a6ec6b91d2462f5a-Paper.pdf

  18. [18]

    S. Wang, X. Yu, P. Perdikaris, When and why PINNs fail to train: A neural tangent kernel perspective, J. Comput. Phys. 449 (2022) 110768. doi:10.1016/j.jcp.2021.110768

  19. [19]

    S. Wang, Y. Teng, P. Perdikaris, Understanding and mitigating gradient flow pathologies in physics-informed neural networks, SIAM J. Sci. Comput. 43 (5) (2021) A3055–A3081. doi:10.1137/20M1318043

  20. [20]

    M. Tancik, P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. Barron, R. Ng, Fourier features let networks learn high frequency functions in low dimensional domains, in: H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, H. Lin (Eds.), Advances in Neural Information Processing Systems, Vol. 33, Curran Associates...

  21. [21]

    S. Wang, H. Wang, P. Perdikaris, On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks, Comput. Methods Appl. Mech. Eng. 384 (2021) 113938. doi:10.1016/j.cma.2021.113938. URL https://www.sciencedirect.com/science/article/pii/S0045782521002759

  22. [22]

    Y. Liu, H. Gu, X. Yu, P. Qin, Diminishing spectral bias in physics-informed neural networks using spatially-adaptive Fourier feature encoding, Neural Networks 182 (2025) 106886. doi:10.1016/j.neunet.2024.106886. URL https://www.sciencedirect.com/science/article/pii/S0893608024008153

  23. [23]

    L. D. McClenny, U. M. Braga-Neto, Self-adaptive physics-informed neural networks, J. Comput. Phys. 474 (2023) 111722. doi:10.1016/j.jcp.2022.111722

  24. [24]

    Y. Wang, Y. Yao, J. Guo, Z. Gao, A practical PINN framework for multi-scale problems with multi-magnitude loss terms, J. Comput. Phys. 510 (2024) 113112. doi:10.1016/j.jcp.2024.113112

  25. [25]

    R. Bischof, M. A. Kraus, Multi-objective loss balancing for physics-informed deep learning, Comput. Methods Appl. Mech. Engrg. 439 (2025) 117914. doi:10.1016/j.cma.2025.117914

  26. [26]

    H. Pandey, A. Singh, R. Behera, An efficient wavelet-based physics-informed neural networks for singularly perturbed problems (2025). arXiv:2409.11847. URL https://arxiv.org/abs/2409.11847

  27. [27]

    D. P. Kingma, J. Ba, Adam: A method for stochastic optimization (2017). arXiv:1412.6980. URL https://arxiv.org/abs/1412.6980

  28. [28]

    J. Nocedal, Updating quasi-Newton matrices with limited storage, Math. Comp. 35 (151) (1980) 773–782. URL https://courses.grainger.illinois.edu/ece544na/fa2014/nocedal80.pdf

  29. [29]

    X. Glorot, Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, in: Y. W. Teh, M. Titterington (Eds.), Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Vol. 9 of Proceedings of Machine Learning Research, PMLR, Chia Laguna Resort, Sardinia, Italy, 2010, pp. 249–256. URL https...