Frequency Bias and OOD Generalization in Neural Operators under a Variable-Coefficient Wave Equation
Pith reviewed 2026-05-14 19:43 UTC · model grok-4.3
The pith
FNO shows sharp error jumps on unseen high frequencies in wave equations while DeepONet degrades more gradually.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under structured out-of-distribution settings that vary input frequency independently of coefficient smoothness, the Fourier Neural Operator exhibits a sharp increase in error on unseen high-frequency inputs, while the Deep Operator Network shows milder degradation despite its higher baseline error. Both architectures remain stable under shifts in coefficient smoothness, with FNO achieving lower error in that regime. These performance differences trace to how each architecture represents, and responds to, variations in frequency structure.
What carries the argument
Structured OOD test settings on the variable-coefficient wave equation that separately control input frequency content and coefficient smoothness to isolate architectural biases in FNO versus DeepONet.
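As a concrete illustration, the two control knobs described above (input frequency content and coefficient smoothness) could be realized with samplers like the following. The recipes, parameter names, and ranges here are illustrative assumptions, not the paper's actual data-generation code:

```python
import numpy as np

def sample_initial_condition(x, k_max, n_modes=4, rng=None):
    """Band-limited random initial condition: a sum of sine modes with
    integer wavenumbers drawn from 1..k_max (an illustrative recipe)."""
    if rng is None:
        rng = np.random.default_rng()
    ks = rng.integers(1, k_max + 1, size=n_modes)
    amps = rng.normal(size=n_modes)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=n_modes)
    u0 = sum(a * np.sin(2.0 * np.pi * k * x + p)
             for a, k, p in zip(amps, ks, phases))
    return u0 / (np.abs(u0).max() + 1e-12)  # normalize to unit max amplitude

def sample_coefficient(x, corr_len, rng=None):
    """Smooth positive wave-speed field c(x): white noise low-pass filtered
    in Fourier space; larger corr_len means a smoother coefficient."""
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.normal(size=x.size)
    k = np.fft.rfftfreq(x.size, d=x[1] - x[0])
    smooth = np.fft.irfft(np.fft.rfft(noise) * np.exp(-0.5 * (k * corr_len) ** 2),
                          n=x.size)
    return 1.0 + 0.5 * smooth / (np.abs(smooth).max() + 1e-12)  # keeps c(x) > 0

x = np.linspace(0.0, 1.0, 256, endpoint=False)
u0_id = sample_initial_condition(x, k_max=8)    # in-distribution frequency band
u0_ood = sample_initial_condition(x, k_max=32)  # OOD: same recipe, wider band
c = sample_coefficient(x, corr_len=0.1)
```

Varying `k_max` while holding the coefficient sampler fixed gives the frequency-shift axis; varying `corr_len` while holding `k_max` fixed gives the smoothness-shift axis.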
If this is right
- FNO requires training data or architectural modifications that cover broader frequency ranges to avoid abrupt failure modes.
- DeepONet may be more suitable than FNO for wave problems where input frequencies can vary widely after deployment.
- Operator learning pipelines must explicitly account for frequency content when constructing training distributions to achieve reliable generalization.
- The choice between spectral and branch-trunk architectures directly influences robustness to frequency shifts in physics-informed operator models.
Where Pith is reading between the lines
- Similar frequency-shift tests on other PDE families such as advection or elasticity could reveal whether the observed bias pattern generalizes beyond waves.
- Explicit frequency augmentation during training might reduce the gap between in-distribution accuracy and OOD performance for spectral operators.
- The results suggest that representation bias in neural operators can be diagnosed by measuring error sensitivity to isolated input features like frequency.
- Extending the analysis to multi-dimensional or nonlinear wave equations would test whether the one-dimensional findings scale to more realistic settings.
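The frequency-augmentation idea floated above has a minimal form: inject a random high-frequency component into a fraction of training inputs. The function name, injection band, and amplitude below are hypothetical, not from the paper:

```python
import numpy as np

def augment_frequency(u0, x, p=0.5, k_band=(16, 32), amp=0.1, rng=None):
    """With probability p, add one random high-frequency sine mode to a
    training input (a hypothetical augmentation, not the paper's method)."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() >= p:
        return u0  # leave the sample unchanged
    k = rng.integers(k_band[0], k_band[1] + 1)
    phase = rng.uniform(0.0, 2.0 * np.pi)
    return u0 + amp * np.sin(2.0 * np.pi * k * x + phase)

x = np.linspace(0.0, 1.0, 128, endpoint=False)
u0 = np.sin(2.0 * np.pi * 2 * x)
u0_aug = augment_frequency(u0, x, p=1.0, rng=np.random.default_rng(0))
```

Whether such augmentation closes the gap for spectral operators is exactly the kind of follow-up the paper's findings invite.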
Load-bearing premise
That independently varying input frequency and coefficient smoothness produces distribution shifts representative of those encountered in practical PDE applications.
What would settle it
A controlled experiment that trains FNO on a dataset spanning the full target frequency range and then measures whether the sharp error increase on the highest frequencies disappears.
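The proposed experiment reduces to sweeping test wavenumber and plotting error. A sketch of such a sweep, assuming stand-in `model` and `solver` callables and the relative L2 norm commonly reported in operator learning:

```python
import numpy as np

def rel_l2(pred, true):
    """Relative L2 error, the norm most operator-learning papers report."""
    return np.linalg.norm(pred - true) / (np.linalg.norm(true) + 1e-12)

def error_vs_frequency(model, solver, x, k_values, n_per_k=16, rng=None):
    """Mean relative L2 error at each test wavenumber k. `model` maps an
    initial condition to a predicted terminal solution; `solver` is the
    reference numerical solver. Both are stand-in callables here."""
    if rng is None:
        rng = np.random.default_rng(0)
    curve = {}
    for k in k_values:
        errs = []
        for _ in range(n_per_k):
            phase = rng.uniform(0.0, 2.0 * np.pi)
            u0 = np.sin(2.0 * np.pi * k * x + phase)
            errs.append(rel_l2(model(u0), solver(u0)))
        curve[k] = float(np.mean(errs))
    return curve

# Toy check: a model that damps every output by half has a flat error curve
# of 0.5 at all frequencies; a sharp jump past the training band is the
# signature the paper attributes to FNO.
x = np.linspace(0.0, 1.0, 64, endpoint=False)
curve = error_vs_frequency(lambda u: 0.5 * u, lambda u: u, x, [2, 8, 24])
```

Running the same sweep for FNOs trained on progressively wider frequency ranges would show directly whether the error spike tracks training coverage.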
Original abstract
Neural operators learn to map initial conditions to the terminal solution of partial differential equations (PDEs), providing a surrogate for the full operator mapping. This enables rapid prediction across different input configurations. While recent neural operator architectures have demonstrated strong performance on diverse PDE tasks, their behavior under structured distribution shifts remains insufficiently understood. To investigate this, we study operator learning in a wave propagation setting governed by a one-dimensional variable-coefficient wave equation, using two representative architectures, the Fourier Neural Operator (FNO) and the Deep Operator Network (DeepONet). To examine their generalization under distribution shifts, we consider structured out-of-distribution (OOD) settings that independently vary input frequency and coefficient smoothness. The results show that under smoothness shifts, both models maintain stable performance, with FNO achieving lower error. In contrast, under frequency shifts, FNO exhibits a sharp increase in error under unseen high-frequency inputs, whereas DeepONet shows milder degradation despite higher overall error. Our analysis reveals that these differences arise from how each architecture represents and responds to variations in frequency structure. Together, these findings highlight a fundamental gap between strong in-distribution performance and generalization under distribution shifts in operator learning, underscoring the role of architectural representation bias in developing more reliable neural operators for physics-based PDE simulations beyond the training distribution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper examines out-of-distribution generalization of neural operators (FNO and DeepONet) for mapping initial conditions to terminal solutions of a 1D variable-coefficient wave equation. It considers structured shifts that independently vary input frequency and coefficient smoothness, reporting that both architectures remain stable under smoothness shifts (with FNO lower error), while FNO shows a sharp error increase on unseen high-frequency inputs and DeepONet degrades more mildly despite higher baseline error; the differences are attributed to architectural differences in frequency representation.
Significance. If the central empirical distinction holds after addressing potential confounds, the work would usefully illustrate how operator architectures encode frequency structure and why in-distribution performance does not guarantee robustness under realistic PDE distribution shifts. This is relevant for surrogate modeling in physics applications where input spectra can vary.
major comments (2)
- [Results / Experimental Setup] The claim that observed error differences arise from intrinsic frequency-representation bias (abstract and results) is load-bearing for the central conclusion, yet the experimental design does not isolate frequency shifts from coefficient interactions. In the variable-coefficient wave equation, high-frequency initial conditions can generate solution components whose local wavelengths couple to spatial coefficient variations; keeping the marginal coefficient distribution fixed does not automatically remove this interaction. No coefficient-conditioned error curves, residual spectral analysis, or ablation that holds coefficient smoothness fixed while varying frequency content are reported to rule out the alternative explanation that FNO's Fourier modes are more sensitive to imperfect coefficient handling under high-frequency excitation.
- [Abstract and Experiments] The abstract and experimental sections supply no training details (optimizer, learning-rate schedule, number of epochs, batch size), error metrics (L2, relative L2, pointwise), statistical tests, or data-exclusion rules. This leaves the reported sharp error increase for FNO and the milder degradation for DeepONet only partially supported and difficult to reproduce or compare with prior operator-learning benchmarks.
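The residual spectral analysis requested in the first comment is cheap to run. A sketch, assuming gridded 1D predictions (the synthetic example below is illustrative):

```python
import numpy as np

def residual_spectrum(pred, true):
    """Normalized energy of the prediction error per wavenumber,
    |FFT(pred - true)|^2. Mass concentrated at high wavenumbers would
    indicate the error lives in the high-frequency modes."""
    spec = np.abs(np.fft.rfft(pred - true)) ** 2
    return spec / (spec.sum() + 1e-12)  # a distribution over modes

# Synthetic example: the "prediction" misses only a wavenumber-20 component,
# so the residual spectrum should peak at bin 20.
x = np.linspace(0.0, 1.0, 128, endpoint=False)
u_true = np.sin(2.0 * np.pi * 3 * x)
u_pred = u_true + 0.1 * np.sin(2.0 * np.pi * 20 * x)
p = residual_spectrum(u_pred, u_true)
```

Conditioning this spectrum on coefficient smoothness would separate the representation-bias explanation from the coefficient-interaction alternative.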
minor comments (2)
- [Methods] Notation for the variable-coefficient wave equation (e.g., the precise form of the coefficient function and boundary conditions) should be stated explicitly in the methods section rather than assumed from the abstract.
- [Figures] Figure captions and axis labels for error-vs-frequency plots should include the exact definition of the error norm and the range of frequencies used in training versus OOD test sets.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and describe the revisions that will be incorporated.
Point-by-point responses
Referee: [Results / Experimental Setup] The claim that observed error differences arise from intrinsic frequency-representation bias (abstract and results) is load-bearing for the central conclusion, yet the experimental design does not isolate frequency shifts from coefficient interactions. In the variable-coefficient wave equation, high-frequency initial conditions can generate solution components whose local wavelengths couple to spatial coefficient variations; keeping the marginal coefficient distribution fixed does not automatically remove this interaction. No coefficient-conditioned error curves, residual spectral analysis, or ablation that holds coefficient smoothness fixed while varying frequency content are reported to rule out the alternative explanation that FNO's Fourier modes are more sensitive to imperfect coefficient handling under high-frequency excitation.
Authors: We appreciate the referee's observation on possible residual interactions. Our design independently varies frequency content and coefficient smoothness while holding marginal coefficient distributions fixed, and the resulting error patterns are consistent with known architectural properties (FNO's global Fourier basis versus DeepONet's local branch-trunk structure). Nevertheless, we agree that explicit isolation would strengthen the attribution. In the revision we will add coefficient-conditioned error curves, residual spectral analysis of the solutions, and an ablation that holds coefficient smoothness fixed while sweeping frequency content. These additions will directly address the alternative explanation and better support the architectural-bias interpretation. revision: partial
Referee: [Abstract and Experiments] The abstract and experimental sections supply no training details (optimizer, learning-rate schedule, number of epochs, batch size), error metrics (L2, relative L2, pointwise), statistical tests, or data-exclusion rules. This leaves the reported sharp error increase for FNO and the milder degradation for DeepONet only partially supported and difficult to reproduce or compare with prior operator-learning benchmarks.
Authors: We agree that these implementation details are necessary for reproducibility and fair comparison. The revised manuscript will expand the experimental section to report the optimizer, learning-rate schedule, number of epochs, batch size, the precise error metric (relative L2 norm), results of statistical tests across multiple random seeds, and the data-exclusion criteria. These additions will make the reported OOD trends fully supported and directly comparable to existing neural-operator benchmarks. revision: yes
Circularity Check
No circularity: purely empirical comparison with no derivations or self-referential predictions
Full rationale
The paper conducts an empirical study training FNO and DeepONet on solutions to the 1D variable-coefficient wave equation and measuring test error under controlled shifts in input frequency and coefficient smoothness. No mathematical derivations, uniqueness theorems, ansatzes, or predictions are claimed; results are reported directly from numerical experiments. The central observations (FNO error spike under high-frequency OOD inputs, milder DeepONet degradation) are data-driven comparisons, not quantities that reduce by construction to fitted parameters or self-citations. No load-bearing steps match any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The one-dimensional variable-coefficient wave equation serves as a representative testbed for studying operator-learning generalization under structured distribution shifts.