McMg: A Learned Phase-Space Multi-channel Multigrid Preconditioner for Helmholtz Equation
Pith reviewed 2026-06-30 04:39 UTC · model grok-4.3
The pith
McMg retains unresolved wave phase and direction in extra channels during coarsening to precondition high-wavenumber Helmholtz equations with fewer iterations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
McMg coarsens physical space while retaining unresolved local wave information in the channel dimension, so that each coarse node carries a learned packet of amplitude, phase, direction, and scattering coefficients rather than a single scalar. The architecture uses linear multi-channel transfer operators together with locally adaptive stencils and medium-dependent smoothers; nonlinear physical features are cached once, after which each iteration is linear in the residual. Layer-by-Layer Progressive Finetuning (LLPF) extends the learned operator to larger domains and higher effective wavenumbers by adding and tuning only new coarse levels.
What carries the argument
Multi-channel transfer operators that coarsen space while preserving phase-space wave packets in the channel dimension.
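A hand-built, non-learned analogue can make this concrete. The sketch below assumes a 1D plane wave sampled at 8 points per wavelength and coarsened by a factor of 8. The names `block_average` and `multichannel_restrict`, and the choice of two fixed demodulation wavenumbers, are illustrative assumptions and are not the paper's learned operators. Scalar averaging cancels the wave entirely, while a two-channel restriction that removes the carrier phase before averaging keeps its amplitude and direction.

```python
import cmath
import math


def block_average(values, factor):
    """Scalar restriction: average each block of `factor` fine-grid values."""
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values), factor)]


def multichannel_restrict(values, xs, wavenumbers, factor):
    """One coarse channel per assumed wavenumber: demodulate, then average."""
    channels = []
    for k in wavenumbers:
        demodulated = [v * cmath.exp(-1j * k * x) for v, x in zip(values, xs)]
        channels.append(block_average(demodulated, factor))
    return channels


n, factor = 64, 8
h = 1.0 / n
k = math.pi / (4.0 * h)            # 8 grid points per wavelength
xs = [j * h for j in range(n)]
u = [cmath.exp(1j * k * x) for x in xs]   # right-going plane wave

scalar_coarse = block_average(u, factor)
packet_coarse = multichannel_restrict(u, xs, [k, -k], factor)

# Largest coarse value under scalar averaging (the wave is lost).
scalar_amplitude = max(abs(c) for c in scalar_coarse)
# Smallest coarse value in the +k channel (the wave is retained).
forward_amplitude = min(abs(c) for c in packet_coarse[0])
# Largest coarse value in the -k channel (no left-going content).
backward_amplitude = max(abs(c) for c in packet_coarse[1])

print(scalar_amplitude, forward_amplitude, backward_amplitude)
```

In this toy setting the channel index plays the role of propagation direction. In McMg the corresponding transfer operators are learned rather than fixed to known wavenumbers.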
If this is right
- McMg requires substantially fewer iterations than classical multigrid on high-frequency heterogeneous problems.
- It uses less wall-clock time than strong classical baselines and outperforms existing neural preconditioners, including on large-scale three-dimensional cases.
- For any fixed medium the V-cycle remains linear after a single setup phase that caches medium-dependent coefficients.
- Models trained on small domains transfer directly or via LLPF to larger domains and higher effective wavenumbers.
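The setup/online split in the third point can be sketched as follows. This is a minimal illustration that assumes a 1D finite-difference Helmholtz stencil and a complex-shifted damped-Jacobi smoother. The class name `CachedSmoother`, the shift `0.5j`, and the damping value are hypothetical choices, not taken from the paper. Everything that depends nonlinearly on the wave speed is computed once in the constructor, and `apply` is then a linear map of the residual, which the final lines check numerically.

```python
class CachedSmoother:
    """Toy damped-Jacobi smoother for a 1D Helmholtz stencil (illustrative)."""

    def __init__(self, wave_speed, omega, h, damping=0.6, sweeps=2):
        # Setup phase: all medium-dependent, nonlinear work happens here, once.
        self.inv_h2 = 1.0 / (h * h)
        self.k2 = [(omega / c) ** 2 for c in wave_speed]
        # Complex-shifted diagonal keeps the inverse well defined (assumption).
        self.scale = [damping / (2.0 * self.inv_h2 - k2 * (1.0 - 0.5j))
                      for k2 in self.k2]
        self.sweeps = sweeps

    def operator(self, u):
        """Apply the discrete operator -u'' - k(x)^2 u with zero boundaries."""
        n = len(u)
        out = []
        for j in range(n):
            left = u[j - 1] if j > 0 else 0.0
            right = u[j + 1] if j < n - 1 else 0.0
            out.append((2.0 * u[j] - left - right) * self.inv_h2
                       - self.k2[j] * u[j])
        return out

    def apply(self, residual):
        # Online phase: fixed coefficients only, so this is linear in residual.
        correction = [0j] * len(residual)
        for _ in range(self.sweeps):
            defect = [r - a for r, a in zip(residual, self.operator(correction))]
            correction = [e + s * d
                          for e, s, d in zip(correction, self.scale, defect)]
        return correction


n = 32
speed = [1.5 if j % 4 == 0 else 1.0 for j in range(n)]   # heterogeneous medium
smoother = CachedSmoother(speed, omega=20.0, h=1.0 / n)

r1 = [complex(j, 1.0) for j in range(n)]
r2 = [complex((-1) ** j, 0.0) for j in range(n)]
a, b = 2.0, -3.0j

combined = smoother.apply([a * p + b * q for p, q in zip(r1, r2)])
separate = [a * p + b * q
            for p, q in zip(smoother.apply(r1), smoother.apply(r2))]
linearity_error = max(abs(c - s) for c, s in zip(combined, separate))
print(linearity_error)
```

Linearity in the residual matters because many Krylov methods assume a fixed linear preconditioner; a preconditioner that changes with its input generally calls for a flexible variant.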
Where Pith is reading between the lines
- The same channel-based retention of local propagation information could be tested on time-harmonic Maxwell or acoustic wave equations.
- Combining the learned coarse-grid packets with adaptive mesh refinement around high-contrast interfaces might further reduce iteration counts.
- The linearity of the online phase suggests the method could serve as a drop-in component inside existing iterative frameworks without changing outer solvers.
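The drop-in idea in the last point amounts to an interface: the outer solver only needs a function that maps a residual to a correction. The sketch below assumes the simplest possible outer loop, preconditioned Richardson iteration, on a complex-shifted tridiagonal toy system that is not a Helmholtz discretization. The names `preconditioned_richardson` and `jacobi`, and the matrix itself, are hypothetical. Helmholtz solves would typically use a Krylov method such as GMRES in the same slot.

```python
import math


def norm(v):
    """Euclidean norm of a list of complex numbers."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))


def make_matvec(diag):
    """Matrix-free product with tridiag(-1, diag, -1)."""
    def matvec(x):
        n = len(x)
        out = []
        for j in range(n):
            left = x[j - 1] if j > 0 else 0.0
            right = x[j + 1] if j < n - 1 else 0.0
            out.append(diag * x[j] - left - right)
        return out
    return matvec


def preconditioned_richardson(matvec, precondition, rhs, tol=1e-8, max_iters=500):
    """Iterate x <- x + M(r) until the relative residual drops below tol."""
    x = [0j] * len(rhs)
    rhs_norm = norm(rhs)
    for iteration in range(max_iters):
        residual = [b - a for b, a in zip(rhs, matvec(x))]
        if norm(residual) <= tol * rhs_norm:
            return x, iteration
        # The preconditioner is any residual-to-correction map.
        x = [xj + cj for xj, cj in zip(x, precondition(residual))]
    return x, max_iters


diag = 2.5 + 0.5j
matvec = make_matvec(diag)


def jacobi(residual):
    """Stand-in preconditioner; a learned V-cycle would be plugged in here."""
    return [c / diag for c in residual]


rhs = [1.0 + 0j] * 16
solution, iterations = preconditioned_richardson(matvec, jacobi, rhs)
final_relative_residual = norm(
    [b - a for b, a in zip(rhs, matvec(solution))]) / norm(rhs)
print(iterations, final_relative_residual)
```

Swapping `jacobi` for another callable leaves the outer loop untouched, which is the property the bullet above relies on.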
Load-bearing premise
Learned multi-channel operators and smoothers trained on small domains can be extended to larger domains and higher wavenumbers by adding and finetuning only new coarse levels without retraining the full model.
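The bookkeeping behind this premise can be sketched as follows. This is a structural illustration only: the `Level` record, `extend_with_coarse_level`, and `finetune_step` are hypothetical names, and the plain gradient step stands in for whatever optimizer the authors use. It shows the property that matters, namely that previously trained levels are frozen and only the newly added coarse level changes during finetuning.

```python
from dataclasses import dataclass


@dataclass
class Level:
    """Parameters of one multigrid level (illustrative placeholder)."""
    name: str
    weights: list
    trainable: bool = True


def extend_with_coarse_level(levels, new_weights):
    """Freeze every existing level and append one trainable coarse level."""
    frozen = [Level(lv.name, list(lv.weights), trainable=False) for lv in levels]
    new_level = Level(f"level_{len(levels)}", list(new_weights), trainable=True)
    return frozen + [new_level]


def finetune_step(levels, grads, lr):
    """Gradient step that touches trainable levels only."""
    for level, grad in zip(levels, grads):
        if level.trainable:
            level.weights = [w - lr * g for w, g in zip(level.weights, grad)]


# Hierarchy trained on a small domain (values are arbitrary placeholders).
base = [Level("level_0", [1.0, 2.0]), Level("level_1", [3.0])]

# Larger domain: add one coarser level and finetune only that level.
extended = extend_with_coarse_level(base, [1.0])
finetune_step(extended, grads=[[1.0, 1.0], [1.0], [1.0]], lr=0.25)

trainable_names = [lv.name for lv in extended if lv.trainable]
print(trainable_names, [lv.weights for lv in extended])
```

Whether freezing the fine levels preserves iteration counts at higher wavenumbers is exactly what the premise asserts and what the experiments would need to confirm.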
What would settle it
The claim would be refuted if, on a three-dimensional high-contrast Helmholtz problem whose domain size and wavenumber exceed the training range by a factor of two, McMg failed to need fewer iterations than a strong classical multigrid baseline.
Original abstract
Solving heterogeneous Helmholtz equations at high wavenumbers remains challenging because the discretized operator is indefinite, pollution degrades phase accuracy, and scalar coarse-grid correction can discard the local phase and propagation-direction information carried by oscillatory errors. We propose Multi-channel Multigrid (McMg), a learned phase-space multigrid preconditioner for heterogeneous Helmholtz equations. Rather than predicting the solution directly, McMg maps residuals to corrections within an iterative framework. Its central idea is to coarsen physical space while retaining unresolved local wave information in the channel dimension: each coarse node carries a learned packet of amplitude, phase, direction, and scattering coefficients rather than a single scalar unknown. The architecture combines linear multi-channel transfer operators with locally adaptive stencils, neural PDE operators, and medium-dependent smoothers whose coefficients are generated from the wave speed. For a fixed medium, the V-cycle is linear in the residual; nonlinear physical features are computed once in a setup phase and cached, so each online iteration reduces to convolutions with fixed coefficients. We further study generalization across scales. Models trained on small domains transfer directly to larger domains and higher effective wavenumbers, and a Layer-by-Layer Progressive Finetuning (LLPF) strategy extends the support of the learned Green's operator by adding and finetuning only new coarse levels. Numerical experiments on high-frequency, high-contrast, and large-scale three-dimensional problems demonstrate that McMg requires substantially fewer iterations and less wall-clock time than strong classical baselines, while consistently outperforming existing neural preconditioners.
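For orientation, the setting described in the abstract can be written in a standard form. The notation below is an assumption for illustration: the abstract does not fix a sign convention, boundary treatment, or symbol names, so $u$, $f$, $c$, $\omega$, $A$, and $M$ are placeholders.

```latex
% Heterogeneous Helmholtz equation with wave speed c(x) and angular frequency \omega
-\Delta u(x) - \frac{\omega^{2}}{c(x)^{2}}\, u(x) = f(x)

% Discretized system: L approximates -\Delta, K holds the squared local wavenumbers
A u = f, \qquad A = L - K, \qquad K = \operatorname{diag}\!\left(\frac{\omega^{2}}{c_j^{2}}\right)

% Residual-to-correction iteration with a linear preconditioner M \approx A^{-1}
r_k = f - A u_k, \qquad u_{k+1} = u_k + M r_k
```

For constant wave speed and a symmetric $L$ with eigenvalues $\lambda_j > 0$, the eigenvalues of $A$ are $\lambda_j - \omega^2/c^2$, so every mode with $\lambda_j < \omega^2/c^2$ contributes a negative eigenvalue. This is the indefiniteness the abstract refers to, and the number of such modes grows with the wavenumber. In practice $M$ would usually be applied inside a Krylov method rather than the plain iteration shown here.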
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes McMg, a learned phase-space multi-channel multigrid preconditioner for heterogeneous Helmholtz equations. It coarsens physical space while retaining local wave information (amplitude, phase, direction, scattering) in channel dimensions via learned multi-channel transfer operators, locally adaptive stencils, neural PDE operators, and medium-dependent smoothers (cached after setup). The V-cycle is linear for fixed media; generalization is achieved by direct transfer from small-domain training plus Layer-by-Layer Progressive Finetuning (LLPF) to add/finetune coarse levels for larger domains and higher wavenumbers. Numerical experiments claim substantially fewer iterations and lower wall-clock time than classical baselines and existing neural preconditioners on high-frequency, high-contrast, large-scale 3D problems.
Significance. If the empirical claims hold with reproducible setups, McMg would represent a meaningful advance in scalable preconditioning for indefinite high-wavenumber Helmholtz problems, by embedding phase-space information into multigrid and demonstrating practical generalization via LLPF without full retraining. This could impact applications requiring repeated solves at scale, provided the learned components transfer reliably.
major comments (2)
- [Numerical experiments / abstract] Numerical experiments (abstract and corresponding section): the central performance claim of substantially fewer iterations and less wall-clock time versus strong classical and neural baselines is asserted without concrete setup details (domain sizes, wavenumber ranges, contrast levels, grid resolutions), error metrics, baseline implementations, or statistical controls. This prevents verification of the headline advantage and makes the generalization claim via LLPF untestable from the reported data.
- [Generalization / LLPF description] Generalization via LLPF (abstract and method description): the claim that models trained on small domains transfer directly and that LLPF extends the learned Green's operator to larger domains/higher wavenumbers while preserving iteration counts relies on unquantified assertions. No data is supplied on channel-wise packet degradation under wavenumber increase, epochs needed versus full retraining, or effectiveness of cached medium-dependent smoothers when coarse-grid spacing changes; if this step fails, the reported 3D high-frequency gains do not follow.
minor comments (1)
- [Introduction / method overview] The abstract and method sections introduce several new entities (learned phase-space packets, multi-channel transfers) without a compact notation table or diagram clarifying the channel dimension versus physical coarsening.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on reproducibility and the LLPF generalization strategy. We address both major points below by agreeing to enhance the manuscript with additional concrete details and quantitative data.
Point-by-point responses
- Referee: [Numerical experiments / abstract] Numerical experiments (abstract and corresponding section): the central performance claim of substantially fewer iterations and less wall-clock time versus strong classical and neural baselines is asserted without concrete setup details (domain sizes, wavenumber ranges, contrast levels, grid resolutions), error metrics, baseline implementations, or statistical controls. This prevents verification of the headline advantage and makes the generalization claim via LLPF untestable from the reported data.
  Authors: We agree that the abstract is high-level and that Section 4 would benefit from a consolidated summary of parameters for easier verification. In the revision we will insert a table listing all domain sizes, wavenumber ranges, contrast levels, grid resolutions, error metrics (relative residual norms), baseline solver implementations with tolerances, and any multi-run statistics. This directly addresses the verifiability concern without altering the existing experimental results.
  Revision: yes
- Referee: [Generalization / LLPF description] Generalization via LLPF (abstract and method description): the claim that models trained on small domains transfer directly and that LLPF extends the learned Green's operator to larger domains/higher wavenumbers while preserving iteration counts relies on unquantified assertions. No data is supplied on channel-wise packet degradation under wavenumber increase, epochs needed versus full retraining, or effectiveness of cached medium-dependent smoothers when coarse-grid spacing changes; if this step fails, the reported 3D high-frequency gains do not follow.
  Authors: The current manuscript demonstrates successful transfer and LLPF through iteration-count preservation on the reported 3D cases, but we acknowledge the absence of explicit quantification of packet degradation, finetuning epoch counts, and smoother cache behavior under grid changes. We will add a new subsection with supporting tables or plots that report channel retention metrics, epoch comparisons, and smoother validation across coarse spacings, so that the generalization claims are fully substantiated.
  Revision: yes
Circularity Check
No significant circularity; performance claims rest on independent numerical experiments
The paper defines McMg as a learned preconditioner whose multi-channel operators and medium-dependent smoothers are trained on small domains, then applies the resulting fixed V-cycle to larger problems. The headline claim of substantially fewer iterations and lower wall-clock time versus classical and neural baselines is presented as the outcome of explicit numerical experiments on high-frequency, high-contrast 3D instances; these results are not shown to reduce by construction to the training loss or to any self-citation. The LLPF generalization step is an empirical assertion supported by the same experiments rather than a definitional identity. No self-definitional, fitted-input-renamed-as-prediction, or load-bearing self-citation steps appear in the supplied text.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Standard assumptions of multigrid convergence theory for indefinite operators hold when the coarse-grid correction is replaced by a learned multi-channel operator.
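The object this assumption concerns is the classical two-grid error propagation operator. The formula below is the textbook form; reading $P$, $R$, and $A_c$ as the learned multi-channel prolongation, restriction, and coarse operator is an assumption made here for illustration, and the symbols are not taken from the paper.

```latex
% Two-grid error propagation with \nu_1 pre- and \nu_2 post-smoothing steps
E_{\mathrm{TG}} = S^{\nu_2}\,\bigl(I - P\,A_c^{-1}\,R\,A\bigr)\,S^{\nu_1}

% Scalar multigrid:        R : \mathbb{C}^{n} \to \mathbb{C}^{n_c}
% Multi-channel variant:   R : \mathbb{C}^{n} \to \mathbb{C}^{n_c \times C}, \quad
%                          P : \mathbb{C}^{n_c \times C} \to \mathbb{C}^{n}
```

Classical convergence arguments bound $\|E_{\mathrm{TG}}\|$ through smoothing and approximation properties of $S$, $P$, and $A_c$. The ledger entry flags that these properties are assumed, not proven, once the operators are learned and carry $C$ channels per coarse node.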
invented entities (1)
- Learned phase-space packet (amplitude, phase, direction, scattering coefficients per coarse node): no independent evidence