PGOT: A Physics-Geometry Operator Transformer for Complex PDEs

Boocheong Khoo; Canqun Yang; Xiaobin Hu; Xi Yang; Yifu Gao; Ying Miao; Yong Yang; Yuan Zhao; Zhuo Zhang

arxiv: 2512.23192 · v3 · submitted 2025-12-29 · 💻 cs.LG

Zhuo Zhang, Xi Yang, Ying Miao, Xiaobin Hu, Yifu Gao, Yuan Zhao, Yong Yang, Canqun Yang, Boocheong Khoo

Pith reviewed 2026-05-16 19:57 UTC · model grok-4.3

classification 💻 cs.LG

keywords PDE modeling · transformer · geometric attention · unstructured meshes · complex geometries · physics-informed · spectrum preservation · adaptive computation

The pith

PGOT uses spectrum-preserving geometric attention to model PDEs on complex unstructured meshes without losing boundary information to aliasing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to fix the loss of critical physical boundary details that occurs when transformers reduce feature dimensions on large unstructured meshes for PDEs. It introduces PGOT, which rebuilds feature learning by explicitly injecting geometry through a new attention module that keeps multi-scale features intact at linear cost. A reader would care because many engineering simulations, from fluid flow around airfoils to structural analysis, depend on accurate handling of irregular boundaries and discontinuities. The architecture also switches between simple linear computation in smooth zones and higher-order nonlinear paths near shocks using local coordinates. This combination aims to deliver both scalability and precision where prior efficient transformers fall short.

Core claim

PGOT reconstructs physical feature learning through explicit geometry awareness via Spectrum-Preserving Geometric Attention. The module applies a physics slicing-geometry injection mechanism to incorporate multi-scale geometric encodings while preserving features and enforcing linear O(N) complexity. Computations are dynamically routed to low-order linear paths in smooth regions and high-order nonlinear paths at shocks and discontinuities according to spatial coordinates.

What carries the argument

Spectrum-Preserving Geometric Attention (SpecGeo-Attention) with a physics slicing-geometry injection mechanism that folds multi-scale geometric encodings into attention to avoid aliasing while retaining O(N) scaling.
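The slice, attend, and de-slice pattern that the Figure 3 caption describes can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the sin/cos `fourier_encoding`, the single-head attention, the parameter names, and the random initialization in `init_params` are all hypothetical stand-ins for the paper's Spectrum Encoder and attention layers.

```python
import numpy as np

def fourier_encoding(coords, num_scales=4):
    """Multi-scale sin/cos features of mesh coordinates (an assumed encoding)."""
    freqs = np.pi * 2.0 ** np.arange(num_scales)        # (S,) one frequency per scale
    angles = coords[:, :, None] * freqs                 # (N, d, S)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(coords.shape[0], -1)           # (N, 2*d*S)

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)             # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(rng, channels, dim, num_slices, num_scales=4):
    """Random weights for the sketch; every name here is hypothetical."""
    geo_dim = 2 * dim * num_scales
    def mat(rows, cols):
        return rng.normal(size=(rows, cols)) / np.sqrt(rows)
    return {
        "num_scales": num_scales,
        "w_slice": mat(channels + geo_dim, num_slices),
        "wq": mat(channels, channels),
        "wk": mat(channels, channels),
        "wv": mat(channels, channels),
    }

def slice_attention(x, coords, p):
    """x: (N, C) point features, coords: (N, d) positions. Returns (N, C)."""
    geo = fourier_encoding(coords, p["num_scales"])     # (N, G)
    # Geometry-aware weights: each point is softly assigned to M slices.
    w = softmax(np.concatenate([x, geo], axis=1) @ p["w_slice"], axis=1)  # (N, M)
    # Slice: weighted average of point features per slice, cost O(N*M*C).
    tokens = (w.T @ x) / (w.sum(axis=0)[:, None] + 1e-8)                  # (M, C)
    # Attention among the M latent tokens only, cost O(M^2*C), independent of N.
    q, k, v = tokens @ p["wq"], tokens @ p["wk"], tokens @ p["wv"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)                 # (M, M)
    tokens = attn @ v
    # DeSlice: broadcast the updated tokens back to the points, cost O(N*M*C).
    return w @ tokens                                                     # (N, C)

rng = np.random.default_rng(0)
params = init_params(rng, channels=16, dim=2, num_slices=32)
features = rng.normal(size=(1000, 16))
positions = rng.uniform(size=(1000, 2))
updated = slice_attention(features, positions, params)  # shape (1000, 16)
```

Because no N-by-N matrix is ever formed, doubling the number of mesh points doubles the work while the M-by-M token attention stays fixed. Whether the geometry term actually prevents aliasing is the empirical question the review raises; the sketch only shows where the encoding enters.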

If this is right

State-of-the-art accuracy is reached on four standard PDE benchmarks.
Strong results are obtained on large-scale industrial problems such as airfoil and car design.
Spatially adaptive routing improves precision by matching computation order to local field behavior.
Linear complexity supports scaling to meshes too large for quadratic attention.
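The spatially adaptive routing can be illustrated with a two-path feed-forward layer whose mixing gate depends only on coordinates. This is a sketch under assumptions: the source names a Linear Expert, a Non-linear Expert, and a spatial gate, but the combination rule used here (linear output plus a gated nonlinear correction), the sigmoid gate on raw coordinates, and all parameter names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gelu(z):
    # tanh approximation of GELU
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z**3)))

def init_ffn_params(rng, channels, dim, hidden):
    """Random weights for the sketch; every name here is hypothetical."""
    def mat(rows, cols):
        return rng.normal(size=(rows, cols)) / np.sqrt(rows)
    return {
        "w_lin": mat(channels, channels),
        "w1": mat(channels, hidden), "b1": np.zeros(hidden),
        "w2": mat(hidden, channels),
        "w_gate": mat(dim, channels), "b_gate": np.zeros(channels),
    }

def gated_two_path_ffn(x, coords, p):
    """x: (N, C) features, coords: (N, d) positions. Returns (N, C)."""
    linear_out = x @ p["w_lin"]                          # low-order linear path
    hidden = gelu(x @ p["w1"] + p["b1"])                 # high-order nonlinear path
    nonlinear_out = hidden @ p["w2"]
    # Per-point, per-channel gate computed from spatial coordinates only.
    alpha = sigmoid(coords @ p["w_gate"] + p["b_gate"])  # (N, C), values in (0, 1)
    # Assumed combination: linear term plus a gated higher-order correction.
    return linear_out + alpha * nonlinear_out
```

Where the gate is close to zero the layer reduces to the linear path, which is the behaviour the claim associates with smooth regions. Note that this soft gate still evaluates both paths at every point, so it adapts precision rather than saving compute; a hard routing variant would be needed for the latter.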

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The slicing-injection pattern could be transferred to other mesh-based tasks such as finite-element analysis in structural mechanics.
Dynamic linear-to-nonlinear routing may reduce overall compute in any simulation that mixes smooth flow with localized shocks.
Extending the same geometry injection to time-dependent or three-dimensional industrial cases would test whether the linear scaling holds at higher resolution.

Load-bearing premise

Injecting geometry via physics slicing into attention preserves multi-scale features and boundary information without creating geometric aliasing.

What would settle it

A side-by-side feature visualization or error map on a fine-boundary unstructured mesh benchmark where PGOT exhibits the same aliasing or boundary loss seen in prior reduced-dimension transformers would disprove the preservation claim.
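One simple way to turn such an error map into a number is to report relative L2 error separately for points near the boundary and points in the interior. The sketch below assumes a per-point distance to the nearest boundary is available; the function names and the `band` threshold are hypothetical and do not come from the paper.

```python
import numpy as np

def relative_l2(pred, true, mask=None):
    """Relative L2 error, optionally restricted to the points selected by mask."""
    if mask is not None:
        pred, true = pred[mask], true[mask]
    return float(np.linalg.norm(pred - true) / (np.linalg.norm(true) + 1e-12))

def boundary_interior_errors(pred, true, dist_to_boundary, band):
    """Split the error into a near-boundary band and the remaining interior."""
    near = dist_to_boundary <= band
    return {
        "boundary": relative_l2(pred, true, near),
        "interior": relative_l2(pred, true, ~near),
    }
```

A model that loses boundary information would be expected to show a boundary error well above its interior error; comparing that gap between PGOT and a reduced-dimension baseline on the same mesh is one possible form of the test described above.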

Figures

Figures reproduced from arXiv: 2512.23192 by Boocheong Khoo, Canqun Yang, Xiaobin Hu, Xi Yang, Yifu Gao, Ying Miao, Yong Yang, Yuan Zhao, Zhuo Zhang.

**Figure 1.** Efficiency and accuracy comparison on standard benchmarks. (a) Inference speed vs. memory usage. Bubble size indicates model size. (b) Multi-dimensional performance metrics on PDE and industrial datasets.

**Figure 2.** Overall architecture of PGOT. (a) The framework explicitly integrates multi-scale geometry via stacked PhysGeoBlocks to reconstruct velocity and pressure fields on complex 3D meshes. (b) Visualization of TaylorDecomp-FFN. The Linear Expert (blue) captures smooth conservation dynamics, while the Non-linear Expert (red) targets high-order fluctuations.

**Figure 3.** Architecture of PhysGeoBlock and SpecGeo-Attention. (a) The PhysGeoBlock integrates explicit geometric coordinates into the SpecGeo-Attention and TaylorDecomp-FFN layers. (b) The "physics slicing-geometry injection" paradigm. A Spectrum Encoder generates geometry-aware weights to aggregate N mesh points into M latent tokens (Slice) and reconstruct them (DeSlice). This design achieves linear complexity O(N)…

**Figure 4.** … (right), baselines exhibit scattered high-error points, whereas PGOT maintains consistently low errors, highlighting the importance of multi-scale geometric encoding in preserving structural details.

**Figure 5.** Visualization on industrial benchmarks. (a) AirfRANS: ground truth pressure field and prediction errors. (b) Shape-Net Car: ground truth streamlines and PGOT prediction, along with surrounding velocity and surface pressure errors.

**Figure 6.** Learned slice assignments from SpecGeo-Attention on Airfoil (32 slices).

**Figure 7.** Learned slice assignments from SpecGeo-Attention on Pipe (32 slices).

**Figure 8.** Learned slice assignments from SpecGeo-Attention on Plasticity (32 slices).

**Figure 9.** Learned slice assignments from SpecGeo-Attention on Elasticity (32 slices).

**Figure 10.** Learned slice assignments from SpecGeo-Attention on AirfRANS (32 slices).

**Figure 11.** Learned slice assignments from SpecGeo-Attention on Shape-Net Car (32 slices).

**Figure 12.** Gate activations from TaylorDecomp-FFN on Airfoil (64 channels).

**Figure 13.** Gate activations from TaylorDecomp-FFN on Pipe (64 channels).

**Figure 14.** Gate activations from TaylorDecomp-FFN on Plasticity (64 channels).

**Figure 15.** Gate activations from TaylorDecomp-FFN on Elasticity (64 channels).

**Figure 16.** Gate activations from TaylorDecomp-FFN on AirfRANS (64 channels).

**Figure 17.** Gate activations from TaylorDecomp-FFN on Shape-Net Car (64 channels).

Original abstract

While Transformers have demonstrated remarkable potential in modeling Partial Differential Equations (PDEs), modeling large-scale unstructured meshes with complex geometries remains a significant challenge. Existing efficient architectures often employ feature dimensionality reduction strategies, which inadvertently induce Geometric Aliasing, resulting in the loss of critical physical boundary information. To address this, we propose the Physics-Geometry Operator Transformer (PGOT), designed to reconstruct physical feature learning through explicit geometry awareness. Specifically, we propose Spectrum-Preserving Geometric Attention (SpecGeo-Attention). Utilizing a "physics slicing-geometry injection" mechanism, this module incorporates multi-scale geometric encodings to explicitly preserve multi-scale geometric features while maintaining linear computational complexity $O(N)$. Furthermore, PGOT dynamically routes computations to low-order linear paths for smooth regions and high-order non-linear paths for shock waves and discontinuities based on spatial coordinates, enabling spatially adaptive and high-precision physical field modeling. PGOT achieves consistent state-of-the-art performance across four standard benchmarks and excels in large-scale industrial tasks including airfoil and car designs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PGOT adds a geometry-injection attention and adaptive low/high-order routing for unstructured-mesh PDEs, but the spectrum-preservation claim is asserted without measurement or analysis.

The main takeaway is that PGOT introduces SpecGeo-Attention, which injects multi-scale geometric encodings via a physics-slicing mechanism, plus a dynamic router that sends smooth regions to low-order linear paths and discontinuities to high-order nonlinear ones. Both pieces target the aliasing problem that comes from dimension reduction in existing mesh transformers, and the linear O(N) complexity is a practical plus for large industrial cases like airfoils and cars.

Referee Report

3 major / 2 minor

Summary. The paper proposes the Physics-Geometry Operator Transformer (PGOT) to model PDEs on large-scale unstructured meshes with complex geometries. It introduces Spectrum-Preserving Geometric Attention (SpecGeo-Attention) that uses a physics slicing-geometry injection mechanism to incorporate multi-scale geometric encodings while preserving features and maintaining O(N) complexity. A dynamic routing scheme directs computations to low-order linear paths in smooth regions and high-order non-linear paths near discontinuities. The manuscript claims consistent state-of-the-art results on four standard benchmarks plus large-scale industrial tasks such as airfoil and car design.

Significance. If the no-aliasing and performance claims are substantiated, PGOT would offer a scalable architecture for geometry-aware PDE modeling that avoids the feature-reduction pitfalls of prior efficient transformers. The explicit multi-scale injection and spatially adaptive routing address a recognized limitation in applying transformers to industrial-scale unstructured meshes. The work would be of interest to the scientific machine learning community provided the central technical guarantees are demonstrated.

major comments (3)

[Abstract and §3] The central claim that SpecGeo-Attention 'successfully preserves multi-scale geometric features' and 'avoids geometric aliasing' is load-bearing for the contribution, yet the spectrum-preserving property is neither formally defined nor verified. No Fourier or eigen-analysis of the geometric encodings is supplied, nor is any quantitative metric (e.g., high-frequency boundary error or aliasing index) reported to confirm preservation of critical physical boundary information.
[§4 and §5] The headline SOTA performance on four benchmarks and industrial tasks is asserted without the experimental details required to evaluate it. The text supplies no baselines, error bars, statistical significance tests, or ablation studies isolating the contribution of the physics-slicing injection versus the dynamic routing, rendering the performance claim unverifiable from the given material.
[§3.2] The assertion of strict O(N) complexity for the multi-scale injection and dynamic high-order routing is not accompanied by a complexity analysis or empirical timing breakdown. The combination of multi-scale encodings and spatially adaptive routing could still incur super-linear costs or frequency folding in practice; this must be shown explicitly to support the scalability claim.
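For orientation, the kind of cost accounting the third comment asks for can be written down for the generic slice and de-slice pattern described in the Figure 3 caption. This is a sketch under assumptions, not the paper's own analysis: the symbols $W$, $X$, $Z$, $C$, and $G$ are introduced here for illustration, with $N$ mesh points, $M$ latent tokens, $C$ feature channels, and $G$ geometric-encoding channels.

```latex
\begin{aligned}
\text{Slice weights:}\quad & W = \operatorname{softmax}\big([X,\ \mathrm{enc}(g)]\,W_s\big) \in \mathbb{R}^{N \times M}
  && \Rightarrow\ O\big(N (C + G) M\big) \\
\text{Slice:}\quad & Z = W^{\top} X \in \mathbb{R}^{M \times C}
  && \Rightarrow\ O(N M C) \\
\text{Token attention:}\quad & Z' = \operatorname{softmax}\!\big(Q K^{\top} / \sqrt{C}\big)\, V,\quad Q, K, V \in \mathbb{R}^{M \times C}
  && \Rightarrow\ O(M^{2} C) \\
\text{DeSlice:}\quad & Y = W Z' \in \mathbb{R}^{N \times C}
  && \Rightarrow\ O(N M C) \\
\text{Total:}\quad & O\big(N M (C + G) + M^{2} C\big) = O(N) \quad \text{for fixed } M,\ C,\ G .
\end{aligned}
```

The total is linear in $N$ only while $M$, $C$, and $G$ are held fixed (the slice-assignment figures use 32 slices). A per-point feed-forward layer adds a further $O(N C^{2})$ term, which is also linear in $N$. An empirical timing sweep over $N$ would still be needed to confirm that constants and memory traffic do not dominate in practice.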

minor comments (2)

[§3] Notation for the dynamic routing thresholds and the precise definition of 'physics slicing' should be introduced with explicit equations rather than descriptive prose.
[§5] Figure captions for the industrial-task visualizations should include quantitative error metrics alongside qualitative plots to allow direct comparison with baselines.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and will revise the manuscript to incorporate the requested clarifications, analyses, and experimental details.

Referee: [Abstract and §3] The central claim that SpecGeo-Attention 'successfully preserves multi-scale geometric features' and 'avoids geometric aliasing' is load-bearing for the contribution, yet the spectrum-preserving property is neither formally defined nor verified. No Fourier or eigen-analysis of the geometric encodings is supplied, nor is any quantitative metric (e.g., high-frequency boundary error or aliasing index) reported to confirm preservation of critical physical boundary information.

Authors: We agree that a formal definition and explicit verification of the spectrum-preserving property are needed to substantiate the central claim. The SpecGeo-Attention is constructed via physics slicing-geometry injection to avoid dimensionality reduction and thereby preserve multi-scale features by design, but the manuscript lacks the requested formalization and supporting analysis. In revision we will add a precise definition of spectrum preservation in §3, include Fourier and eigen-analysis of the geometric encodings, and report quantitative metrics such as high-frequency boundary error to verify the no-aliasing behavior.

revision: yes

Referee: [§4 and §5] The headline SOTA performance on four benchmarks and industrial tasks is asserted without the experimental details required to evaluate it. The text supplies no baselines, error bars, statistical significance tests, or ablation studies isolating the contribution of the physics-slicing injection versus the dynamic routing, rendering the performance claim unverifiable from the given material.

Authors: The experimental sections present performance comparisons, yet we acknowledge that the manuscript does not give sufficient baseline descriptions, error bars from multiple runs, statistical significance tests, or targeted ablations isolating the physics-slicing injection from the dynamic routing. We will expand §§4 and 5 to include full baseline specifications, error bars, significance testing, and ablations that separately quantify the contribution of each component.

revision: yes

Referee: [§3.2] The assertion of strict O(N) complexity for the multi-scale injection and dynamic high-order routing is not accompanied by a complexity analysis or empirical timing breakdown. The combination of multi-scale encodings and spatially adaptive routing could still incur super-linear costs or frequency folding in practice; this must be shown explicitly to support the scalability claim.

Authors: We will add a rigorous complexity analysis in §3.2 that formally establishes O(N) scaling for both the multi-scale injection and the spatially adaptive routing. We will also include empirical timing breakdowns across the benchmark datasets to confirm practical linear scaling and to rule out super-linear costs or frequency folding.

revision: yes

Circularity Check

0 steps flagged

No circularity: novel architecture with empirical claims

The paper introduces new components (SpecGeo-Attention, physics slicing-geometry injection, dynamic routing) defined explicitly as design choices rather than derived from or equivalent to prior equations, fitted parameters, or self-citations. Performance claims rest on benchmark results, not on predictions that reduce to inputs by construction. No self-definitional loops, renamed known results, or load-bearing self-citations appear in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on the unverified effectiveness of the new SpecGeo-Attention and dynamic routing mechanisms described at high level; no free parameters, axioms, or invented entities are quantified in the abstract.

axioms (1)

domain assumption: Transformers can be extended with explicit geometry injection to avoid aliasing on unstructured meshes.
Core premise underlying the proposal of PGOT and SpecGeo-Attention.

invented entities (2)

SpecGeo-Attention (no independent evidence)
purpose: Reconstruct physical feature learning via multi-scale geometric encodings with linear complexity
New attention module introduced to address geometric aliasing.

Dynamic low-order linear / high-order non-linear routing (no independent evidence)
purpose: Spatially adaptive computation for smooth regions versus shocks and discontinuities
New routing strategy based on spatial coordinates.

pith-pipeline@v0.9.0 · 5495 in / 1160 out tokens · 32246 ms · 2026-05-16T19:57:50.681403+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: Spectrum-Preserving Geometric Attention (SpecGeo-Attention) ... physics slicing-geometry injection mechanism ... multi-scale geometric encodings ... O(N) linear complexity

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: TaylorDecomp-FFN ... low-order linear paths for smooth regions and high-order non-linear paths for shock waves ... spatial gate α(g)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 1 internal anchor

[1] Alkin, B., Fürst, A., Schmid, S., Gruber, L., Holzleitner, M., and Brandstetter, J. Universal physics transformers: A framework for efficiently scaling neural operators. In NeurIPS 2024, Vancouver, BC, Canada, December 10-15, 2024.

[2] Cao, S. Choose a transformer: Fourier or Galerkin. In NeurIPS 2021, December 6-14, 2021, virtual, pp. 24924–24940.

[3] Cho, W., Jo, M., Lim, H., Lee, K., Lee, D., Hong, S., and Park, N. Parameterized physics-informed neural networks for parameterized PDEs. In ICML 2024, Vienna, Austria, July 21-27.

[4] Deng, J., Li, X., Xiong, H., Hu, X., and Ma, J. Geometry-guided conditional adaptation for surrogate models of large-scale 3D PDEs on arbitrary geometries. In IJCAI 2024, Jeju, South Korea, August 3-9, 2024, pp. 5790–5798. ijcai.org.

[5] Gao, H. and Ji, S. Graph U-Nets. In International Conference on Machine Learning, pp. 2083–2092. PMLR.

[6] Guibas, J., Mardani, M., Li, Z., Tao, A., Anandkumar, A., and Catanzaro, B. Efficient token mixing for transformers via adaptive Fourier neural operators. In ICLR 2022, Virtual Event, April 25-29.

[7] Gupta, G., Xiao, X., and Bogdan, P. Multiwavelet-based operator learning for differential equations. In NeurIPS 2021, December 6-14, 2021, virtual, pp. 24048–24062.

[8] Hao, Z., Wang, Z., Su, H., Ying, C., Dong, Y., Liu, S., Cheng, Z., Song, J., and Zhu, J. GNOT: A general neural operator transformer for operator learning. In ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, volume 202 of Proceedings of Machine Learning Research, pp. 12556–12569. PMLR.

[9] Koshizuka, T., Fujisawa, M., Tanaka, Y., and Sato, I. Understanding the expressivity and trainability of Fourier neural operator: A mean-field perspective. In NeurIPS 2024, Vancouver, BC, Canada, December 10-15, 2024.

[10] Lee-Thorp, J., Ainslie, J., Eckstein, I., and Ontañón, S. FNet: Mixing tokens with Fourier transforms. In NAACL 2022, Seattle, WA, United States, July 10-15, 2022, pp. 4296–4313. Association for Computational Linguistics.

[11] Li, S., Yoo, S., and Yang, Y. Maximal update parametrization and zero-shot hyperparameter transfer for Fourier neural operators. In ICML 2025, Vancouver, BC, Canada, July 13-19.

[12] Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., and Anandkumar, A. Neural operator: Graph kernel network for partial differential equations. arXiv preprint arXiv:2003.03485.

[13] Li, Z., Kovachki, N. B., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A. M., and Anandkumar, A. Fourier neural operator for parametric partial differential equations. In ICLR 2021, Virtual Event, Austria, May 3-7.

[14] Li, Z., Huang, D. Z., Liu, B., and Anandkumar, A. Fourier neural operator with learned deformations for PDEs on general geometries. Journal of Machine Learning Research, 24(388):1–26, 2023a. Li, Z., Kovachki, N. B., Choy, C. B., Li, B., Kossaifi, J., Otta, S. P., Nabian, M. A., Stadler, M., Hundt, C., Azizzadenesheli, K., and Anandkumar, A. Geometry-inform...

[15] Luo, H., Wu, H., Zhou, H., Xing, L., Di, Y., Wang, J., and Long, M. Transolver++: An accurate neural solver for PDEs on million-scale geometries. In ICML 2025, Vancouver, BC, Canada, July 13-19.

[16] Morris, E., Shen, H., Du, W., Sajjad, M. H., and Shi, B. Geometric instability of graph neural networks on large graphs. arXiv preprint arXiv:2308.10099.

[17] Pfaff, T., Fortunato, M., Sanchez-Gonzalez, A., and Battaglia, P. W. Learning mesh-based simulation with graph networks. In ICLR 2021, Virtual Event, Austria, May 3-7.

[18] Rahman, M. A., Ross, Z. E., and Azizzadenesheli, K. U-NO: U-shaped neural operators. Transactions on Machine Learning Research, 2023.

[19] Rao, Y., Zhao, W., Zhu, Z., Lu, J., and Zhou, J. Global filter networks for image classification. In NeurIPS 2021, December 6-14, 2021, virtual, pp. 980–993.

[20] Tran, A., Mathews, A. P., Xie, L., and Ong, C. S. Factorized Fourier neural operators. In ICLR 2023, Kigali, Rwanda, May 1-5.

[21] Wang, R., Xia, Z., Yan, G., and Yan, J. Quanonet: Quantum neural operator with application to differential equation. In ICML 2025, Vancouver, BC, Canada, July 13-19. OpenReview.net, 2025a.

[22] Wang, T. and Wang, C. Latent neural operator for solving forward and inverse PDE problems. In NeurIPS 2024, Vancouver, BC, Canada, December 10-15, 2024.

[23] Wu, H., Hu, T., Luo, H., Wang, J., and Long, M. Solving high-dimensional PDEs with latent spectral models. In ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, volume 202 of Proceedings of Machine Learning Research, pp. 37417–37438. PMLR.

[24] Wu, H., Luo, H., Wang, H., Wang, J., and Long, M. Transolver: A fast transformer solver for PDEs on general geometries. In ICML 2024, Vienna, Austria, July 21-27.

[25] Xiao, Z., Hao, Z., Lin, B., Deng, Z., and Su, H. Improved operator learning by orthogonal attention. In ICML 2024, Vienna, Austria, July 21-27.
