
arxiv: 2606.23251 · v1 · pith:V5MQGSZ4new · submitted 2026-06-22 · 💻 cs.CE · cs.LG

Attention mechanism for scalable mesh-based neural surrogates of free-surface fluids

Pith reviewed 2026-06-26 06:08 UTC · model grok-4.3

classification 💻 cs.CE cs.LG
keywords neural surrogate · self-attention · PFEM · free-surface flows · mesh discretization · non-Newtonian fluids · scalable modeling

The pith

Self-attention neural surrogates predict free-surface fluid flows on evolving meshes with improved scalability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces a self-attention architecture to create neural surrogates that approximate the dynamics of free-surface fluid simulations performed with the Particle Finite Element Method. The goal is to achieve faster computations than full PFEM runs while handling evolving geometries and non-Newtonian materials. By operating directly on the mesh nodes, the model maintains the discretization needed for accurate long-term predictions and for calculating derived quantities like stresses using existing finite element tools. Readers interested in engineering simulations would care because it offers a way to run many scenarios or longer times without the full computational burden of traditional methods.

Core claim

The self-attention-based neural surrogate accurately predicts transient dynamics and final configurations of free-surface flows with significantly improved scalability while preserving the PFEM mesh discretization and enabling reconstruction of derived mechanical quantities.

What carries the argument

Self-attention layers that model interactions between nodes on the PFEM mesh to capture spatial dependencies in evolving free-surface flows.
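
To make the mechanism concrete, here is a minimal NumPy sketch of single-head softmax self-attention applied to a set of mesh nodes. It is not the paper's architecture: the feature sizes, the random projection matrices `w_q`, `w_k`, `w_v`, and the function name `self_attention` are all assumptions chosen for illustration.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head softmax self-attention over mesh nodes.

    x: (n_nodes, d_in) node features; w_q, w_k, w_v: (d_in, d_head) projections.
    Returns the updated features and the (n_nodes, n_nodes) attention weights.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])         # pairwise node-to-node scores
    scores -= scores.max(axis=1, keepdims=True)    # stabilise the softmax numerically
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    return weights @ v, weights

# Hypothetical sizes and random inputs, for illustration only.
rng = np.random.default_rng(0)
n_nodes, d_in, d_head = 6, 4, 8
x = rng.normal(size=(n_nodes, d_in))               # e.g. positions and velocities per node
w_q, w_k, w_v = (rng.normal(size=(d_in, d_head)) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
```

Without a positional encoding, this operation is permutation-equivariant: reordering the rows of `x` reorders the rows of `out` in the same way. That property is one reason attention is a natural fit for node sets whose ordering carries no meaning. The score matrix is `n_nodes` by `n_nodes`, which is the quadratic cost that a linear variant is meant to avoid.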

If this is right

  • Accurate predictions hold for two- and three-dimensional benchmarks with varying material parameters and non-Newtonian fluids.
  • The linear attention variant reduces computational cost and improves scalability over standard self-attention.
  • Mesh preservation allows direct reconstruction of stress fields and other mechanical quantities via standard finite element operators.
  • Rollouts on changing geometries gain long-term stability without remeshing rules beyond those PFEM already supplies.
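
The scalability point about the linear variant can be illustrated with a short sketch. Once the softmax is dropped, the product can be reassociated so that the `n_nodes` by `n_nodes` score matrix is never formed. The `1 / n_nodes` normalisation and the function names below are assumptions; this page does not state the exact normalisation or feature map the authors use.

```python
import numpy as np

def linear_attention(q, k, v):
    """Softmax-free attention evaluated as q @ (k.T @ v) / n_nodes."""
    n_nodes = q.shape[0]
    context = k.T @ v                  # (d_head, d_v) global summary, cost O(n * d^2)
    return q @ context / n_nodes

def quadratic_reference(q, k, v):
    """Same quantity via the explicit (n_nodes, n_nodes) matrix, cost O(n^2 * d)."""
    return (q @ k.T) @ v / q.shape[0]

# Random stand-in queries, keys and values (not data from the paper).
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(500, 16)) for _ in range(3))
fast = linear_attention(q, k, v)
slow = quadratic_reference(q, k, v)    # agrees with `fast` up to round-off
```

Both functions return the same array because matrix multiplication is associative. Only the order of evaluation changes, and with it the dependence on the node count from quadratic to linear.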

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Such surrogates might be combined with optimization loops for design of fluid-handling structures.
  • The approach could apply to other mesh-based Lagrangian simulations beyond free-surface flows.
  • Future work might test generalization to unseen rheologies or larger domains not in the training set.

Load-bearing premise

Attention layers reliably capture spatial dependencies for stable long-term rollouts on arbitrary evolving meshes without needing extra constraints.

What would settle it

Running the surrogate on a long-duration 3D simulation of a non-Newtonian free-surface flow and comparing the predicted mesh evolution and quantities against a full PFEM reference solution to check for accumulating errors or instability.
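
One way to carry out such a check is to track a relative L2 error at every rollout step. The sketch below assumes the surrogate and the PFEM reference share the same nodes in the same order at each step, which may not hold once nodes are added or removed. The trajectories are synthetic stand-ins, not results from the paper, and `rollout_errors` is a hypothetical helper name.

```python
import numpy as np

def rollout_errors(pred, ref):
    """Relative L2 error per time step.

    pred, ref: (n_steps, n_nodes, dim) nodal positions with matching node order.
    """
    n_steps = len(ref)
    diff = np.linalg.norm((pred - ref).reshape(n_steps, -1), axis=1)
    scale = np.linalg.norm(ref.reshape(n_steps, -1), axis=1)
    return diff / scale

# Synthetic data: a reference trajectory and a prediction whose offset
# grows linearly with the step index, mimicking error accumulation.
rng = np.random.default_rng(0)
n_steps, n_nodes, dim = 50, 200, 3
ref = 1.0 + rng.random(size=(n_steps, n_nodes, dim))
drift = 1e-3 * np.arange(n_steps)[:, None, None]
pred = ref + drift
errors = rollout_errors(pred, ref)
```

A curve of `errors` that stays bounded over a long horizon would support the stability claim, while steady growth would point to accumulating error.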

Figures

Figures reproduced from arXiv: 2606.23251 by Federico Lanteri, Massimiliano Cremonesi.

Figure 1: A scheme of NeuralPFEM framework during the prediction phase.
Figure 2: A schematic of the GNN architecture during the prediction phase. The model leverages mesh connectivity to ...
Figure 3: Standard self-attention architecture during the prediction phase. Here, ...
Figure 4: Linear self-attention architecture during the prediction phase. By removing the softmax nonlinearity, attention ...
Figure 5: 2D inclined plane benchmark: initial configuration (left) and final equilibrium state (right).
Figure 6: Temporal snapshots of NeuralPFEM with standard self-attention predictions (top row of each pair) and ...
Figure 7: Comparison of the normalized runout evolution over time obtained with PFEM ( ...
Figure 8: 3D Bingham cone slump experiment: initial configuration (left) and final equilibrium state (right).
Figure 9: Temporal snapshots of NeuralPFEM with standard self-attention predictions (top row of each pair) and ...
Figure 10: 3D casting experiment: the fluid, initially suspended above the container, is released and flows under gravity ...
Figure 11: Temporal snapshots of NeuralPFEM with linear self-attention predictions and reference PFEM solutions ...
Original abstract

High-fidelity simulations of free-surface flows using Lagrangian methods such as the Particle Finite Element Method (PFEM) are computationally demanding due to continuous domain updates and repeated solution of the governing equations. This challenge is further amplified by non-Newtonian rheologies, where material nonlinearities increase computational cost. These limitations motivate the development of efficient surrogate models to approximate PFEM dynamics at reduced cost. While data-driven deep learning approaches are promising, a key challenge is designing models that operate on arbitrary and evolving geometries. We propose a self-attention-based neural surrogate for PFEM simulations of free-surface flows. The architecture leverages attention mechanisms to model node interactions and capture complex spatial dependencies, while preserving the PFEM mesh discretization. This provides a geometric and topological framework for remeshing and node redistribution, maintaining high-quality spatial discretization during rollouts, improving long-term stability, and enabling reconstruction of derived mechanical quantities via standard finite element operators. Two attention formulations are considered: a standard self-attention mechanism and a linear variant that reduces computational cost and improves scalability. The models are evaluated on two- and three-dimensional free-surface flow benchmarks with evolving geometries, varying material parameters, and non-Newtonian fluids. Results show accurate prediction of transient dynamics and final configurations, with significantly improved scalability. The mesh-based formulation also enables direct reconstruction of quantities such as stress fields. Overall, the framework provides an accurate and scalable surrogate strategy for PFEM simulations in engineering-scale applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a self-attention-based neural surrogate for Particle Finite Element Method (PFEM) simulations of free-surface flows. It employs standard and linear self-attention layers to model node interactions on arbitrary evolving meshes, preserving the PFEM discretization to support remeshing, long-term stability, and reconstruction of derived mechanical quantities via standard FE operators. The models are evaluated on 2D and 3D benchmarks involving varying material parameters and non-Newtonian fluids, with claims of accurate transient and final-state predictions plus significantly improved scalability.

Significance. If the quantitative claims hold, the work would supply a practical mesh-preserving surrogate that reduces the cost of repeated PFEM solves while retaining the ability to apply standard remeshing and post-process stress/strain fields, which is a concrete advantage over purely particle- or grid-based neural surrogates for engineering free-surface problems.

major comments (2)
  1. [Abstract] Abstract and evaluation description: the central claim that the surrogate 'accurately predicts transient dynamics and final configurations' and 'significantly improved scalability' is asserted without any reported error metrics (e.g., L2 velocity or interface error), baseline comparisons against PFEM or other surrogates, or validation protocols for long-horizon rollouts. This absence directly undermines assessment of the accuracy and stability assertions.
  2. [Architecture / Model description] The architecture description does not specify any explicit mechanism (regularization, loss term, or architectural constraint) that would enforce mesh-quality metrics, volume preservation, or divergence-free conditions on the evolving node sets. Because standard self-attention supplies no such guarantees, the claim that the model 'maintains high-quality spatial discretization during rollouts' without post-hoc rules remains unsecured.
minor comments (1)
  1. [Abstract] The abstract is unusually long and contains several forward-looking claims that would be better placed in the conclusions or results summary.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below.

Point-by-point responses
  1. Referee: [Abstract] Abstract and evaluation description: the central claim that the surrogate 'accurately predicts transient dynamics and final configurations' and 'significantly improved scalability' is asserted without any reported error metrics (e.g., L2 velocity or interface error), baseline comparisons against PFEM or other surrogates, or validation protocols for long-horizon rollouts. This absence directly undermines assessment of the accuracy and stability assertions.

    Authors: The abstract is a high-level summary. Quantitative error metrics (L2 norms on velocity and interface position), direct PFEM baseline comparisons, and long-horizon rollout statistics are reported in Sections 4 and 5 of the manuscript. We will revise the abstract to include representative quantitative values and a brief statement on the validation protocol. revision: yes

  2. Referee: [Architecture / Model description] The architecture description does not specify any explicit mechanism (regularization, loss term, or architectural constraint) that would enforce mesh-quality metrics, volume preservation, or divergence-free conditions on the evolving node sets. Because standard self-attention supplies no such guarantees, the claim that the model 'maintains high-quality spatial discretization during rollouts' without post-hoc rules remains unsecured.

    Authors: The surrogate is formulated to output updated nodal positions and velocities on the existing PFEM mesh connectivity. This design choice deliberately retains the PFEM discretization so that the standard PFEM remeshing and node-redistribution algorithms can be applied after each surrogate step exactly as in the original solver. No additional regularization terms for volume or divergence are present in the loss; the network is trained to reproduce the PFEM trajectories, which already satisfy these constraints. The claim of maintained discretization quality therefore rests on the ability to invoke the existing PFEM remeshing machinery rather than on an internal architectural guarantee. revision: partial
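
The loop described in this response, a surrogate update followed by the existing remeshing stage, can be sketched as below. Everything here is a stand-in: `toy_step` is a ballistic update rather than a trained network, `toy_remesh` is a trivial reconnection rather than PFEM remeshing, and the node count is assumed fixed, which real PFEM node redistribution need not respect.

```python
import numpy as np

GRAVITY = np.array([0.0, -9.81])  # assumed 2D gravity vector, illustration only

def rollout(positions, velocities, surrogate_step, remesh, n_steps):
    """Autoregressive rollout: surrogate update, then remeshing, repeated."""
    connectivity = remesh(positions)
    trajectory = [positions.copy()]
    for _ in range(n_steps):
        positions, velocities = surrogate_step(positions, velocities, connectivity)
        connectivity = remesh(positions)   # stand-in for the PFEM remeshing stage
        trajectory.append(positions.copy())
    return np.stack(trajectory), connectivity

def toy_step(positions, velocities, connectivity, dt=0.01):
    """Hypothetical surrogate: free fall under gravity, ignores connectivity."""
    velocities = velocities + dt * GRAVITY
    return positions + dt * velocities, velocities

def toy_remesh(positions):
    """Hypothetical mesher: chain of 2-node segments ordered by x-coordinate."""
    order = np.argsort(positions[:, 0])
    return np.stack([order[:-1], order[1:]], axis=1)

rng = np.random.default_rng(0)
positions = rng.random(size=(10, 2))
velocities = np.zeros_like(positions)
trajectory, connectivity = rollout(positions, velocities, toy_step, toy_remesh, n_steps=5)
```

The structure makes the division of labour explicit: the learned model only advances nodal states, and any guarantee on mesh quality comes from whatever `remesh` does, not from the network.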

Circularity Check

0 steps flagged

No significant circularity; standard data-driven surrogate

full rationale

The paper describes a neural surrogate architecture (self-attention or linear attention) trained on PFEM simulation trajectories to predict node positions and velocities on evolving meshes. No load-bearing equations, uniqueness theorems, or ansatzes are presented that reduce a claimed prediction to a fitted input or self-citation by construction. Evaluation relies on empirical rollout accuracy against held-out PFEM runs; the mesh-based output enables post-hoc FE operators but does not redefine or tautologically derive any mechanical quantity. This is the expected non-finding for a purely empirical ML modeling paper.
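
As an illustration of what a post-hoc finite element operator looks like, the sketch below recovers a deviatoric stress on one linear triangle from nodal velocities. It assumes a 2D P1 element and a constant Newtonian viscosity, which is a simplification: the paper treats non-Newtonian fluids, whose viscosity depends on the strain rate, and its element type is not stated on this page. The function names are hypothetical.

```python
import numpy as np

def p1_shape_gradients(coords):
    """Constant shape-function gradients of a linear triangle.

    coords: (3, 2) vertex coordinates. Returns (3, 2); row i is grad N_i.
    """
    a = np.hstack([np.ones((3, 1)), coords])   # rows [1, x_i, y_i]
    coeffs = np.linalg.inv(a)                  # column i holds N_i = c0 + c1*x + c2*y
    return coeffs[1:, :].T

def deviatoric_stress(coords, nodal_velocity, viscosity):
    """Newtonian deviatoric stress 2 * viscosity * D on one element."""
    grads = p1_shape_gradients(coords)
    vel_grad = nodal_velocity.T @ grads        # L[a, b] = d u_a / d x_b
    strain_rate = 0.5 * (vel_grad + vel_grad.T)
    return 2.0 * viscosity * strain_rate

# Simple shear u = (shear_rate * y, 0) sampled at the vertices of a unit triangle.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
shear_rate, viscosity = 2.0, 3.0
nodal_velocity = np.column_stack([shear_rate * coords[:, 1], np.zeros(3)])
tau = deviatoric_stress(coords, nodal_velocity, viscosity)
```

For this simple shear field the only nonzero component is the shear stress, equal to `viscosity * shear_rate`. The same operator applies whether the nodal velocities come from a PFEM solve or from a surrogate, which is the sense in which the mesh-based output supports reconstruction.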

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities; the model is described as a standard neural architecture trained on simulation data.

