
arxiv: 2606.17180 · v1 · pith:2DDJOHSBnew · submitted 2026-06-15 · 💻 cs.LG

Towards Fast GNN Surrogates for CO2 Migration in Complex Geological Formations

Pith reviewed 2026-06-27 03:54 UTC · model grok-4.3

classification 💻 cs.LG
keywords graph neural networks · CO2 storage · surrogate modeling · multiphase flow · geological formations · SPE11A benchmark · anisotropic message passing · autoregressive forecasting

The pith

A graph neural network surrogate forecasts CO2 plume migration by representing the geological grid as a transmissibility graph and applying geometry-conditioned anisotropic message passing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a data-driven graph neural network to predict how injected CO2 moves through complex underground rock formations used for long-term storage. It converts the standard SPE11A test case into a graph where each computational cell is a node and flow interactions become edges that carry geometric and transmissibility details. An anisotropic message-passing layer then lets information travel preferentially along physically plausible directions, while an autoregressive residual model advances the state in latent space over multiple time steps. The resulting forecasts match reference solutions for gas saturation and liquid density with errors that grow only moderately across long prediction windows. This suggests the approach could replace slower traditional simulators for repeated monitoring calculations in CO2 storage projects.
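The graph construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: it assumes a uniform 2D Cartesian grid with unit thickness and isotropic cell permeability, computes each edge's transmissibility with the standard two-point flux approximation (half-cell transmissibilities combined in series), and attaches a unit direction vector and a centre-to-centre distance as geometric attributes. The function name `build_transmissibility_graph` and the feature layout are hypothetical; the benchmark geometry and the paper's actual edge features may differ.

```python
def build_transmissibility_graph(perm, dx, dy):
    """Turn a 2D grid of cell permeabilities into a directed edge list.

    perm: list of rows; perm[r][c] is an isotropic cell permeability (assumption).
    Returns (num_nodes, edges), where each edge is (src, dst, features) and
    features = [transmissibility, unit_x, unit_y, distance].
    """
    n_rows, n_cols = len(perm), len(perm[0])
    edges = []
    for r in range(n_rows):
        for c in range(n_cols):
            # (row step, column step, shared face length, centre-to-centre distance)
            for dr, dc, face, dist in ((0, 1, dy, dx), (1, 0, dx, dy)):
                rr, cc = r + dr, c + dc
                if rr >= n_rows or cc >= n_cols:
                    continue  # no neighbour on this side
                # Two-point flux approximation: half-cell transmissibilities in series.
                t_a = perm[r][c] * face / (dist / 2)
                t_b = perm[rr][cc] * face / (dist / 2)
                trans = t_a * t_b / (t_a + t_b) if (t_a + t_b) > 0 else 0.0
                src, dst = r * n_cols + c, rr * n_cols + cc
                # Store both directions so messages can flow either way.
                edges.append((src, dst, [trans, float(dc), float(dr), dist]))
                edges.append((dst, src, [trans, float(-dc), float(-dr), dist]))
    return n_rows * n_cols, edges


if __name__ == "__main__":
    num_nodes, edge_list = build_transmissibility_graph(
        [[1.0, 1.0], [1.0, 4.0]], dx=1.0, dy=1.0
    )
    print(num_nodes, len(edge_list))
```

Storing each connection twice, with opposite direction vectors, keeps the transmissibility symmetric while letting a downstream layer treat the two flow directions differently.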

Core claim

The proposed end-to-end graph neural surrogate, built on a transmissibility-enriched graph, geometry-conditioned anisotropic message passing, and autoregressive residual training with multi-step supervision, produces competitive forecasts of gas saturation and liquid-phase density on the SPE11A benchmark, with cumulative errors that remain moderate over extended forecasting horizons.

What carries the argument

Geometry-conditioned anisotropic message passing on a transmissibility graph, where edge embeddings bias aggregation toward physically relevant transport directions, combined with latent-space autoregressive residual prediction.
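To make the idea of direction-biased aggregation concrete, here is a minimal sketch of one message-passing step in which edge features determine how much each neighbour contributes. It is an assumption-laden stand-in, not the layer from the paper: a single hypothetical linear scorer `w_edge` plays the role of the geometry-conditioned edge embedding, scores are softmax-normalised over each receiver's incoming edges, and the node update is a plain residual sum.

```python
from math import exp


def anisotropic_message_pass(h, edges, w_edge):
    """One aggregation step with edge-feature-dependent neighbour weights.

    h: list of node feature vectors.
    edges: list of (src, dst, edge_features).
    w_edge: weights of a hypothetical linear edge-scoring layer.
    """
    # Score every edge from its geometric / transmissibility features.
    scores = [sum(w * f for w, f in zip(w_edge, feat)) for _, _, feat in edges]

    # Per-receiver maximum, used for a numerically stable softmax.
    max_in = {}
    for (_, dst, _), s in zip(edges, scores):
        max_in[dst] = max(s, max_in.get(dst, s))

    # Exponentiate and accumulate the softmax denominator per receiver.
    exps, denom = [], {}
    for (_, dst, _), s in zip(edges, scores):
        e = exp(s - max_in[dst])
        exps.append(e)
        denom[dst] = denom.get(dst, 0.0) + e

    # Residual update: each node adds the weighted sum of its neighbours' features.
    out = [list(v) for v in h]
    for (src, dst, _), e in zip(edges, exps):
        weight = e / denom[dst]
        for k, x in enumerate(h[src]):
            out[dst][k] += weight * x
    return out


if __name__ == "__main__":
    # Node 0 receives from node 1 (x-direction edge) and node 2 (y-direction edge).
    demo_edges = [(1, 0, [1.0, 0.0]), (2, 0, [0.0, 1.0])]
    demo_h = [[0.0], [1.0], [-1.0]]
    print(anisotropic_message_pass(demo_h, demo_edges, [5.0, 0.0]))
```

With `w_edge = [0.0, 0.0]` every incoming edge gets the same weight and the step reduces to isotropic mean aggregation; a nonzero weight on a direction component is what lets the update favour one transport direction.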

If this is right

  • Directional transport arising from grid geometry and permeability contrasts is captured without hand-crafted physics terms.
  • Sharp gas-water interfaces and convective mixing develop naturally from the learned message-passing rules.
  • Key monitoring quantities remain usable for decision support over forecasting windows long enough for practical storage operations.
  • The same graph construction and training recipe can be applied to other geological models that share the same benchmark structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the graph surrogate generalizes across different injection rates or formation properties, it could support ensemble simulations for risk assessment at far lower cost than full-physics runs.
  • The absence of explicit conservation laws may limit reliability when the model is queried far outside the training distribution of geological heterogeneity.
  • Similar graph encodings of transmissibility could be tested on related porous-media problems such as groundwater contaminant transport.

Load-bearing premise

Reformulating the geological domain as a graph whose edges encode transmissibility and geometry, together with anisotropic message passing, is sufficient to reproduce essential multiphase flow features such as sharp interfaces and fingering without any explicit physics equations or regularization.

What would settle it

Running the trained model on the SPE11A benchmark for time horizons beyond those reported and finding that cumulative error in gas saturation grows rapidly or that fingering patterns are absent or misplaced compared with the reference solution.
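A check of this kind needs a per-step error curve over the rollout. The sketch below shows one way to compute it, assuming predicted and reference fields are available as flat lists per time step. The relative L2 norm and the running sum used here are illustrative choices; the page does not state how the paper defines its cumulative error, and the function names are hypothetical.

```python
def relative_l2_per_step(preds, refs):
    """Relative L2 error of each predicted field against its reference field."""
    errors = []
    for pred, ref in zip(preds, refs):
        num = sum((p - r) ** 2 for p, r in zip(pred, ref)) ** 0.5
        den = sum(r * r for r in ref) ** 0.5
        if den > 0:
            errors.append(num / den)
        elif num == 0:
            errors.append(0.0)  # both fields are identically zero
        else:
            errors.append(float("inf"))  # nonzero prediction against a zero reference
    return errors


def running_sum(errors):
    """Cumulative error after each step of the rollout."""
    totals, total = [], 0.0
    for e in errors:
        total += e
        totals.append(total)
    return totals


if __name__ == "__main__":
    reference = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    predicted = [[1.0, 0.0], [1.0, 0.5], [1.0, 1.0]]
    step_errors = relative_l2_per_step(predicted, reference)
    print(step_errors, running_sum(step_errors))
```

Plotting the per-step curve against the forecast horizon, rather than reporting a single average, is what would reveal rapid late-time error growth.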

Figures

Figures reproduced from arXiv: 2606.17180 by Adriano M. A. Cortes, Alexandre G. Evsukoff, Alvaro L. G. A. Coutinho, Fernando A. Rochinha, Herve Gross, Luiz S. L. Neto, Mauricio Araya-Polo, Renato N. Elias, Roberto M. Velho, Rodrigo S. Luna, Thiago H. N. Coelho.

Figure 1: SPE11A grid resolution comparison. Left: liquid-phase density computed using …
Figure 2: Comparison of SPE11A sparse quantities computed with grid resolutions of …
Figure 3: Liquid-phase density at time t = 120 hours for three trajectories (a)-(c), out of ten with randomly generated permeabilities and porosities for the seven facies, and (d) the SPE11A benchmark ground truth. All results are computed with the 2 cm resolution grid.
Figure 4: Left: Mesh for the SPE11A benchmark with 2 cm resolution colored by the CO2 …
Figure 5: Prediction at time t = 900 s (starting inference from t = 0 s) using the model trained during both regimes. Top, gas saturation; bottom, liquid-phase density (kg m⁻³). Note the large absolute errors at early injection time, reflecting the rapid and difficult-to-predict initial CO2 spreading
Figure 6: Prediction at time t = 23,400 s (starting inference from t = 22,500 s) using the model trained during both regimes. Top, gas saturation; bottom, liquid-phase density (kg m⁻³). The absolute error is spatially localized at plume fronts and finger tips, where nonlinear advection dominates.
Figure 7: Liquid-density prediction and absolute error produced by the baseline model at …
Figure 8: Prediction at time t = 225,900 s (starting inference from t = 225,000 s) using the model trained during both regimes. Top, gas saturation; bottom, liquid-phase density (kg m⁻³). The model captures the complex fingering structure with good fidelity in the intermediate dissolution regime.
Figure 9: Prediction at time t = 405,900 s (starting inference from t = 405,000 s), using the model trained during both regimes. Top, gas saturation; bottom, liquid-phase density (kg m⁻³). At late times, when dissolution dominates, predictions remain in close agreement with the ground truth across the full domain.
Original abstract

This chapter discusses how a data-driven machine learning approach can reproduce key aspects of the physical behavior of multiphase flows in complex geological formations. We propose an end-to-end graph neural surrogate tailored to CO$_2$ plume migration forecasting in geological storage. The method is evaluated on the SPE11A benchmark, a well-known industry test case designed to assess CO$_2$ storage scenarios and characterized by sharp gas-water interfaces, strong advective transport, and rapid convective mixing with fingering development. The benchmark is reformulated as a graph in which nodes represent computational cells and edges encode transmissibility-based interactions enriched with geometric attributes. Directional transport arising from grid geometry, permeability contrasts, and geological heterogeneity is captured through an anisotropic message-passing mechanism, where interaction weights are computed via geometry-conditioned edge embeddings, biasing message aggregation toward physically relevant transport directions. Temporal evolution is modeled in latent space using an autoregressive residual formulation trained with multi-step supervision. The proposed model produces competitive forecasts of gas saturation and liquid-phase density, which are key indicators for CO$_2$ storage monitoring, with cumulative errors that remain moderate over extended forecasting horizons.
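The phrase "autoregressive residual formulation trained with multi-step supervision" can be illustrated with a small sketch. It assumes the state is a flat list of numbers and that a hypothetical `step_fn` returns the increment added at each step; the encoder and decoder that map between physical fields and latent space are omitted, and mean squared error is an assumed loss. None of this is taken from the paper's implementation.

```python
def rollout(z0, step_fn, n_steps):
    """Advance a state with residual updates: z_{k+1} = z_k + step_fn(z_k)."""
    states, z = [], list(z0)
    for _ in range(n_steps):
        dz = step_fn(z)
        z = [zi + di for zi, di in zip(z, dz)]
        states.append(z)
    return states


def multi_step_loss(z0, targets, step_fn):
    """Mean squared error averaged over every step of the unrolled prediction.

    Each prediction is fed back as the next input, so the loss penalises
    error accumulation rather than only one-step accuracy.
    """
    preds = rollout(z0, step_fn, len(targets))
    total = 0.0
    for pred, target in zip(preds, targets):
        total += sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
    return total / len(targets)


if __name__ == "__main__":
    # Toy dynamics standing in for a learned network: halve the state each step.
    def decay(z):
        return [-0.5 * x for x in z]

    print(rollout([1.0, 2.0], decay, 2))
    print(multi_step_loss([1.0, 2.0], [[0.5, 1.0], [0.0, 0.0]], decay))
```

Predicting the increment rather than the full next state means that a zero output leaves the state unchanged, which is a common motivation for residual updates in slowly evolving systems.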

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces an end-to-end graph neural network (GNN) surrogate for CO2 plume migration forecasting in geological storage. The SPE11A benchmark is reformulated as a graph with nodes representing computational cells and edges encoding transmissibility-based interactions enriched with geometric attributes. Directional transport is captured via geometry-conditioned anisotropic message passing, and temporal evolution is modeled using an autoregressive residual formulation with multi-step supervision. The central claim is that this approach produces competitive forecasts of gas saturation and liquid-phase density with moderate cumulative errors over extended forecasting horizons.

Significance. If the empirical results hold, the work would contribute to the development of fast machine learning surrogates for complex multiphase flow simulations relevant to CO2 geological storage. Incorporating domain knowledge through transmissibility and geometric attributes in the graph structure is a notable strength that could improve generalization. The use of the public SPE11A benchmark allows for reproducibility and comparison.

major comments (2)
  1. [Abstract] The abstract claims 'competitive forecasts' and 'cumulative errors that remain moderate' without providing any quantitative metrics, error bars, baseline comparisons, training/validation splits, or ablation results. This absence prevents verification that the data supports the stated claim and is load-bearing for the paper's main contribution.
  2. [Abstract] The method is presented as sufficient to reproduce essential multiphase flow behavior including sharp interfaces and fingering without explicit physics constraints. However, no specific evidence or metrics addressing the fidelity of these features (e.g., interface sharpness or fingering patterns) are mentioned, which is critical given that data-driven models often struggle with such hyperbolic features.
minor comments (1)
  1. [Abstract] Consider adding at least one key numerical result to the abstract to make the performance claims more concrete and verifiable.
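The second major comment asks for evidence about interface sharpness. As an illustration of what such a diagnostic could look like, the sketch below compares a predicted and a reference gas-saturation profile along a single line of cells. Both measures (front position by threshold, and largest cell-to-cell jump) are hypothetical examples chosen for simplicity; they are not metrics reported in the paper or requested by name in the report.

```python
def front_position(profile, threshold=0.5):
    """Index of the last cell whose saturation exceeds the threshold, or -1 if none."""
    index = -1
    for i, s in enumerate(profile):
        if s > threshold:
            index = i
    return index


def max_jump(profile):
    """Largest saturation difference between adjacent cells (a crude sharpness proxy)."""
    return max(abs(b - a) for a, b in zip(profile, profile[1:]))


if __name__ == "__main__":
    reference = [0.8, 0.8, 0.8, 0.0, 0.0]  # sharp front after the third cell
    predicted = [0.8, 0.7, 0.5, 0.3, 0.0]  # smeared front
    print(front_position(reference), front_position(predicted))
    print(max_jump(reference), max_jump(predicted))
```

A smeared prediction shows up as a smaller maximum jump and a shifted front index, even when a field-averaged error looks small.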

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the abstract accordingly to better substantiate the claims with quantitative support from the manuscript results.

  1. Referee: [Abstract] The abstract claims 'competitive forecasts' and 'cumulative errors that remain moderate' without providing any quantitative metrics, error bars, baseline comparisons, training/validation splits, or ablation results. This absence prevents verification that the data supports the stated claim and is load-bearing for the paper's main contribution.

    Authors: We agree that the abstract, as a concise summary, would be strengthened by including key quantitative indicators drawn from the results section. The manuscript provides error metrics on gas saturation and liquid density, baseline comparisons, and details on the training procedure and splits. In revision we will update the abstract to incorporate representative quantitative values (e.g., cumulative error magnitudes and comparison outcomes) while preserving brevity. revision: yes

  2. Referee: [Abstract] The method is presented as sufficient to reproduce essential multiphase flow behavior including sharp interfaces and fingering without explicit physics constraints. However, no specific evidence or metrics addressing the fidelity of these features (e.g., interface sharpness or fingering patterns) are mentioned, which is critical given that data-driven models often struggle with such hyperbolic features.

    Authors: The abstract references the benchmark characteristics (sharp interfaces, fingering) and states that the model produces competitive forecasts of the primary fields that encode these phenomena. The geometry-conditioned anisotropic message passing is designed to capture directional transport and heterogeneity without additional physics losses. The results section contains visual comparisons and error analysis that reflect fidelity to these features. We will revise the abstract to briefly note the observed reproduction of interface and fingering behavior as evidenced by the saturation forecasts. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper describes a standard data-driven GNN surrogate trained on external SPE11A simulation data, with architecture choices (anisotropic message passing, autoregressive residuals) presented as design decisions rather than derived quantities. No equations, performance metrics, or central claims reduce by construction to fitted parameters or self-citations within the paper. Evaluation uses public benchmark data with independent ground truth. The derivation chain is self-contained against external simulation benchmarks, consistent with the reader's assessment of score 2.0 as a minor upper bound.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that the chosen graph encoding and message-passing bias capture the dominant transport physics; the only free parameters are the neural network weights fitted to simulation data. No new physical entities are postulated.

free parameters (1)
  • Neural network weights
    The parameters of the GNN layers are fitted during training to reproduce saturation and density fields from the benchmark simulations.
axioms (1)
  • Domain assumption: the grid can be represented as a graph with transmissibility-based edges enriched by geometric attributes that encode directional transport arising from permeability contrasts and heterogeneity.
    This premise is invoked when the benchmark is reformulated as a graph and when interaction weights are computed via geometry-conditioned edge embeddings.

pith-pipeline@v0.9.1-grok · 5789 in / 1293 out tokens · 90583 ms · 2026-06-27T03:54:52.556438+00:00 · methodology

