
arxiv: 2606.11140 · v1 · pith:BWMDUDUEnew · submitted 2026-06-09 · ⚛️ physics.geo-ph · cs.AI · cs.LG · stat.AP · stat.ML

Data assimilation for subsurface flow using latent diffusion model parameterization: performance of ensemble-Kalman and Monte Carlo techniques

Pith reviewed 2026-06-27 10:33 UTC · model grok-4.3

classification ⚛️ physics.geo-ph · cs.AI · cs.LG · stat.AP · stat.ML
keywords data assimilation · latent diffusion models · ensemble smoother · Markov chain Monte Carlo · sequential Monte Carlo · subsurface flow · geological modeling · uncertainty quantification

The pith

Monte Carlo sampling in latent space outperforms ensemble Kalman methods for subsurface data assimilation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares data assimilation algorithms for calibrating large 3D channelized geological models to well observations while keeping posterior models geologically plausible. Latent diffusion models reduce the problem dimension by mapping to a low-dimensional latent space, but this mapping is highly nonlinear and can limit the effectiveness of Kalman-gain updates. Model-space ESMDA reduces uncertainty substantially yet yields unrealistic models, whereas latent-space ESMDA preserves realism at the cost of limited uncertainty reduction. MCMC and SMC performed in the same latent space, supported by a fast surrogate flow model, produce lower data mismatch and greater uncertainty reduction than latent-space ESMDA and remain consistent with each other.

Core claim

MCMC and SMC are consistent with one another and achieve lower data mismatch and more uncertainty reduction than latent-space ESMDA, while all models maintain geological realism because of the LDM parameterization. This indicates that ensemble Kalman methods may overestimate posterior uncertainty under highly nonlinear parameterizations, whereas rigorous Monte Carlo sampling, enabled by fast surrogate models, offers a more reliable alternative.

What carries the argument

Latent diffusion model parameterization that maps high-dimensional geological models to a low-dimensional latent variable, enabling dimensionality reduction of the inverse problem while preserving plausibility of posterior geomodels.

If this is right

  • Model-space ESMDA achieves significant uncertainty reduction but produces geologically unrealistic posterior models.
  • Latent-space ESMDA preserves geological realism but exhibits limited uncertainty reduction.
  • MCMC and SMC remain consistent with each other across the test cases.
  • Rigorous Monte Carlo sampling provides a more reliable alternative to ensemble Kalman methods when the parameterization is highly nonlinear.
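
The Kalman-gain update behind the ESMDA results in the bullets above can be sketched in a few lines of NumPy. This is a generic textbook-style ES-MDA with perturbed observations, not the paper's implementation. The `forward` callable, the ensemble size, the toy observation operator, and the inflation coefficients are all illustrative assumptions. For a linear forward model the update behaves well; the bullets concern what happens when the parameter-to-data map (here, the LDM decoder plus flow model) is strongly nonlinear.

```python
import numpy as np

def esmda(m_prior, forward, d_obs, c_d, alphas, rng):
    """Sketch of ES-MDA with perturbed observations.

    m_prior : (n_param, n_ens) prior ensemble (model-space or latent variables)
    forward : hypothetical callable mapping (n_param, n_ens) -> (n_data, n_ens)
    d_obs   : (n_data,) observed data
    c_d     : (n_data,) diagonal of the observation-error covariance
    alphas  : inflation coefficients; their reciprocals must sum to 1
    """
    if not np.isclose(sum(1.0 / a for a in alphas), 1.0):
        raise ValueError("sum of 1/alpha must equal 1")
    m = m_prior.copy()
    n_ens = m.shape[1]
    for alpha in alphas:
        d = forward(m)
        dm = m - m.mean(axis=1, keepdims=True)   # parameter anomalies
        dd = d - d.mean(axis=1, keepdims=True)   # predicted-data anomalies
        c_md = dm @ dd.T / (n_ens - 1)           # cross-covariance
        c_dd = dd @ dd.T / (n_ens - 1)           # data auto-covariance
        # Perturb observations with noise inflated by alpha.
        noise = rng.standard_normal(d.shape) * np.sqrt(alpha * c_d)[:, None]
        innovation = d_obs[:, None] + noise - d
        # Kalman-gain update: m += C_md (C_dd + alpha C_D)^{-1} innovation
        m = m + c_md @ np.linalg.solve(c_dd + alpha * np.diag(c_d), innovation)
    return m

# Toy linear check (assumed setup): observe the first three of five parameters.
rng = np.random.default_rng(0)

def observe_first_three(m):
    return m[:3]

d_obs = np.array([1.0, -1.0, 2.0])
c_d = np.full(3, 0.01)
m_prior = rng.standard_normal((5, 200))
m_post = esmda(m_prior, observe_first_three, d_obs, c_d, [4.0] * 4, rng)

prior_misfit = np.linalg.norm(m_prior[:3].mean(axis=1) - d_obs)
post_misfit = np.linalg.norm(m_post[:3].mean(axis=1) - d_obs)
```

In this linear toy case the posterior ensemble mean moves close to the observations and the spread of the observed parameters shrinks. The same code applied to latent variables only changes what `forward` wraps.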

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Fast surrogate models can make rigorous sampling feasible for problems where direct simulation is computationally prohibitive.
  • The observed overestimation of uncertainty by ensemble Kalman methods may affect risk assessments in applications such as reservoir management.
  • The latent-space Monte Carlo approach could be tested on other nonlinear inverse problems that admit low-dimensional plausible parameterizations.
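
As an illustration of what latent-space Monte Carlo sampling can look like on another problem, here is a minimal preconditioned Crank-Nicolson (pCN) Metropolis sketch. It is not taken from the paper, whose exact MCMC variant is not stated in this summary. It assumes a standard Gaussian prior on the latent vector, a Gaussian likelihood, and a hypothetical `surrogate(z)` that maps a latent vector to predicted data. The toy surrogate below is linear only so that the result can be checked against a known posterior.

```python
import numpy as np

def neg_log_like(z, surrogate, d_obs, sigma):
    """Gaussian negative log-likelihood (up to a constant)."""
    r = (surrogate(z) - d_obs) / sigma
    return 0.5 * float(r @ r)

def pcn_mcmc(z0, surrogate, d_obs, sigma, n_steps, beta, rng):
    """pCN sampler for a N(0, I) prior on z; returns chain and acceptance rate."""
    z = z0.copy()
    phi = neg_log_like(z, surrogate, d_obs, sigma)
    chain = np.empty((n_steps, z.size))
    accepted = 0
    for i in range(n_steps):
        # The proposal preserves the Gaussian prior, so only the likelihood
        # enters the acceptance ratio.
        z_prop = np.sqrt(1.0 - beta**2) * z + beta * rng.standard_normal(z.size)
        phi_prop = neg_log_like(z_prop, surrogate, d_obs, sigma)
        if np.log(rng.uniform()) < phi - phi_prop:
            z, phi = z_prop, phi_prop
            accepted += 1
        chain[i] = z
    return chain, accepted / n_steps

# Toy check (assumed setup): observe the first latent component with noise 0.5.
# Analytic posterior for that component: mean 0.8, variance 0.2.
rng = np.random.default_rng(0)

def toy_surrogate(z):
    return z[:1]

chain, acc_rate = pcn_mcmc(
    z0=np.zeros(4), surrogate=toy_surrogate, d_obs=np.array([1.0]),
    sigma=0.5, n_steps=20000, beta=0.5, rng=rng,
)
post_mean = chain[2000:, 0].mean()  # discard burn-in
```

The step size `beta` trades acceptance rate against move size. Because every step calls the forward model once, a sampler like this is only practical when that call is cheap, which is the role the surrogate plays.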

Load-bearing premise

The fast surrogate flow model approximates well-rate responses accurately enough to support reliable MCMC and SMC sampling in the LDM latent space.

What would settle it

A test case rerun with the full physics flow simulator instead of the surrogate, checking whether MCMC and SMC still produce lower data mismatch and greater uncertainty reduction than latent-space ESMDA.
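
A rerun of this kind needs two numbers per method: a data-mismatch score and a measure of uncertainty reduction. The sketch below shows one plausible pair of definitions. They are assumptions for illustration, not the metrics used in the paper: mismatch is the ensemble-mean normalized squared misfit per datum, and uncertainty reduction is one minus the posterior-to-prior ratio of P10-P90 widths. The synthetic ensembles stand in for simulator outputs.

```python
import numpy as np

def data_mismatch(d_pred, d_obs, sigma):
    """Ensemble-mean normalized squared misfit per datum.

    d_pred : (n_ens, n_data) predicted data, d_obs : (n_data,), sigma : noise std
    """
    r = (d_pred - d_obs) / sigma
    return float(np.mean(np.sum(r**2, axis=1)) / d_obs.size)

def uncertainty_reduction(d_prior, d_post):
    """1 - mean ratio of posterior to prior P10-P90 width (1 = full collapse)."""
    def width(d):
        p10, p90 = np.percentile(d, [10, 90], axis=0)
        return p90 - p10
    return float(1.0 - np.mean(width(d_post) / width(d_prior)))

# Synthetic stand-ins (assumed): prior spread 1.0, posterior spread 0.2.
rng = np.random.default_rng(0)
d_obs = np.linspace(-1.0, 1.0, 6)
sigma = 0.2
d_prior = rng.standard_normal((2000, 6))
d_post = d_obs + sigma * rng.standard_normal((2000, 6))

mm_prior = data_mismatch(d_prior, d_obs, sigma)
mm_post = data_mismatch(d_post, d_obs, sigma)
ur = uncertainty_reduction(d_prior, d_post)
```

Computing both scores from full-physics simulator runs of each method's posterior ensemble, rather than from surrogate predictions, is the comparison the paragraph above describes.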

Figures

Figures reproduced from arXiv: 2606.11140 by Guido Di Federico, Louis J. Durlofsky, Wenchao Teng.

Figure 1: Example Petrel realizations and associated scenario parameters (...
Figure 2: Example 3D-LDM generated realizations and associated scenario parameters.
Figure 3: Case 1: posterior geomodels using model-space ESMDA (top row) and latent-space ESMDA ...
Figure 4: Case 1: model-space ESMDA results for selected wells, with localization (solid lines) and ...
Figure 5: Case 1: latent-space ESMDA results for selected wells, with localization (solid lines) and ...
Figure 6: Case 1: model-space (black) and latent-space (green) ESMDA results, with localization, for ...
Figure 7: Schematic of the flow surrogate model. Each block represents the output of the corresponding ...
Figure 8: Relative errors for water injection, oil production, and water production rates for the surrogate ...
Figure 9: Scatter plots comparing surrogate model and simulator results for cumulative water injection, ...
Figure 10: Field-level flow statistics from the simulator (red curves) and surrogate model (blue curves) ...
Figure 11: Randomly selected posterior realizations obtained using ESMDA, MCMC, and SMC (Case 1).
Figure 12: DA results for selected wells (Case 1, DA for all methods performed in 3D-LDM latent space).
Figure 13: Randomly selected posterior realizations obtained using ESMDA, MCMC, and SMC for ...
Figure 14: DA results for field rates for Case 2 (left column) and Case 3 (right column). DA for all ...
Original abstract

Data assimilation (DA) in subsurface flow entails calibrating model parameters to match observed data, typically at wells, while preserving geological realism. Latent diffusion models (LDMs) provide efficient mappings from high-dimensional geological model space to a low-dimensional latent variable, reducing the dimensionality of the inverse problem while maintaining plausibility in posterior geomodels. However, the high nonlinearity in the LDM mapping may degrade the performance of Kalman-gain-based ensemble updates. We present a systematic comparison of DA algorithms applied to large-scale 3D channelized geomodels with hierarchical geological uncertainty. We compare model-space and latent-space DA using the ensemble smoother with multiple data assimilation (ESMDA), and demonstrate a key trade-off: model-space updates achieve significant uncertainty reduction but produce geologically unrealistic posterior models, while latent-space updates preserve realism but exhibit limited uncertainty reduction. Motivated by this, we explore rigorous Markov chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) algorithms in the 3D-LDM latent space. To accommodate their high computational demands, we develop a fast surrogate flow model that approximates well-rate responses. MCMC and SMC are evaluated against ESMDA across three synthetic test cases, with DA performed in the LDM latent space. All models maintain geological realism due to the LDM parameterization. MCMC and SMC are consistent with one another and achieve lower data mismatch and more uncertainty reduction than latent-space ESMDA. Our overall results demonstrate that ensemble Kalman methods may provide overestimated posterior uncertainty with highly nonlinear parameterizations, while rigorous Monte Carlo sampling, enabled by fast surrogate models, can provide a more reliable alternative.
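
The abstract names Sequential Monte Carlo as the second rigorous sampler. A common way to build one is likelihood tempering: particles start at the prior and are reweighted, resampled, and moved through a sequence of intermediate targets until the full posterior is reached. The sketch below is a generic version under stated assumptions (standard Gaussian latent prior, adaptive temperatures chosen by effective sample size, multinomial resampling, pCN move steps). The paper's actual SMC design is not described in this summary, and `toy_neg_log_like` is a hypothetical stand-in for the surrogate-based misfit.

```python
import numpy as np

def effective_sample_size(log_w):
    w = np.exp(log_w - log_w.max())
    return w.sum() ** 2 / (w ** 2).sum()

def next_temperature(phi, lam, target_ess):
    """Bisect for the next temperature so the incremental weights keep ESS near target."""
    if effective_sample_size(-(1.0 - lam) * phi) >= target_ess:
        return 1.0
    lo, hi = lam, 1.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if effective_sample_size(-(mid - lam) * phi) < target_ess:
            hi = mid
        else:
            lo = mid
    return hi

def smc_sampler(neg_log_like, n_particles, dim, rng,
                ess_frac=0.5, beta=0.3, n_moves=5):
    """Tempered SMC; neg_log_like maps (n, dim) -> (n,)."""
    z = rng.standard_normal((n_particles, dim))   # draw from N(0, I) prior
    phi = neg_log_like(z)
    lam, schedule = 0.0, [0.0]
    while lam < 1.0:
        new_lam = next_temperature(phi, lam, ess_frac * n_particles)
        log_w = -(new_lam - lam) * phi             # incremental weights
        w = np.exp(log_w - log_w.max())
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        z, phi, lam = z[idx], phi[idx], new_lam    # multinomial resampling
        schedule.append(lam)
        for _ in range(n_moves):                   # pCN moves at temperature lam
            prop = np.sqrt(1.0 - beta**2) * z + beta * rng.standard_normal(z.shape)
            phi_prop = neg_log_like(prop)
            accept = np.log(rng.uniform(size=n_particles)) < lam * (phi - phi_prop)
            z[accept] = prop[accept]
            phi[accept] = phi_prop[accept]
    return z, schedule

# Toy check (assumed setup): first component observed as 1.0 with noise 0.5,
# giving an analytic posterior mean of 0.8 and variance of 0.2.
def toy_neg_log_like(z):
    return 0.5 * ((z[:, 0] - 1.0) / 0.5) ** 2

rng = np.random.default_rng(1)
particles, schedule = smc_sampler(toy_neg_log_like, 2000, 4, rng)
post_mean = particles[:, 0].mean()
post_var = particles[:, 0].var()
```

Because all particles are evaluated together at each step, the forward-model calls batch naturally, which suits a neural surrogate.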

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript compares data assimilation (DA) methods for subsurface flow calibration using latent diffusion models (LDMs) to parameterize large-scale 3D channelized geomodels. It first contrasts model-space and latent-space ensemble smoother with multiple data assimilation (ESMDA), noting a trade-off between uncertainty reduction (favoring model space) and geological realism (favoring latent space). It then develops Markov chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC) samplers in the LDM latent space, supported by a fast surrogate flow model for well-rate responses, and reports that MCMC and SMC are consistent with each other, yield lower data mismatch, and achieve greater uncertainty reduction than latent-space ESMDA while preserving realism. The central conclusion is that ensemble Kalman methods may overestimate posterior uncertainty under highly nonlinear parameterizations, whereas rigorous Monte Carlo sampling with surrogates offers a more reliable alternative.

Significance. If the surrogate fidelity holds, the work provides a concrete demonstration that ensemble Kalman updates can be suboptimal for LDM-style nonlinear mappings and that surrogate-enabled MCMC/SMC can deliver improved posterior statistics on synthetic 3D cases. The systematic three-test-case design with hierarchical geological uncertainty and the explicit comparison of model- versus latent-space updates are useful contributions to the DA literature in geosciences.

major comments (1)
  1. [Surrogate flow model] The accuracy of the fast surrogate flow model for well-rate responses is load-bearing for all claims about MCMC and SMC performance (abstract and the paragraph on computational demands). Any systematic bias in the surrogate, especially in nonlinear regions visited by the chains, would invalidate the reported consistency between MCMC and SMC and their superiority over latent-space ESMDA. The manuscript must supply quantitative validation (e.g., relative L2 errors, scatter plots, or coverage statistics on held-out well-rate data) comparing the surrogate to the full forward model across the relevant latent-space regions; without this, the headline result cannot be assessed.
minor comments (1)
  1. [Abstract] The abstract asserts clear performance differences (lower data mismatch, more uncertainty reduction) but supplies no numerical values, error bars, or effect sizes; adding these would strengthen the summary.
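
The validation requested in the major comment can be sketched as three simple diagnostics on held-out samples: per-sample relative L2 error, an R² score to accompany a scatter plot, and a coverage fraction. The coverage definition here (share of simulator values within `k` observation-noise standard deviations of the surrogate) is one possible reading of the referee's request, not a definition from the paper. The arrays are synthetic placeholders for simulator and surrogate well rates.

```python
import numpy as np

def relative_l2_error(sur, sim):
    """Per-sample relative L2 error; inputs are (n_samples, n_outputs)."""
    return np.linalg.norm(sur - sim, axis=1) / np.linalg.norm(sim, axis=1)

def r_squared(sur, sim):
    """Coefficient of determination of surrogate vs. simulator values."""
    ss_res = np.sum((sim - sur) ** 2)
    ss_tot = np.sum((sim - sim.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def coverage(sur, sim, sigma, k=2.0):
    """Fraction of simulator values within k*sigma of the surrogate."""
    return float(np.mean(np.abs(sur - sim) <= k * sigma))

# Synthetic placeholders (assumed): 50 held-out samples, 12 well-rate outputs,
# surrogate equal to the simulator plus 2% multiplicative noise.
rng = np.random.default_rng(0)
sim = rng.uniform(100.0, 200.0, size=(50, 12))
sur = sim * (1.0 + 0.02 * rng.standard_normal(sim.shape))

rel_err = relative_l2_error(sur, sim)
r2 = r_squared(sur, sim)
cov = coverage(sur, sim, sigma=0.02 * sim)
```

To address the referee's concern about bias in the regions the chains visit, the held-out samples would be drawn from posterior latent samples rather than only from the prior.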

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We agree that the surrogate flow model's accuracy is critical to the MCMC and SMC results and will incorporate the requested quantitative validation in the revised version.

Point-by-point responses
  1. Referee: [Surrogate flow model] The accuracy of the fast surrogate flow model for well-rate responses is load-bearing for all claims about MCMC and SMC performance (abstract and the paragraph on computational demands). Any systematic bias in the surrogate, especially in nonlinear regions visited by the chains, would invalidate the reported consistency between MCMC and SMC and their superiority over latent-space ESMDA. The manuscript must supply quantitative validation (e.g., relative L2 errors, scatter plots, or coverage statistics on held-out well-rate data) comparing the surrogate to the full forward model across the relevant latent-space regions; without this, the headline result cannot be assessed.

    Authors: We agree that the surrogate's fidelity is essential for the validity of the MCMC/SMC comparisons and that the current manuscript does not provide the requested quantitative metrics. In the revised manuscript we will add a new subsection (and supporting figures) that reports relative L2 errors, scatter plots of predicted versus simulated well rates, and coverage statistics on held-out latent-space samples drawn from the same regions explored by the chains. These diagnostics will explicitly address potential bias in nonlinear regimes.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical method comparison on synthetic cases

Full rationale

The paper reports numerical experiments comparing ESMDA, MCMC, and SMC applied to LDM-parameterized geomodels on three synthetic test cases. All performance claims (data mismatch, uncertainty reduction, consistency between MCMC/SMC) are obtained by direct forward evaluation of the surrogate and true flow models against held-out observations. No derivation reduces a claimed result to a fitted parameter or self-citation by construction; the surrogate accuracy is an explicit modeling assumption rather than a tautology. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Review performed on the abstract only; the full text was unavailable, so the ledger is necessarily incomplete. The central claim rests on the domain assumption that LDMs preserve geological plausibility and on the accuracy of the surrogate model.

axioms (1)
  • domain assumption: Latent diffusion models provide efficient mappings from high-dimensional geological model space to low-dimensional latent space while maintaining plausibility in posterior geomodels.
    Stated directly in the abstract as the basis for dimensionality reduction.
invented entities (1)
  • fast surrogate flow model (no independent evidence)
    purpose: Approximates well-rate responses to enable computationally feasible MCMC and SMC in latent space.
    Developed within the paper to address the high computational demands of Monte Carlo methods.


