arxiv: 2605.02405 · v1 · submitted 2026-05-04 · 💻 cs.LG

Closed-Loop CO2 Storage Control With History-Based Reinforcement Learning and Latent Model-Based Adaptation

Sofianos Panagiotis Fotias, Vassilis Gaganis

Pith reviewed 2026-05-09 16:11 UTC · model grok-4.3

classification 💻 cs.LG

keywords: closed-loop CO2 storage · reinforcement learning · history-conditioned policies · latent model adaptation · reservoir simulation · partially observable control · well-level observations

The pith

History-conditioned reinforcement learning recovers nearly all privileged-state performance for CO2 storage control with only well-level data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper formulates CO2 injection and brine-production control as a partially observable sequential decision problem and trains deep reinforcement learning controllers on high-fidelity reservoir simulations. It compares privileged-state, well-only, history-conditioned, masking-curriculum, and asymmetric teacher-student policies to measure the benefit of temporal well-response information. History-conditioned policies recover nearly all privileged-state performance while depending solely on deployable well-level observations. A latent model-based adaptation pipeline reuses nominal latent dynamics to retune controllers under injector failure, leakage-induced shifts, and connectivity changes, and it outperforms direct model-free retuning when the same limited scenario-specific simulation budget is available. This supplies a simulator-budget-aware alternative to repeated online history matching and re-optimization.

Core claim

Closed-loop management of geological CO2 storage can be handled by history-conditioned deep reinforcement learning policies that recover nearly all of the privileged-state performance while using only deployable well-level information. For abnormal cases involving injector failure, leakage, and compartmentalized connectivity, a latent model-based adaptation pipeline that reuses nominal latent dynamics retunes controllers more effectively than direct model-free retuning under the same scenario-specific real-simulator budget.

What carries the argument

History-conditioned policies and a latent model-based retuning pipeline. The pipeline reuses nominal latent dynamics to adapt controllers to changed reservoir conditions using only realistic observations and a limited number of additional simulations.
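To make the idea of history conditioning concrete, the following is a minimal sketch of a rolling buffer that turns recent well-level observations into a fixed-length policy input. It assumes plain frame stacking with zero padding; the paper's actual history encoder is not described on this page and may differ (for example, a recurrent or attention-based model). The class name `WellHistoryBuffer`, the window length, and the observation size are all hypothetical.

```python
from collections import deque


class WellHistoryBuffer:
    """Rolling window of well-level observations (hypothetical helper).

    Keeps the most recent `window` observation vectors and returns them as
    one flat, fixed-length vector, zero-padded until the window has filled.
    """

    def __init__(self, obs_dim: int, window: int):
        self.obs_dim = obs_dim
        self.window = window
        # Pre-fill with zeros so the policy input has a fixed length from step 0.
        self._frames = deque(
            ([0.0] * obs_dim for _ in range(window)), maxlen=window
        )

    def push(self, obs):
        if len(obs) != self.obs_dim:
            raise ValueError("unexpected observation size")
        # With maxlen set, appending drops the oldest frame automatically.
        self._frames.append(list(obs))

    def policy_input(self):
        # Oldest-to-newest concatenation: [o_{t-K+1}, ..., o_t].
        return [x for frame in self._frames for x in frame]


# Illustrative sizes only (assumption): 6 well-level readings, 4-step window.
buf = WellHistoryBuffer(obs_dim=6, window=4)
for t in range(5):
    buf.push([float(t)] * 6)
features = buf.policy_input()
```

In this sketch the policy would consume `features` in place of the dense simulator state, so nothing beyond deployable well readings is needed at decision time.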

Load-bearing premise

High-fidelity reservoir simulations accurately represent real-world reservoir behavior and the latent dynamics model captures the necessary changes under failures, leakage, and connectivity shifts.

What would settle it

Deploy the history-conditioned and latent-adapted controllers on a real CO2 storage site or a physical laboratory analog under documented injector failure or leakage conditions and measure whether achieved storage efficiency and safety metrics match the simulation predictions.

Figures

Figures reproduced from arXiv: 2605.02405 by Sofianos Panagiotis Fotias, Vassilis Gaganis.

**Figure 1.** Closed-loop CCS control formulation considered in this work. The reservoir simulator evolves …

**Figure 2.** Training-deployment setting considered in this work. Policies are trained on ensembles of prior …

**Figure 3.** Baseline information regimes. The privileged-state benchmark uses dense simulator fields and …

**Figure 4.** History-conditioned model. Current well observations and a rolling history of public well …

**Figure 5.** Masked-critic curriculum. During training, the critic receives progressively masked spatial sim…

**Figure 6.** Asymmetric teacher-student model. During training, privileged teacher critics use dense spatial …

**Figure 7.** Model-based pipeline used in this work. Public observations are mapped to a deployable latent …

**Figure 8.** Scenario 1 methodology. The actor outputs an 11-dimensional nominal action, after which the …

**Figure 9.** Residual world-model adaptation for Scenarios 2 and 3. Abnormal latent transitions are encoded …

**Figure 10.** Test return as a function of training epoch for the five model-free variants. The well-only base…

**Figure 11.** Scenario 0: nominal model-based retention. Real-environment evaluation return as a function …

**Figure 12.** Scenario 1: adaptation under known injector failure. Real-environment evaluation return as a …

**Figure 13.** Scenario 2: adaptation under leakage-induced dynamics and reward shift. Real-environment …

**Figure 14.** Scenario 3: adaptation under compartmentalized connectivity shift. Real-environment eval…

Original abstract

Closed-loop management of geological CO2 storage requires control policies that adapt to uncertain reservoir behavior while relying on observations that are realistically available during operation. This work formulates CO2 injection and brine-production control as a partially observable sequential decision problem and studies deployable deep reinforcement-learning controllers trained with high-fidelity reservoir simulation. We first compare privileged-state, well-only, history-conditioned, masking-curriculum, and asymmetric teacher-student model-free policies in order to quantify the value of temporal well-response information and training-time privileged simulator states. We then evaluate a latent model-based adaptation pipeline that reuses nominal latent dynamics and retunes controllers under known injector failure, leakage-induced dynamics and reward shift, and compartmentalized reservoir connectivity. The results show that history-conditioned policies recover nearly all of the privileged-state performance while using only deployable well-level information, and that latent model-based retuning outperforms direct model-free retuning under the same scenario-specific real-simulator budget in the abnormal operating cases. The proposed framework therefore provides a simulator-budget-aware alternative to repeated online history matching and re-optimization for closed-loop CO2 storage control.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper formulates CO2 injection and brine-production control as a partially observable Markov decision process and trains deep RL policies on high-fidelity reservoir simulators. It compares privileged-state, well-only, history-conditioned, masking-curriculum, and asymmetric teacher-student policies, claiming that history-conditioned policies recover nearly all privileged-state performance using only deployable well-level observations. It further proposes a latent model-based adaptation pipeline that reuses nominal latent dynamics and retunes controllers for three known abnormality classes (injector failure, leakage-induced dynamics/reward shift, compartmentalized connectivity), reporting that this outperforms direct model-free retuning under a fixed scenario-specific simulator budget.

Significance. If the empirical claims hold under broader validation, the work provides a practical, simulator-budget-aware alternative to repeated online history matching for closed-loop CO2 storage. The demonstration that temporal well-response history suffices to approach privileged performance, together with the latent-adaptation results, directly addresses partial observability and model uncertainty in subsurface control. The explicit focus on deployable observations and limited retuning budgets is a concrete strength that could inform real-world deployment of RL in energy systems.

major comments (2)

[§4, §5.2] §4 (Experimental Setup) and §5.2 (Abnormal Scenario Results): The headline claims that history-conditioned policies recover nearly all privileged performance and that latent retuning outperforms model-free retuning rest entirely on high-fidelity reservoir simulations for three known abnormality classes. No cross-validation against field data, out-of-distribution simulator variants, or unanticipated dynamics (e.g., fault reactivation or multiphase hysteresis) is reported; this is load-bearing because the latent model is reused from the nominal case and only retuned for the tested shifts.
[§5.1, Table 2] §5.1 and Table 2 (Policy Comparison): Performance gains are reported without error bars, confidence intervals, or statistical significance tests across random seeds or reservoir realizations. This makes it impossible to determine whether the reported near-recovery of privileged performance is robust or within noise, directly affecting the central claim about the value of history conditioning.

minor comments (2)

[§3.3] Notation for the latent dynamics model (e.g., how the encoder/decoder are trained and how retuning is performed) is introduced without a clear equation reference or pseudocode, making the adaptation pipeline hard to reproduce from the text alone.
[Abstract, §1] The abstract and §1 claim 'nearly all' recovery but do not quantify the gap (e.g., percentage of cumulative reward or constraint violation) relative to privileged policies; adding these numbers would strengthen the comparison.
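One way the gap requested in the last minor comment could be quantified is sketched below. Both metrics are assumptions chosen for illustration, not definitions taken from the paper: `recovered_fraction` measures how much of the well-only to privileged gap the history-conditioned policy closes, and `relative_shortfall` measures the remaining distance to the privileged return. The numbers are placeholders, not reported results.

```python
def recovered_fraction(history_return, well_only_return, privileged_return):
    """Share of the well-only to privileged gap closed by the history policy."""
    gap = privileged_return - well_only_return
    if gap == 0:
        raise ValueError("privileged and well-only returns are identical")
    return (history_return - well_only_return) / gap


def relative_shortfall(history_return, privileged_return):
    """Remaining shortfall, as a fraction of the privileged return."""
    if privileged_return == 0:
        raise ValueError("privileged return is zero")
    return (privileged_return - history_return) / abs(privileged_return)


# Placeholder values (assumption), not results from the paper.
frac = recovered_fraction(
    history_return=95.0, well_only_return=80.0, privileged_return=100.0
)
short = relative_shortfall(history_return=95.0, privileged_return=100.0)
```

Reporting either quantity alongside the raw returns would let a reader judge what "nearly all" means in practice.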

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below, indicating where we will revise the manuscript to strengthen the presentation and where we provide clarification on the scope of the study.

Referee: [§4, §5.2] §4 (Experimental Setup) and §5.2 (Abnormal Scenario Results): The headline claims that history-conditioned policies recover nearly all privileged performance and that latent retuning outperforms model-free retuning rest entirely on high-fidelity reservoir simulations for three known abnormality classes. No cross-validation against field data, out-of-distribution simulator variants, or unanticipated dynamics (e.g., fault reactivation or multiphase hysteresis) is reported; this is load-bearing because the latent model is reused from the nominal case and only retuned for the tested shifts.

Authors: We agree that the evaluation relies on high-fidelity reservoir simulations for three representative abnormality classes (injector failure, leakage-induced shifts, and compartmentalization). This is standard practice in the field given the prohibitive cost and limited availability of real CO2 storage field data for controlled experimentation. The latent adaptation pipeline is explicitly designed to reuse nominal dynamics and retune only for known shift classes under a fixed simulator budget, which is the central methodological contribution. We will revise §5.2 and add a new limitations paragraph to explicitly state that results are conditioned on the tested shift classes, discuss the challenges of unanticipated dynamics (e.g., fault reactivation), and outline how online model adaptation could be extended in future work. No field-data cross-validation is feasible within the current scope.
revision: partial
Referee: [§5.1, Table 2] §5.1 and Table 2 (Policy Comparison): Performance gains are reported without error bars, confidence intervals, or statistical significance tests across random seeds or reservoir realizations. This makes it impossible to determine whether the reported near-recovery of privileged performance is robust or within noise, directly affecting the central claim about the value of history conditioning.

Authors: We acknowledge the omission of variability measures. In the revised manuscript we will re-run all policy comparisons across at least five independent random seeds and multiple reservoir realizations (where the simulator permits), report mean performance with standard deviation or 95% confidence intervals in Table 2 and the associated figures, and include paired statistical significance tests (e.g., Wilcoxon or t-tests) between history-conditioned and baseline policies to substantiate the claim that history conditioning recovers nearly all privileged-state performance.
revision: yes
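The reporting promised in this response could look roughly like the sketch below, which uses only the Python standard library. It computes the mean, sample standard deviation, and a 95% Student-t interval across seeds, then runs an exact paired sign-flip permutation test. The sign-flip test is swapped in here as a dependency-free stand-in for the Wilcoxon or t-tests the authors mention; the function names and the per-seed returns are hypothetical.

```python
import itertools
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom (n - 1).
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def mean_ci95(values):
    """Return (mean, sample std, half-width of the 95% t-interval)."""
    n = len(values)
    if n - 1 not in T_CRIT_95:
        raise ValueError("table covers 2 to 10 samples only")
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    half = T_CRIT_95[n - 1] * std / math.sqrt(n)
    return mean, std, half


def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    # Enumerate every assignment of signs to the paired differences.
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:
            hits += 1
    return hits / total


# Placeholder per-seed returns (assumption), paired by seed.
history = [98.0, 97.0, 99.0, 96.0, 100.0]
well_only = [90.0, 91.0, 89.0, 92.0, 88.0]
mean, std, half = mean_ci95(history)
p_value = paired_sign_flip_pvalue(history, well_only)
```

With five paired seeds the smallest two-sided p-value this exact test can return is 2/32 = 0.0625, and the same floor applies to an exact Wilcoxon signed-rank test, so clearing a 0.05 threshold with such tests would need more than five seeds.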

Circularity Check

0 steps flagged

No significant circularity; empirical RL results in simulators

The paper's core claims rest on training and evaluating RL policies (privileged, history-conditioned, latent model-based adaptation) inside high-fidelity reservoir simulators for nominal and abnormal scenarios. Performance comparisons and adaptation advantages are obtained by direct simulation rollouts under fixed budgets, not by any self-referential definition, fitted parameter renamed as prediction, or load-bearing self-citation chain. The derivation chain consists of standard MDP formulation, policy optimization, and empirical benchmarking; no equation or result reduces to its inputs by construction. Minor self-citations, if present, are not load-bearing for the reported outcomes.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The work rests on standard RL training assumptions and the fidelity of reservoir simulators. No new physical entities are introduced.

free parameters (1)

RL training hyperparameters (learning rates, network architectures, reward weights)
Typical in deep RL but unspecified in abstract; affect policy performance and adaptation results.

axioms (1)

domain assumption: High-fidelity reservoir simulations accurately capture real CO2 storage dynamics including failures and leaks
Invoked throughout training, evaluation, and adaptation pipeline.


Reference graph

Works this paper leans on

77 extracted references · 3 canonical work pages · 2 internal anchors

[1]

Technologies and infrastructures underpinning future co2 value chains: A comprehensive review and comparative analysis.Renewable and Sustainable Energy Re- views, 85:46–68, 2018

Sean M Jarvis and Sheila Samsatli. Technologies and infrastructures underpinning future co2 value chains: A comprehensive review and comparative analysis.Renewable and Sustainable Energy Re- views, 85:46–68, 2018

2018
[2]

Paolo Gabrielli, Matteo Gazzani, and Marco Mazzotti. The role of carbon capture and utilization, carbon capture and storage, and biomass to enable a net-zero-co2 emissions chemical industry.In- dustrial & Engineering Chemistry Research, 59(15):7033–7045, 2020. 25

2020
[3]

The role of carbon capture and storage (ccs) technologies in a net-zero carbon future

Mai Bui, Graeme Douglas Puxty, Matteo Gazzani, Salman Masoudi Soltani, and Carlos Pozo. The role of carbon capture and storage (ccs) technologies in a net-zero carbon future. 2021

2021
[4]

Introduction to geological storage.Carbon Capture and Storage; Elsevier: Amsterdam, The Netherlands, pages 285–304, 2017

SA Rackley and SA Rackley. Introduction to geological storage.Carbon Capture and Storage; Elsevier: Amsterdam, The Netherlands, pages 285–304, 2017

2017
[5]

Criteria for co2 storage in geological formations.Podzemni radovi, (32):61–74, 2018

Lola Tomi´ c, Vesna Karovi´ c Mariˇ ci´ c, Duˇ san Danilovi´ c, and Miroslav Crnogorac. Criteria for co2 storage in geological formations.Podzemni radovi, (32):61–74, 2018

2018
[6]

Co2 storage in deep saline aquifers

Xiaoyan Ji and Chen Zhu. Co2 storage in deep saline aquifers. InNovel materials for carbon dioxide mitigation technology, pages 299–332. Elsevier, 2015

2015
[7]

Review of co2 storage efficiency in deep saline aquifers.International Journal of Greenhouse Gas Control, 40:188–202, 2015

Stefan Bachu. Review of co2 storage efficiency in deep saline aquifers.International Journal of Greenhouse Gas Control, 40:188–202, 2015

2015
[8]

Geological storage of co2 in saline aquifers—a review of the experience from existing storage operations.International journal of greenhouse gas control, 4(4):659–667, 2010

Karsten Michael, Alexandra Golab, Valeriya Shulakova, Jonathan Ennis-King, Guy Allinson, Sandeep Sharma, and Toby Aiken. Geological storage of co2 in saline aquifers—a review of the experience from existing storage operations.International journal of greenhouse gas control, 4(4):659–667, 2010

2010
[9]

Co2 storage in depleted or depleting oil and gas fields: what can we learn from existing projects?Energy Procedia, 114:5680–5690, 2017

Sarah Hannis, Jiemin Lu, Andy Chadwick, Sue Hovorka, Karen Kirk, Katherine Romanak, and Jonathan Pearce. Co2 storage in depleted or depleting oil and gas fields: what can we learn from existing projects?Energy Procedia, 114:5680–5690, 2017

2017
[10]

Co 2-eor/sequestration: Current trends and future horizons

Erfan Mohammadian, Badrul Mohamed Jan, Amin Azdarpour, Hossein Hamidi, Nur Hidayati Binti Othman, Aqilah Dollah, Siti Nurliyana Binti Che Mohamed Hussein, and Rozana Azrina Binti Sazali. Co 2-eor/sequestration: Current trends and future horizons. InEnhanced Oil Recovery Processes-New Technologies. IntechOpen, 2019

2019
[11]

Co2 sequestration in depleted oil and gas reservoirs—caprock characterization and storage capacity.Energy Conversion and Management, 47(11-12):1372–1382, 2006

Zhaowen Li, Mingzhe Dong, Shuliang Li, and Sam Huang. Co2 sequestration in depleted oil and gas reservoirs—caprock characterization and storage capacity.Energy Conversion and Management, 47(11-12):1372–1382, 2006

2006
[12]

Carbon capture, utilization, and storage in saline aquifers: Sub- surface policies, development plans, well control strategies and optimization approaches—a review

Ismail Ismail and Vassilis Gaganis. Carbon capture, utilization, and storage in saline aquifers: Sub- surface policies, development plans, well control strategies and optimization approaches—a review. Clean Technologies, 5(2):609–637, 2023

2023
[13]

Code intercomparison builds confidence in numerical simulation models for geologic disposal of co2.Energy, 29(9-10):1431–1444, 2004

Karsten Pruess, Julio Garc´ ıa, Tony Kovscek, Curt Oldenburg, Jonny Rutqvist, Carl Steefel, and Tianfu Xu. Code intercomparison builds confidence in numerical simulation models for geologic disposal of co2.Energy, 29(9-10):1431–1444, 2004

2004
[14]

A benchmark study on problems related to co 2 storage in geologic formations: summary and discussion of the results

Holger Class, Anozie Ebigbo, Rainer Helmig, Helge K Dahle, Jan M Nordbotten, Michael A Celia, Pascal Audigane, Melanie Darcis, Jonathan Ennis-King, Yaqing Fan, et al. A benchmark study on problems related to co 2 storage in geologic formations: summary and discussion of the results. Computational geosciences, 13:409–434, 2009

2009
[15]

Optimal well placement and brine extraction for pressure management during co2 sequestration.International Journal of Greenhouse Gas Control, 42:175–187, 2015

Abdullah Cihan, Jens T Birkholzer, and Marco Bianchi. Optimal well placement and brine extraction for pressure management during co2 sequestration.International Journal of Greenhouse Gas Control, 42:175–187, 2015

2015
[16]

Optimization of well placement, co2 injection rates, and brine cycling for geological carbon sequestration.International Journal of Greenhouse Gas Control, 10:100–112, 2012

David A Cameron and Louis J Durlofsky. Optimization of well placement, co2 injection rates, and brine cycling for geological carbon sequestration.International Journal of Greenhouse Gas Control, 10:100–112, 2012

2012
[17]

Co2 storage in geological media: Role, means, status and barriers to deployment

Stefan Bachu. Co2 storage in geological media: Role, means, status and barriers to deployment. Progress in energy and combustion science, 34(2):254–273, 2008. 26

2008
[18]

The acceptability of co2 capture and storage (ccs) in europe: An assessment of the key determining factors: Part 1

Heleen De Coninck, Todd Flach, Paul Curnow, Peter Richardson, Jason Anderson, Simon Shackley, Gudmundur Sigurthorsson, and David Reiner. The acceptability of co2 capture and storage (ccs) in europe: An assessment of the key determining factors: Part 1. scientific, technical and economic dimensions.International Journal of Greenhouse Gas Control, 3(3):333–...

2009
[19]

Active pressure management through brine production for basin-wide deployment of geologic carbon sequestration.International Journal of Greenhouse Gas Control, 61:155–167, 2017

Karl W Bandilla and Michael A Celia. Active pressure management through brine production for basin-wide deployment of geologic carbon sequestration.International Journal of Greenhouse Gas Control, 61:155–167, 2017

2017
[20]

Pre-injection brine production in co2 storage reservoirs: An approach to augment the development, operation, and performance of ccs while generating water

Thomas A Buscheck, Jeffrey M Bielicki, Joshua A White, Yunwei Sun, Yue Hao, William L Bourcier, Susan A Carroll, and Roger D Aines. Pre-injection brine production in co2 storage reservoirs: An approach to augment the development, operation, and performance of ccs while generating water. International Journal of Greenhouse Gas Control, 54:499–512, 2016

2016
[21]

Estimating the net costs of brine production and disposal to expand pressure-limited dynamic capacity for basin-scale co2 storage in a saline formation

Steven T Anderson and Hossein Jahediesfanjani. Estimating the net costs of brine production and disposal to expand pressure-limited dynamic capacity for basin-scale co2 storage in a saline formation. International Journal of Greenhouse Gas Control, 102:103161, 2020

2020
[22]

Investigation of co2 storage capacity in open saline aquifers with numerical models.Procedia Engineering, 31:886–892, 2012

Yang Wang, Yaqin Xu, and Keni Zhang. Investigation of co2 storage capacity in open saline aquifers with numerical models.Procedia Engineering, 31:886–892, 2012

2012
[23]

Multi-objective optimization

Kalyanmoy Deb, Karthik Sindhya, and Jussi Hakanen. Multi-objective optimization. InDecision sciences, pages 161–200. CRC Press, 2016

2016
[24]

A holistic review on artificial intelligence techniques for well placement optimization problem.Advances in engineering software, 141:102767, 2020

Jahedul Islam, Pandian M Vasant, Berihun Mamo Negash, Moacyr Bartholomeu Laruccia, Myo Myint, and Junzo Watada. A holistic review on artificial intelligence techniques for well placement optimization problem.Advances in engineering software, 141:102767, 2020

2020
[25]

Learning surrogate models for simulation- based optimization.AIChE Journal, 60(6):2211–2227, 2014

Alison Cozad, Nikolaos V Sahinidis, and David C Miller. Learning surrogate models for simulation- based optimization.AIChE Journal, 60(6):2211–2227, 2014

2014
[26]

A Tutorial on Bayesian Optimization

Peter I Frazier. A tutorial on bayesian optimization.arXiv preprint arXiv:1807.02811, 2018

work page internal anchor Pith review arXiv 2018
[27]

An introduction to continuity, extrema, and related topics for general gaussian processes

Robert J Adler. An introduction to continuity, extrema, and related topics for general gaussian processes. IMS, 1990

1990
[28]

Aleatory or epistemic? does it matter?Structural safety, 31(2):105–112, 2009

Armen Der Kiureghian and Ove Ditlevsen. Aleatory or epistemic? does it matter?Structural safety, 31(2):105–112, 2009

2009
[29]

Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods.Machine learning, 110(3):457–506, 2021

Eyke H¨ ullermeier and Willem Waegeman. Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods.Machine learning, 110(3):457–506, 2021

2021
[30]

Optimization of well placement in carbon capture and storage (ccs): Bayesian optimization framework under permutation invariance

Sofianos Panagiotis Fotias, Ismail Ismail, and Vassilis Gaganis. Optimization of well placement in carbon capture and storage (ccs): Bayesian optimization framework under permutation invariance. Applied Sciences, 14(8):3528, 2024

2024
[31]

Improved reservoir management through optimal control and continuous model updating

DR Brouwer, G Nœvdal, JD Jansen, Erland H Vefring, and CPJW Van Kruijsdijk. Improved reservoir management through optimal control and continuous model updating. InSPE Annual Technical Conference and Exhibition?, pages SPE–90149. SPE, 2004

2004
[32]

Optimizing the performance of smart wells in com- plex reservoirs using continuously updated geological models.Journal of Petroleum Science and Engineering, 48(3-4):254–264, 2005

Inegbenose Aitokhuehi and Louis J Durlofsky. Optimizing the performance of smart wells in com- plex reservoirs using continuously updated geological models.Journal of Petroleum Science and Engineering, 48(3-4):254–264, 2005. 27

2005
[33]

Efficient real-time reservoir manage- ment using adjoint-based optimal control and model updating.Computational Geosciences, 10:3–36, 2006

Pallav Sarma, Louis J Durlofsky, Khalid Aziz, and Wen H Chen. Efficient real-time reservoir manage- ment using adjoint-based optimal control and model updating.Computational Geosciences, 10:3–36, 2006

2006
[34]

Closed-loop reservoir management

Jan-Dirk Jansen, SD Douma, Dr R Brouwer, PMJ Van den Hof, OH Bosgra, and AW Heemink. Closed-loop reservoir management. InSPE Reservoir Simulation Conference?, pages SPE–119098. SPE, 2009

2009
[35]

Production optimization in closed-loop reservoir management.SPE journal, 14(03):506–523, 2009

Chunhong Wang, Gaoming Li, and Albert C Reynolds. Production optimization in closed-loop reservoir management.SPE journal, 14(03):506–523, 2009

2009
[36]

Comprehensive framework for gradient-based optimization in closed-loop reservoir management.Computational Geosciences, 19:877–897, 2015

Vladislav Bukshtynov, Oleg Volkov, Louis J Durlofsky, and Khalid Aziz. Comprehensive framework for gradient-based optimization in closed-loop reservoir management.Computational Geosciences, 19:877–897, 2015

2015
[37]

A derivative-free approach for the estimation of porosity and permeability using time-lapse seismic and production data.Journal of Geophysics and Engineering, 7(4):351–368, 2010

Mohsen Dadashpour, David Echeverria Ciaurri, Tapan Mukerji, Jon Kleppe, and Martin Landrø. A derivative-free approach for the estimation of porosity and permeability using time-lapse seismic and production data.Journal of Geophysics and Engineering, 7(4):351–368, 2010

2010
[38]

Multilevel strategies and geological parameterizations for history matching complex reservoir models.SPE Journal, 25(01):081–104, 2020

Yimin Liu and Louis J Durlofsky. Multilevel strategies and geological parameterizations for history matching complex reservoir models.SPE Journal, 25(01):081–104, 2020

2020
[39]

Data assimilation for transient flow in geologic formations via ensemble kalman filter.Advances in Water Resources, 29(8):1107–1122, 2006

Yan Chen and Dongxiao Zhang. Data assimilation for transient flow in geologic formations via ensemble kalman filter.Advances in Water Resources, 29(8):1107–1122, 2006

2006
[40]

Ensemble smoother with multiple data assimilation

Alexandre A Emerick and Albert C Reynolds. Ensemble smoother with multiple data assimilation. Computers & Geosciences, 55:3–15, 2013

2013
[41]

Training effective deep reinforcement learning agents for real-time life-cycle production optimization.Journal of Petroleum Science and Engineering, 208:109766, 2022

Kai Zhang, Zhongzheng Wang, Guodong Chen, Liming Zhang, Yongfei Yang, Chuanjin Yao, Jian Wang, and Jun Yao. Training effective deep reinforcement learning agents for real-time life-cycle production optimization.Journal of Petroleum Science and Engineering, 208:109766, 2022

2022
[42]

Deep reinforcement learning for optimal well control in subsurface systems with uncertain geology.Journal of Computational Physics, 477:111945, 2023

Yusuf Nasir and Louis J Durlofsky. Deep reinforcement learning for optimal well control in subsurface systems with uncertain geology.Journal of Computational Physics, 477:111945, 2023

2023
[43]

Stabilizing transformers for reinforcement learning

Emilio Parisotto, Francis Song, Jack Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, et al. Stabilizing transformers for reinforcement learning. InInternational conference on machine learning, pages 7487–7498. PMLR, 2020

2020
[44]

Well placement optimization with the covari- ance matrix adaptation evolution strategy and meta-models.Computational Geosciences, 16:75–92, 2012

Zyed Bouzarkouna, Didier Yu Ding, and Anne Auger. Well placement optimization with the covari- ance matrix adaptation evolution strategy and meta-models.Computational Geosciences, 16:75–92, 2012

2012
[45]

A derivative-free methodology with local and global search for the constrained joint optimization of well locations and controls

Obiajulu J Isebor, Louis J Durlofsky, and David Echeverr´ ıa Ciaurri. A derivative-free methodology with local and global search for the constrained joint optimization of well locations and controls. Computational Geosciences, 18:463–482, 2014

2014
[46]

Yusuf Nasir, Wei Yu, and Kamy Sepehrnoori. Hybrid derivative-free technique and effective machine learning surrogate for nonlinear constrained well placement and production optimization.Journal of Petroleum Science and Engineering, 186:106726, 2020

2020
[47]

Application of a particle swarm optimization algorithm for determining optimum well location and type.Computational Geosciences, 14:183–198, 2010

J´ erˆ ome E Onwunalu and Louis J Durlofsky. Application of a particle swarm optimization algorithm for determining optimum well location and type.Computational Geosciences, 14:183–198, 2010. 28

2010
[48]

Optimal rate control under geologic uncertainty

Ahmed H Alhuthali, Akhil Datta-Gupta, Bevan Yuen, and Jerry P Fontanilla. Optimal rate control under geologic uncertainty. InSPE Improved Oil Recovery Conference?, pages SPE–113628. SPE, 2008

2008
[49]

Zhe Liu and Albert C Reynolds. A sequential-quadratic-programming-filter algorithm with a modified stochastic gradient for robust life-cycle optimization problems with nonlinear state constraints.SPE Journal, 25(04):1938–1963, 2020

1938
[50]

Optimization of production operations in petroleum fields

Pengju Wang, Michael Litvak, and Khalid Aziz. Optimization of production operations in petroleum fields. InSPE Annual Technical Conference and Exhibition?, pages SPE–77658. SPE, 2002

2002
[51]

Ensemble-based multiobjective optimization of on/off control devices under geological uncertainty

R-M-M Fonseca, Olwijn Leeuwenburgh, Ernesto Della Rossa, PM Van den Hof, and J-D-D Jansen. Ensemble-based multiobjective optimization of on/off control devices under geological uncertainty. SPE Reservoir Evaluation & Engineering, 18(04):554–563, 2015

2015
[52]

Improving the ensemble-optimization method through covariance-matrix adaptation.Spe Journal, 20(01):155–168, 2015

RM M Fonseca, Olwijn Leeuwenburgh, PMJ MJ Van den Hof, and JD D Jansen. Improving the ensemble-optimization method through covariance-matrix adaptation.Spe Journal, 20(01):155–168, 2015

2015
[53]

Im- proved sampling strategies for ensemble-based optimization.Computational Geosciences, 24:1057– 1069, 2020

KR Ramaswamy, RM Fonseca, Olwijn Leeuwenburgh, MM Siraj, and PMJ Van den Hof. Im- proved sampling strategies for ensemble-based optimization.Computational Geosciences, 24:1057– 1069, 2020

2020
[54]

Joint optimization of oil well placement and controls. Computational Geosciences, 16:1061–1079, 2012

Mathias C Bellout, David Echeverría Ciaurri, Louis J Durlofsky, Bjarne Foss, and Jon Kleppe. Joint optimization of oil well placement and controls. Computational Geosciences, 16:1061–1079, 2012

2012
[55]

Lianlin Li, Behnam Jafarpour, and M Reza Mohammad-Khaninezhad. A simultaneous perturbation stochastic approximation algorithm for coupled well placement and control optimization under geologic uncertainty. Computational Geosciences, 17:167–188, 2013

2013
[56]

Joint optimization of number of wells, well locations and controls using a gradient-based algorithm. Chemical Engineering Research and Design, 92(7):1315–1328, 2014

Fahim Forouzanfar and Albert C Reynolds. Joint optimization of number of wells, well locations and controls using a gradient-based algorithm. Chemical Engineering Research and Design, 92(7):1315–1328, 2014

2014
[57]

A general method to select representative models for decision making and optimization under uncertainty. Computers & Geosciences, 96:109–123, 2016

Mehrdad G Shirangi and Louis J Durlofsky. A general method to select representative models for decision making and optimization under uncertainty. Computers & Geosciences, 96:109–123, 2016

2016
[58]

Closed-loop field development under uncertainty by use of optimization with sample validation. SPE Journal, 20(05):908–922, 2015

Mehrdad G Shirangi and Louis J Durlofsky. Closed-loop field development under uncertainty by use of optimization with sample validation. SPE Journal, 20(05):908–922, 2015

2015
[59]

Optimisation of decision making under uncertainty throughout field lifetime: A fractured reservoir example

Dan Arnold, Vasily Demyanov, Mike Christie, Alexander Bakay, and Konstantin Gopa. Optimisation of decision making under uncertainty throughout field lifetime: A fractured reservoir example. Computers & Geosciences, 95:123–139, 2016

2016
[60]

Reservoir development optimization under uncertainty for infill well placement in brownfield redevelopment. Journal of Petroleum Science and Engineering, 175:444–464, 2019

Junko Hutahaean, Vasily Demyanov, and Mike Christie. Reservoir development optimization under uncertainty for infill well placement in brownfield redevelopment. Journal of Petroleum Science and Engineering, 175:444–464, 2019

2019
[61]

Geophysical inversion with a neighbourhood algorithm—i

Malcolm Sambridge. Geophysical inversion with a neighbourhood algorithm—i. searching a parameter space. Geophysical Journal International, 138(2):479–494, 1999

1999
[62]

Geophysical inversion with a neighbourhood algorithm—ii

Malcolm Sambridge. Geophysical inversion with a neighbourhood algorithm—ii. appraising the ensemble. Geophysical Journal International, 138(3):727–746, 1999

1999
[63]

On-line Q-learning using connectionist systems

Gavin A Rummery and Mahesan Niranjan. On-line Q-learning using connectionist systems, volume 37. University of Cambridge, Department of Engineering Cambridge, UK, 1994

1994
[64]

Q-learning. Machine Learning, 8:279–292, 1992

Christopher JCH Watkins and Peter Dayan. Q-learning. Machine Learning, 8:279–292, 1992

1992
[65]

A reinforcement learning approach for waterflooding optimization in petroleum reservoirs. Engineering Applications of Artificial Intelligence, 77:98–116, 2019

Farzad Hourfar, Hamed Jalaly Bidgoly, Behzad Moshiri, Karim Salahshoor, and Ali Elkamel. A reinforcement learning approach for waterflooding optimization in petroleum reservoirs. Engineering Applications of Artificial Intelligence, 77:98–116, 2019

2019
[66]

Waterflooding optimization under geological uncertainties by using deep reinforcement learning algorithms

Hongze Ma, Gaoming Yu, Yuehui She, and Yongan Gu. Waterflooding optimization under geological uncertainties by using deep reinforcement learning algorithms. In SPE Annual Technical Conference and Exhibition, page D031S043R001. SPE, 2019

2019
[67]

Deep reinforcement learning: reservoir optimization from pixels

Ruslan Miftakhov, Abdulaziz Al-Qasim, and Igor Efremov. Deep reinforcement learning: reservoir optimization from pixels. In International Petroleum Technology Conference, page D021S052R002. IPTC, 2020

2020
[68]

Proximal Policy Optimization Algorithms

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017

2017
[69]

Stochastic optimal well control in subsurface reservoirs using reinforcement learning.Engineering Applications of Artificial Intelligence, 114:105106, 2022

Atish Dixit and Ahmed H ElSheikh. Stochastic optimal well control in subsurface reservoirs using reinforcement learning.Engineering Applications of Artificial Intelligence, 114:105106, 2022

2022
[70]

Asynchronous methods for deep reinforcement learning. arXiv preprint arXiv:1602.01783, 2016

Volodymyr Mnih. Asynchronous methods for deep reinforcement learning. arXiv preprint arXiv:1602.01783, 2016

2016
[71]

Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor

Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International Conference on Machine Learning, pages 1861–1870. PMLR, 2018

2018
[72]

Deep reinforcement learning and adaptive policy transfer for generalizable well control optimization. Journal of Petroleum Science and Engineering, 217:110868, 2022

Zhongzheng Wang, Kai Zhang, Jinding Zhang, Guodong Chen, Xiaopeng Ma, Guojing Xin, Jinzheng Kang, Hanjun Zhao, and Yongfei Yang. Deep reinforcement learning and adaptive policy transfer for generalizable well control optimization. Journal of Petroleum Science and Engineering, 217:110868, 2022

2022
[73]

Evolutionary-assisted reinforcement learning for reservoir real-time production optimization under uncertainty. Petroleum Science, 20(1):261–276, 2023

Zhong-Zheng Wang, Kai Zhang, Guo-Dong Chen, Jin-Ding Zhang, Wen-Dong Wang, Hao-Chen Wang, Li-Ming Zhang, Xia Yan, and Jun Yao. Evolutionary-assisted reinforcement learning for reservoir real-time production optimization under uncertainty. Petroleum Science, 20(1):261–276, 2023

2023
[74]

Practical closed-loop reservoir management using deep reinforcement learning. SPE Journal, 28(03):1135–1148, 2023

Yusuf Nasir and Louis J Durlofsky. Practical closed-loop reservoir management using deep reinforcement learning. SPE Journal, 28(03):1135–1148, 2023

2023
[75]

Multi-asset closed-loop reservoir management using deep reinforcement learning. Computational Geosciences, 28(1):23–42, 2024

Yusuf Nasir and Louis J Durlofsky. Multi-asset closed-loop reservoir management using deep reinforcement learning. Computational Geosciences, 28(1):23–42, 2024

2024
[76]

Deep reinforcement learning for generalizable field development optimization. SPE Journal, 27(01):226–245, 2022

Jincong He, Meng Tang, Chaoshun Hu, Shusei Tanaka, Kainan Wang, Xian-Huan Wen, and Yusuf Nasir. Deep reinforcement learning for generalizable field development optimization. SPE Journal, 27(01):226–245, 2022

2022
[77]

Deep reinforcement learning for constrained field development optimization in subsurface two-phase flow

Yusuf Nasir, Jincong He, Chaoshun Hu, Shusei Tanaka, Kainan Wang, and XianHuan Wen. Deep reinforcement learning for constrained field development optimization in subsurface two-phase flow. Frontiers in Applied Mathematics and Statistics, 7:689934, 2021

2021