pith. machine review for the scientific record.

arxiv: 2604.22672 · v1 · submitted 2026-04-24 · 💻 cs.LG

Recognition: unknown

Iterative Model-Learning Scheme via Gaussian Processes for Nonlinear Model Predictive Control of (Semi-)Batch Processes

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 12:11 UTC · model grok-4.3

classification 💻 cs.LG
keywords Gaussian processes, model learning, nonlinear model predictive control, batch processes, iterative learning, chance constraints, semi-batch reactor, data-efficient control

The pith

An iterative scheme learns a Gaussian process model for nonlinear model predictive control of batch processes from one initial trajectory and matches full-model performance within a few runs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a Gaussian process surrogate, started with data from a single initial batch and then updated with observations from each subsequent controlled run, can support nonlinear model predictive control for transient nonlinear systems without a complete mechanistic model. Chance constraints built from the GP's uncertainty estimates keep operation safe at chosen confidence levels while the controller is applied repeatedly to the plant. In a simulated semi-batch polymerization reactor, tracking error fell 83 percent after four iterations and final product mass rose 17-fold by iteration eight, both outcomes reaching levels comparable to an NMPC that uses the true model. A reader would care because many batch processes lack reliable dynamic models yet stand to gain from advanced control if learning can be made both safe and sample-efficient by concentrating new data near improving trajectories.

Core claim

The GP-MLMPC scheme initializes a Gaussian process with data from one initial trajectory, embeds the GP into a nonlinear model predictive controller, applies the controller to new batches, and augments the GP with the resulting observations. Predictive uncertainty from the GP supplies chance constraints that enforce safety. On a semi-batch polymerization reactor over two-hour runs with temperature constrained within plus or minus two degrees Celsius of set point, the scheme reduces tracking error by 83 percent after four iterations and increases final product mass by a factor of 17 after eight iterations, attaining performance on par with full-model NMPC.
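The chance-constraint mechanism in this claim can be made concrete with a minimal sketch. Assuming (as standard for GP models) a Gaussian predictive distribution, a probabilistic limit such as "temperature below the hard bound with 95 percent confidence" reduces to a deterministic back-off on the GP mean. The function name and the numbers below are illustrative, not taken from the paper:

```python
from statistics import NormalDist

def backoff_bound(upper_limit, gp_std, confidence=0.95):
    """Tighten a hard upper limit into a deterministic back-off bound.

    For a Gaussian GP prediction, the chance constraint
        P(T <= upper_limit) >= confidence
    is equivalent to requiring
        mean + k * std <= upper_limit,  with k = Phi^{-1}(confidence).
    """
    k = NormalDist().inv_cdf(confidence)  # ~1.645 for 95% confidence
    return upper_limit - k * gp_std

# Illustrative numbers: setpoint 90 C, hard limit setpoint + 2 C,
# GP predictive standard deviation 0.5 C at this step.
print(round(backoff_bound(92.0, 0.5, confidence=0.95), 3))  # → 91.178
```

The NMPC would then constrain the GP posterior mean below the tightened bound; as the GP accumulates data and its predictive standard deviation shrinks, the back-off relaxes toward the hard limit.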

What carries the argument

The iterative GP-MLMPC loop, in which a Gaussian process surrogate updated batch-wise supplies both the prediction model and the uncertainty measure for chance-constrained nonlinear model predictive control.
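The loop just described can be sketched in schematic Python. The helper names (`run_batch`, `fit_gp`, `solve_nmpc`) are placeholders for the paper's components, not the authors' implementation:

```python
def gp_mlmpc(run_batch, initial_controller, n_iterations, fit_gp, solve_nmpc):
    """Iterative GP model-learning MPC loop (schematic).

    run_batch(policy) -> list of (state, input, next_state) observations
    fit_gp(data)      -> GP surrogate with predictive mean and variance
    solve_nmpc(gp)    -> controller embedding the GP model plus
                         variance-based chance constraints
    """
    # 1. A single initial trajectory (e.g. under PI control) seeds the GP.
    data = list(run_batch(initial_controller))
    for _ in range(n_iterations):
        gp = fit_gp(data)                # 2. refit surrogate on all data so far
        controller = solve_nmpc(gp)      # 3. embed GP in chance-constrained NMPC
        new_obs = run_batch(controller)  # 4. apply to the plant for one batch
        data.extend(new_obs)             # 5. augment the dataset; repeat
    return controller
```

Because step 4 gathers data along the trajectory the improving controller actually visits, each refit concentrates model accuracy where it matters, which is the mechanism behind the scheme's sample efficiency.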

If this is right

  • Performance of the learned controller converges to that of NMPC with the true mechanistic model.
  • New data remain sample-efficient because they are gathered around the successively better trajectory.
  • Chance constraints derived from GP uncertainty maintain safe operation without the full model.
  • The same scheme works for both setpoint-tracking and economic objectives in nonlinear batch systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach suggests that policy-improving data collection can substitute for prior mechanistic knowledge in other transient control problems.
  • Convergence speed may degrade if the initial trajectory is farther from optimum or if measurement noise is substantially higher.
  • The method could be tested on real pilot-scale reactors to check whether the simulated safety guarantees translate to hardware.

Load-bearing premise

That measurements collected near the improving trajectory will let the Gaussian process represent the plant's nonlinear dynamics well enough for the closed-loop controller to remain stable and respect safety limits.

What would settle it

The claim would be undermined if, after several iterations, the closed-loop tracking error remained far above the full-model NMPC benchmark (or the product yield far below it), or if temperature violations occurred even though the chance constraints were satisfied at the stated confidence level.

Figures

Figures reproduced from arXiv: 2604.22672 by Alexander Mitsos, Eike Cramer, Tai Xuan Tan.

Figure 1. Batch iteration learning scheme via GP-MLMPC. The inner loop represents feedback control within a batch duration.

Figure 2. Schematic of semi-batch polymerization reactor with external heat exchanger and cooling jacket, adapted from [5].

Figure 3. Reactor temperature, adiabatic temperature, mass of polymer, and jacket inlet temperature for tracking and economic NMPC using the perfect model without noise. Constraints are shown in red.

Figure 4. Reactor temperature, adiabatic temperature, mass of polymer, and jacket inlet temperature from tracking GP-NMPC without chance constraints (a) and with chance constraints (b). Batch iterations 1 and 10 are shown, with PI control as the benchmark. Each light dashed line represents a simulation run, and the solid line represents the median value. Constraints are shown in red.

Figure 5. Root mean square deviation from setpoint of tracking MPC decreases over batch iterations to achieve on-par performance with full-model NMPC. PI and full-model NMPC are shown as benchmarks. The shaded region represents the 5th–95th percentile, and the solid line represents the median values.

Figure 6. Reactor temperature, adiabatic temperature, mass of polymer, and jacket inlet temperature from economic GP-NMPC without chance constraints (a) and with chance constraints (b). Batch iterations 1 and 10 are shown. Each light dashed line represents a simulation run, and the solid line represents the median values. Constraints are shown in red.

Figure 7. (a) Final polymer mass at end of batch duration of economic GP-MLMPC. (b) Mean constraint violation over the full batch duration, comparing the effect of chance constraints (CC). PI and full-model NMPC are shown as benchmarks. The shaded region represents the 5th–95th percentile, and the solid line represents the mean values.
read the original abstract

Batch processes are inherently transient and typically nonlinear, motivating nonlinear model predictive control (NMPC). However, adopting NMPC is hindered by the cost and unavailability of dynamic models. Thus, we propose to use Gaussian Processes (GP) in a model-learning NMPC scheme (GP-MLMPC) for batch processes. We initialize the GP-MLMPC using data from a single initial trajectory, e.g., from a PI controller. We iteratively apply the NMPC embedded with GPs to run batches and update the GP with new observations from each iteration, thereby achieving batch-wise improvements. Using uncertainty quantification from the GPs, we formulate chance constraints to enforce safe operation to the required confidence levels. We demonstrate our approach in silico on a semi-batch polymerization reactor for tracking and economic objectives over durations of two hours, and the reactor temperature is constrained in a range of ±2 °C around its setpoint. After only four batch iterations, tracking error from the GP-MLMPC scheme converged to a reduction of 83%, compared to the initial trajectory. Furthermore, under an economic objective, the GP-MLMPC resulted in a 17-fold increase in final product mass by iteration 8, compared to the initial trajectory. In both cases, the resulting GP-MLMPC performance is on par with the full-model NMPC, which shows that the optimal controller can be learned by the approach. By collecting samples around the optimal trajectory, the GP-MLMPC remains sample-efficient across iterations and achieves quick convergence. Thus, the proposed GP-MLMPC scheme presents a promising data-efficient approach for the control of nonlinear batch processes without mechanistic knowledge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes an iterative Gaussian Process (GP) model-learning scheme embedded in nonlinear model predictive control (GP-MLMPC) for (semi-)batch processes. The GP is initialized from data of a single initial trajectory (e.g., PI controller), then iteratively updated with new observations collected by running the chance-constrained NMPC on successive batches. Chance constraints are formulated using GP predictive uncertainty to enforce safety. In silico demonstrations on a semi-batch polymerization reactor (temperature constrained to ±2°C around setpoint) show that after four iterations the tracking error is reduced by 83% relative to the initial trajectory, and under an economic objective the final product mass increases 17-fold by iteration 8, reaching performance on par with full-model NMPC.

Significance. If the empirical results and safety properties hold, the approach provides a practical, data-efficient route to high-performance NMPC for nonlinear batch processes without requiring a full mechanistic model. The iterative, sample-efficient learning around improving trajectories and the use of GP-derived chance constraints for probabilistic safety are potentially valuable for industrial applications such as polymerization where model development is costly.

major comments (2)
  1. [GP-MLMPC scheme (methods)] The chance-constraint formulation (described in the GP-MLMPC scheme section) relies on GP posterior mean and variance to enforce safety, but provides no analysis or demonstration that the NMPC remains feasible and constraint-satisfying in the first 1–2 iterations when the GP is trained solely on the initial PI trajectory. In regions away from this data the predictive variance is large; standard mean + k·std back-offs can render the problem infeasible or leave insufficient protection if the true dynamics deviate, which would block the entire iterative loop. This is load-bearing for the central claim of reliable batch-wise improvement.
  2. [Simulation results (Section 5)] Simulation results (Section 5) report the 83% tracking-error reduction by iteration 4 and 17-fold product-mass increase by iteration 8, yet supply no details on GP kernel choice, hyperparameter optimization, exact update rule for the GP after each batch, or how the chance constraints are encoded inside the NMPC optimization (e.g., via back-off or scenario approach). Without these, the quantitative claims cannot be reproduced or assessed for robustness.
minor comments (2)
  1. [Abstract] The abstract states the reactor temperature is constrained “in a range of ±2°C around its setpoint” but does not specify the exact time horizon, sampling interval, or prediction horizon used in the NMPC.
  2. [Methods] Notation for the GP posterior mean and variance is introduced without an explicit equation reference; adding a numbered equation for the chance-constraint expression would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and positive assessment of our work's potential significance. We address each major comment point by point below, with planned revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: [GP-MLMPC scheme (methods)] The chance-constraint formulation (described in the GP-MLMPC scheme section) relies on GP posterior mean and variance to enforce safety, but provides no analysis or demonstration that the NMPC remains feasible and constraint-satisfying in the first 1–2 iterations when the GP is trained solely on the initial PI trajectory. In regions away from this data the predictive variance is large; standard mean + k·std back-offs can render the problem infeasible or leave insufficient protection if the true dynamics deviate, which would block the entire iterative loop. This is load-bearing for the central claim of reliable batch-wise improvement.

    Authors: We agree that demonstrating feasibility of the chance-constrained NMPC in the initial iterations is essential, as this underpins the iterative improvement claim. Our simulations confirm that the NMPC remained feasible and satisfied the temperature constraints from the first iteration onward, enabling the reported performance gains without violations. To address this, we will revise the GP-MLMPC scheme section to include an explicit discussion of feasibility: specifically, how the GP variance-based back-offs (using a conservative multiplier on the predictive standard deviation) ensure the optimization problem remains feasible even with large initial uncertainty, by providing sufficient conservatism around the mean prediction. We will also add supporting simulation evidence, such as constraint violation metrics or feasibility status across the first two iterations. revision: yes

  2. Referee: [Simulation results (Section 5)] Simulation results (Section 5) report the 83% tracking-error reduction by iteration 4 and 17-fold product-mass increase by iteration 8, yet supply no details on GP kernel choice, hyperparameter optimization, exact update rule for the GP after each batch, or how the chance constraints are encoded inside the NMPC optimization (e.g., via back-off or scenario approach). Without these, the quantitative claims cannot be reproduced or assessed for robustness.

    Authors: We fully agree that these implementation details are necessary for reproducibility and robustness assessment. In the revised manuscript, we will expand both the methods description of the GP-MLMPC scheme and Section 5 to specify: the GP kernel (squared exponential with automatic relevance determination), the hyperparameter optimization method (maximum likelihood estimation via gradient descent), the exact GP update rule (incremental incorporation of new input-output pairs from each completed batch while retaining all prior data), and the chance-constraint encoding (deterministic back-off formulation using the GP posterior mean plus a multiple of the standard deviation, with the multiplier chosen for the target confidence level). These additions will directly support the reported quantitative results. revision: yes
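The ingredients the rebuttal promises to document (a squared-exponential kernel, GP posterior mean and variance feeding the back-offs) can be illustrated with a minimal 1-D reconstruction. This is an invented two-point example with made-up hyperparameters, not the authors' code; it exists only to show why predictive variance shrinks near visited trajectories and grows away from them:

```python
import math

def sq_exp(x, y, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel (the family named in the rebuttal), 1-D case."""
    return variance * math.exp(-0.5 * ((x - y) / lengthscale) ** 2)

def gp_posterior(X, Y, x_star, noise=1e-4):
    """Exact GP posterior mean/variance for two training points (explicit 2x2 inverse)."""
    k11 = sq_exp(X[0], X[0]) + noise
    k22 = sq_exp(X[1], X[1]) + noise
    k12 = sq_exp(X[0], X[1])
    det = k11 * k22 - k12 * k12
    inv = [[k22 / det, -k12 / det], [-k12 / det, k11 / det]]  # K^{-1}
    ks = [sq_exp(x_star, X[0]), sq_exp(x_star, X[1])]
    alpha = [inv[0][0] * Y[0] + inv[0][1] * Y[1],
             inv[1][0] * Y[0] + inv[1][1] * Y[1]]             # K^{-1} y
    mean = ks[0] * alpha[0] + ks[1] * alpha[1]
    kinv_ks = [inv[0][0] * ks[0] + inv[0][1] * ks[1],
               inv[1][0] * ks[0] + inv[1][1] * ks[1]]
    var = sq_exp(x_star, x_star) - (ks[0] * kinv_ks[0] + ks[1] * kinv_ks[1])
    return mean, var

# Variance is small between the training points and large far from them,
# which is what drives the back-off tightening away from visited trajectories.
m_near, v_near = gp_posterior([0.0, 1.0], [0.0, 1.0], 0.5)
m_far, v_far = gp_posterior([0.0, 1.0], [0.0, 1.0], 4.0)
```

In the full scheme the same posterior variance, evaluated along the predicted trajectory, sets the size of the chance-constraint back-off at each step.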

Circularity Check

0 steps flagged

No significant circularity in the iterative GP-MLMPC scheme

full rationale

The paper proposes an iterative GP-based model-learning NMPC method initialized from a single PI-controller trajectory, with subsequent batches run under chance-constrained NMPC and the GP updated from new closed-loop data. All performance claims (83% tracking-error reduction by iteration 4, 17-fold product-mass increase by iteration 8, parity with full-model NMPC) are obtained from independent in-silico simulation experiments on the polymerization reactor; they are measured outcomes of executing the closed-loop system rather than quantities that reduce by construction to the fitted GP parameters or to any self-citation. No self-definitional equations, fitted-inputs relabeled as predictions, or load-bearing self-citation chains appear in the method description or validation. The derivation chain is therefore self-contained against external simulation benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions from GP regression and MPC theory plus the domain assumption that iterative data collection around improving trajectories will yield adequate models.

free parameters (1)
  • GP hyperparameters
    Fitted to initial trajectory data and updated with new observations each iteration.
axioms (2)
  • domain assumption Gaussian Processes can adequately model the nonlinear dynamics of the batch process from limited data
    Core to the model-learning NMPC scheme described in the abstract.
  • domain assumption Chance constraints derived from GP uncertainty quantification can enforce safe operation
    Used to maintain temperature within ±2°C around setpoint.

pith-pipeline@v0.9.0 · 5610 in / 1318 out tokens · 54945 ms · 2026-05-08T12:11:04.485345+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

34 extracted references · 24 canonical work pages

  1. [1] H. Yoo, H. E. Byun, D. Han, J. H. Lee, Reinforcement learning for batch process control: Review and perspectives, Annual Reviews in Control 52 (2021) 108–119. doi:10.1016/j.arcontrol.2021.10.006

  2. [2] J. H. Lee, K. S. Lee, Iterative learning control applied to batch processes: An overview, IFAC Proceedings Volumes 39 (2) (2006) 1037–1046. doi:10.3182/20060402-4-BR-2902.01037

  3. [3] J. B. Rawlings, D. Q. Mayne, M. Diehl, Model Predictive Control: Theory, Computation, and Design, Nob Hill Publishing, 2017.

  4. [4] D. Laurí, B. Lennox, J. Camacho, Model predictive control for batch processes: Ensuring validity of predictions, Journal of Process Control 24 (1) (2014) 239–249. doi:10.1016/j.jprocont.2013.11.005

  5. [5] S. Lucia, J. A. Andersson, H. Brandt, M. Diehl, S. Engell, Handling uncertainty in economic nonlinear model predictive control: A comparative case study, Journal of Process Control 24 (8) (2014) 1247–. doi:10.1016/j.jprocont.2014.05.008

  7. [7] F. Fiedler, B. Karg, L. Lüken, D. Brandner, M. Heinlein, F. Brabender, S. Lucia, do-mpc: Towards FAIR nonlinear and robust model predictive control, Control Engineering Practice 140 (2023) 105676. doi:10.1016/j.conengprac.2023.105676

  8. [8] A. Marquez-Ruiz, M. Loonen, M. B. Saltlk, L. Özkan, Model Learning Predictive Control for Batch Processes: A Reactive Batch Distillation Column Case Study, Industrial and Engineering Chemistry Research 58 (30) (2019) 13737–13749. doi:10.1021/acs.iecr.8b06474

  9. [9] C. E. Rasmussen, C. K. I. Williams, Gaussian Processes for Machine Learning, The MIT Press, 2006.

  10. [10] R. Murray-Smith, D. Sbarbaro, C. E. Rasmussen, A. Girard, Adaptive, cautious, predictive control with Gaussian process priors, IFAC Proceedings Volumes 36 (16) (2003) 1155–1160. doi:10.1016/S1474-6670(17)34915-7

  11. [11] J. M. Maciejowski, X. Yang, Fault tolerant control using Gaussian processes and model predictive control, Conference on Control and Fault-Tolerant Systems, SysTol (2013) 1–12. doi:10.1109/SYSTOL.2013.6693820

  12. [12] E. D. Klenske, M. N. Zeilinger, B. Schölkopf, P. Hennig, Gaussian Process-Based Predictive Control for Periodic Error Correction, IEEE Transactions on Control Systems Technology 24 (1) (2016) 110–121. doi:10.1109/TCST.2015.2420629

  13. [13] E. Bradford, L. Imsland, D. Zhang, E. A. del Rio Chanona, Stochastic data-driven model predictive control using Gaussian processes, Computers & Chemical Engineering 139 (2020) 106844. arXiv:1908.01786, doi:10.1016/j.compchemeng.2020.106844

  14. [14] L. Hewing, J. Kabzan, M. N. Zeilinger, Cautious Model Predictive Control Using Gaussian Process Regression, IEEE Transactions on Control Systems Technology 28 (6) (2020) 2736–2743. arXiv:1705.10702, doi:10.1109/TCST.2019.2949757

  15. [15] M. Maiworm, D. Limon, R. Findeisen, Online learning-based model predictive control with Gaussian process models and stability guarantees, International Journal of Robust and Nonlinear Control 31 (18) (2021) 8785–8812. arXiv:1911.03315, doi:10.1002/rnc.5361

  16. [16] C. A. Micchelli, Y. Xu, H. Zhang, Universal Kernels, Journal of Machine Learning Research 7 (2006) 2651–2667.

  17. [17] M. Titsias, Variational learning of inducing variables in sparse Gaussian processes, in: D. van Dyk, M. Welling (Eds.), Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, Vol. 5 of Proceedings of Machine Learning Research, PMLR, Clearwater Beach, Florida, USA, 2009, pp. 567–574.

  18. [18] A. McHutchon, C. E. Rasmussen, Gaussian Process Training with Input Noise, Advances in Neural Information Processing Systems 24 (2011).

  19. [19] A. Girard, C. E. Rasmussen, J. Quiñonero Candela, R. Murray-Smith, Gaussian Process Priors with Uncertain Inputs – Application to Multiple-Step Ahead Time Series Forecasting, Advances in Neural Information Processing Systems 15 (2002).

  20. [20] J. A. Paulson, E. A. Buehler, R. D. Braatz, A. Mesbah, Stochastic model predictive control with joint chance constraints, International Journal of Control 93 (1) (2020) 126–139. doi:10.1080/00207179.2017.1323351

  21. [21] L. Hewing, K. P. Wabersich, M. Menner, M. N. Zeilinger, Learning-Based Model Predictive Control: Toward Safe Learning in Control, Annual Review of Control, Robotics, and Autonomous Systems 3 (2020) 269–296. doi:10.1146/annurev-control-090419-075625

  22. [22] B. da Silva, P. Dufour, N. Sheibat-Othman, S. Othman, Model Predictive Control of Free Surfactant Concentration in Emulsion Polymerization, IFAC Proceedings Volumes 41 (2) (2008) 8375–8380. doi:10.3182/20080706-5-KR-1001.01416

  23. [23] P. Joy, K. Rossow, F. Jung, H. U. Moritz, W. Pauer, A. Mitsos, A. Mhamdi, Model-based control of continuous emulsion co-polymerization in a lab-scale tubular reactor, Journal of Process Control 75 (2019) 59–76. doi:10.1016/j.jprocont.2018.12.014

  24. [24] J. M. Faust, S. Hamzehlou, J. R. Leiza, J. M. Asua, A. Mhamdi, A. Mitsos, Closed-loop in-silico control of a two-stage emulsion polymerization to obtain desired particle morphologies, Chemical Engineering Journal 414 (2021) 128808. doi:10.1016/j.cej.2021.128808

  25. [25] V. Rostampour, P. Mohajerin Esfahani, T. Keviczky, Stochastic Nonlinear Model Predictive Control of an Uncertain Batch Polymerization Reactor, IFAC-PapersOnLine 48 (23) (2015) 540–545. doi:10.1016/j.ifacol.2015.11.334

  26. [26] AN-021 – Determining Temperature Accuracy, Arroyo Instruments (2026).

  27. [27] Types of Flow Sensors Used in Industrial Automation, Supmea Automation Co., Ltd (2026).

  28. [28] B. Nicholson, J. D. Siirola, J. P. Watson, V. M. Zavala, L. T. Biegler, pyomo.dae: a modeling and automatic discretization framework for optimization with differential and algebraic equations, Mathematical Programming Computation 10 (2) (2018) 187–223. doi:10.1007/s12532-017-0127-0

  29. [29] A. Wächter, L. T. Biegler, On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming, Mathematical Programming 106 (1) (2006) 25–57. doi:10.1007/s10107-004-0559-y

  30. [30] GPy: A Gaussian process framework in Python, http://github.com/SheffieldML/GPy (since 2012).

  31. [31] J. A. E. Andersson, J. Gillis, G. Horn, J. B. Rawlings, M. Diehl, CasADi – A software framework for nonlinear optimization and optimal control, Mathematical Programming Computation 11 (1) (2019) 1–36. doi:10.1007/s12532-018-0139-4

  32. [32] J. G. Ziegler, N. B. Nichols, Optimum Settings for Automatic Controllers, Journal of Fluids Engineering 64 (8) (1942) 759–765. doi:10.1115/1.4019264

  33. [33] S. Myren, E. Lawrence, A comparison of Gaussian processes and neural networks for computer model emulation and calibration, Statistical Analysis and Data Mining 14 (6) (2021) 606–623. doi:10.1002/sam.11507

  34. [34] M. A. Goldin, S. Virgili, M. Chalk, Scalable Gaussian process inference of neural responses to natural images, Proceedings of the National Academy of Sciences 120 (34) (2023) e2301150120. doi:10.1073/pnas.2301150120