pith. machine review for the scientific record.

arXiv: 2605.06756 · v1 · submitted 2026-05-07 · cs.LG · cs.SY · eess.SY

Physics-based Digital Twins for Integrated Thermal Energy Systems Using Active Learning

Pith reviewed 2026-05-11 01:54 UTC · model grok-4.3

classification: cs.LG · cs.SY · eess.SY
keywords: active learning, digital twins, thermal energy systems, SINDyC, surrogate modeling, glycol heat exchanger, Modelica, uncertainty quantification

The pith

Active learning cuts the simulation trajectories needed for accurate thermal energy digital twins to as few as one-fifth of those required by random sampling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops an active learning framework that pairs high-fidelity Modelica simulations of thermal systems with four lighter surrogate models: deterministic SINDyC, its probabilistic multivariate-Gaussian extension (MvG-SINDyC), a feedforward neural network (FNN), and a gated recurrent unit (GRU) network. Each surrogate uses its own query strategy to select the next informative trajectory from the simulation pool, so that not every operating condition has to be simulated. On the glycol heat exchanger subsystem, the method reaches accuracy comparable to random sampling on bypass mass flow rate and heat transfer rate while using as few as one-fifth as many simulation trajectories. The result supports real-time control by delivering fast, interpretable, and uncertainty-aware models without the full computational cost of exhaustive data collection. Among the four surrogates, the GRU gives the highest predictive fidelity, while SINDyC remains the most computationally efficient and interpretable.
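
The pool-based selection loop described above can be illustrated with a toy example. This is a minimal sketch, not the paper's implementation: the linear surrogate, the quadratic "simulator", and the names `fit_linear` and `active_learning` are assumptions made for illustration, and the error-based score here uses known pool outputs, whereas the summary does not say how the paper evaluates candidate errors.

```python
import numpy as np


def fit_linear(x, y):
    """Toy surrogate (assumption): least-squares line, returned as a callable."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return lambda q: slope * q + intercept


def active_learning(pool_x, pool_y, init_idx, n_queries):
    """Error-based pool sampling: repeatedly add the worst-predicted pool item."""
    labeled = list(init_idx)
    for _ in range(n_queries):
        model = fit_linear(pool_x[labeled], pool_y[labeled])
        err = np.abs(model(pool_x) - pool_y)  # prediction-space error per candidate
        err[labeled] = -np.inf                # never re-select a labeled item
        labeled.append(int(np.argmax(err)))
    return labeled


# Hypothetical pool: 21 operating points of a quadratic "simulator".
pool_x = np.linspace(0.0, 2.0, 21)
pool_y = pool_x ** 2
selected = active_learning(pool_x, pool_y, init_idx=[0, 1, 2], n_queries=3)
print(selected)
```

Starting from three points near zero, the first query lands at the far end of the pool, where the linear fit extrapolates worst. A random-sampling baseline would replace the `argmax` step with a uniform draw over the unlabeled indices.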

Core claim

The paper establishes that an active learning framework coupling system-level Modelica simulations with tailored surrogate models (SINDyC, MvG-SINDyC, FNN, and GRU) and model-specific query strategies achieves comparable predictive accuracy on the bypass mass flow rate and heat transfer rate outputs of the glycol heat exchanger while using as few as one-fifth the simulation trajectories required by random sampling.

What carries the argument

Active learning with model-specific query strategies (Mahalanobis-distance sampling in coefficient space for MvG-SINDyC and error-based sampling in prediction space for the remaining surrogates) that prioritize dynamically informative trajectories from the high-fidelity simulator.
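
A Mahalanobis-distance query in coefficient space could look roughly like the sketch below. It assumes that a Gaussian over the surrogate coefficients (mean and covariance) is already available and that each candidate trajectory yields its own coefficient vector; selecting the candidate with the largest distance is also an assumption, since this summary does not state the exact selection rule. The function names are hypothetical.

```python
import numpy as np


def mahalanobis_distances(candidates, mean, cov, ridge=0.0):
    """Distance of each candidate coefficient vector from N(mean, cov)."""
    candidates = np.atleast_2d(np.asarray(candidates, dtype=float))
    diff = candidates - np.asarray(mean, dtype=float)
    # Optional ridge term (assumption) keeps the covariance well conditioned.
    cov = np.asarray(cov, dtype=float) + ridge * np.eye(diff.shape[1])
    solved = np.linalg.solve(cov, diff.T).T  # cov^{-1} (c - mean), row-wise
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))


def select_query(candidates, mean, cov):
    """Return the index of the most distant candidate and all distances."""
    d = mahalanobis_distances(candidates, mean, cov)
    return int(np.argmax(d)), d


# Hypothetical two-coefficient example.
mean = [0.0, 0.0]
cov = [[1.0, 0.0], [0.0, 4.0]]
candidates = [[1.0, 0.0], [0.0, 4.0], [0.5, 0.5]]
best, dists = select_query(candidates, mean, cov)
print(best, dists)
```

Unlike a plain Euclidean distance, this measure discounts deviations along directions where the coefficient distribution is already wide, so a candidate only scores highly if it disagrees with the current model in a direction the model is confident about.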

If this is right

  • Real-time supervisory control of thermal distribution systems becomes feasible with surrogates that run quickly yet retain physics grounding.
  • The probabilistic MvG-SINDyC surrogate supplies uncertainty estimates and shows the largest computational gains under active learning.
  • SINDyC remains the fastest and most interpretable option when computational resources or model transparency are priorities.
  • The GRU reaches the highest predictive fidelity among the four surrogates tested.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same active-learning loop could be applied to other subsystems within integrated energy networks to lower the overall cost of building plant-wide digital twins.
  • Uncertainty outputs from the probabilistic variant could be fed directly into robust control algorithms that adjust setpoints when prediction spreads widen.
  • Testing the query strategies on different simulation platforms or additional energy components would show whether the one-fifth data reduction generalizes beyond the glycol heat exchanger.

Load-bearing premise

The chosen query strategies must pick trajectories that remain representative without introducing bias or losing accuracy on operating conditions the model has not yet seen.

What would settle it

Running the trained surrogates on a fresh set of operating conditions drawn from a different distribution and finding that the active-learning versions produce substantially larger errors than versions trained on the same number of randomly chosen trajectories.

Figures

Figures reproduced from arXiv: 2605.06756 by Linyu Lin, Majdi I. Radaideh, Paul Seurin, Umme Mahbuba Nabila.

Figure 1: Experimental TEDS facility, showing the heater, TES …
Figure 4: Thermocline Energy Storage Tank […]
Figure 5: Experimental and Modelica actuators and states of GHX. The blue lines represent the experimental data, the black curves …
Figure 6: DT framework for autonomous control of TEDS. The ideal route (purple) corresponds to direct training on experimental …
Figure 7: AL workflow for physics-informed (SINDyC/MvG-SINDyC) and data-driven (FNN/GRU) surrogates.
Figure 8: SINDyC model coefficient comparison. Thin lines: candidate models trained on individual sets of four simulation trajectories; …
Figure 9: Comparison of the surrogate model predictions for three unseen simulation trajectories.
Figure 10: Comparison of surrogate model for experiment trajectory prediction.
Figure 11: Prediction error trends (RMSE) pertaining to the mass-flow and power output for different models.
Figure 12: Computational time comparison of each model.

Original abstract

Real-time supervisory control of thermal energy distribution systems requires digital twins that are accurate, interpretable, and uncertainty-aware, yet remain data and computationally efficient. High-fidelity simulations alone are costly, while purely data-driven surrogates often lack robustness. To address these challenges, this work proposes an active learning (AL) framework that couples system-level Modelica simulations with four simpler physics-informed and data-driven surrogate modeling approaches: deterministic Sparse Identification of Nonlinear Dynamics with Control (SINDyC), its probabilistic multivariate-Gaussian extension (MvG-SINDyC), a feedforward neural network (FNN), and a gated recurrent unit (GRU) network. Tailored to each surrogate, model-specific AL query strategies are employed, including Mahalanobis-distance sampling in coefficient space for MvG-SINDyC and error-based sampling in prediction space for SINDyC, FNN, and GRU, allowing the learning process to prioritize dynamically informative trajectories. The proposed approach is demonstrated on the glycol heat exchanger (GHX) subsystem of the Thermal Energy Distribution System (TEDS) at Idaho National Laboratory. Across key GHX outputs (the bypass mass flow rate $\dot{m}_{\mathrm{GHX}}$ and heat transfer rate $Q_{\mathrm{GHX}}$), the AL framework achieves comparable predictive accuracy using as few as one-fifth of the simulation trajectories required by random sampling. Among the evaluated surrogates, the GRU achieves the highest predictive fidelity, while SINDyC remains the most computationally efficient and interpretable. The probabilistic MvG-SINDyC surrogate further enables uncertainty quantification and exhibits the largest computational gains under AL.
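
For readers unfamiliar with SINDyC, the idea of sparse regression over a library of candidate terms in the state and the control input can be sketched as follows. This is a minimal illustration using sequentially thresholded least squares on a toy scalar system; the library terms, the threshold, the toy dynamics, and the names `library` and `stlsq` are assumptions, and the abstract does not specify the library or optimizer used in the paper.

```python
import numpy as np


def library(x, u):
    """Candidate terms (assumption): [1, x, u, x*u, x^2]."""
    return np.column_stack([np.ones_like(x), x, u, x * u, x ** 2])


def stlsq(theta, dx, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares for dx ~ theta @ xi."""
    xi = np.linalg.lstsq(theta, dx, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                      # prune weak terms
        active = ~small
        if not active.any():
            break
        # Refit only the surviving terms.
        xi[active] = np.linalg.lstsq(theta[:, active], dx, rcond=None)[0]
    return xi


# Toy data from dx/dt = -2 x + u with exact derivatives (no noise).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
u = rng.uniform(-1.0, 1.0, 200)
dx = -2.0 * x + 1.0 * u
xi = stlsq(library(x, u), dx)
print(xi)
```

The recovered coefficient vector is sparse and directly readable as a governing equation, which is the sense in which the abstract calls SINDyC interpretable.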

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes an active learning (AL) framework to develop data-efficient, physics-informed digital twins for thermal energy distribution systems. It integrates high-fidelity Modelica simulations with four surrogate models (SINDyC, MvG-SINDyC, FNN, and GRU), using tailored query strategies: Mahalanobis-distance sampling for MvG-SINDyC and error-based sampling for the others. Demonstrated on the glycol heat exchanger (GHX) of the TEDS at Idaho National Laboratory, the framework is reported to achieve comparable predictive accuracy for bypass mass flow rate and heat transfer rate using as few as one-fifth of the simulation trajectories needed by random sampling, with the GRU offering the highest fidelity and SINDyC the best efficiency and interpretability.

Significance. If the empirical results hold, this approach offers a promising path toward computationally efficient and interpretable digital twins suitable for real-time supervisory control in integrated thermal energy systems. The combination of physics-based structure with active learning for data selection addresses key limitations of both pure simulation and black-box data-driven methods. The inclusion of uncertainty quantification via MvG-SINDyC is a notable strength, as is the focus on specific system outputs relevant to control.

major comments (2)
  1. [Results] The central claim that the AL framework achieves comparable accuracy on m_GHX and Q_GHX with 1/5 the trajectories of random sampling lacks supporting details on validation splits, number of independent runs, error bars, or statistical significance tests. Without these, it is difficult to assess whether the reported data-efficiency gain is robust.
  2. [Active Learning Query Strategies] The model-specific AL strategies (Mahalanobis distance in coefficient space for MvG-SINDyC, error-based for others) may introduce selection bias by over-representing certain dynamics. The manuscript should provide quantitative evidence, such as state-space coverage metrics or out-of-distribution generalization errors on unseen operating conditions, to confirm that the selected trajectories do not degrade surrogate performance on the full TEDS envelope.
minor comments (2)
  1. [Abstract] The abstract mentions performance gains but does not specify the exact metrics or conditions under which the 1/5 reduction holds.
  2. [Notation] Ensure consistent use of symbols for mass flow rate across the text and equations.
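
The kind of reporting requested in major comment 1 (repeated runs, error bars, a significance test) can be sketched with the standard library alone. The RMSE values below are synthetic placeholders, not results from the paper, and the function names are hypothetical; the only external fact used is the two-sided 5% critical value of Student's t with 4 degrees of freedom, 2.776.

```python
import math
import statistics


def mean_and_std(values):
    """Mean and sample standard deviation (one-sigma error bar)."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t_statistic(a, b):
    """Paired t statistic for per-seed differences a[i] - b[i]."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Two-sided 5% critical value of Student's t with n - 1 = 4 degrees of freedom.
T_CRIT_DF4 = 2.776

# Synthetic placeholder RMSEs for five seeds (NOT values from the paper).
rmse_random = [1.0, 2.0, 3.0, 4.0, 5.0]
rmse_active = [0.5, 1.0, 2.0, 3.5, 4.0]

t_stat = paired_t_statistic(rmse_random, rmse_active)
print(mean_and_std(rmse_random), mean_and_std(rmse_active))
print(t_stat, abs(t_stat) > T_CRIT_DF4)
```

Pairing by seed matters here: both sampling strategies share the same initial data and test split within a seed, so the per-seed difference removes that shared variation before the test is applied.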

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback and positive assessment of the work's significance. We agree that additional statistical details and coverage analyses will strengthen the presentation of our results. We address each major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Results] The central claim that the AL framework achieves comparable accuracy on m_GHX and Q_GHX with 1/5 the trajectories of random sampling lacks supporting details on validation splits, number of independent runs, error bars, or statistical significance tests. Without these, it is difficult to assess whether the reported data-efficiency gain is robust.

    Authors: We acknowledge the value of these details for assessing robustness. In the revised manuscript we will explicitly state the validation protocol (80/20 trajectory split with no overlap between training and test sets), report all metrics as means over five independent runs with different random seeds for both AL and random sampling, include error bars as one standard deviation, and add a paired t-test (p < 0.05) confirming that the 1/5 data-efficiency advantage is statistically significant for both m_GHX and Q_GHX. These additions will be placed in Section 4.3 and the associated figures.
    revision: yes

  2. Referee: [Active Learning Query Strategies] The model-specific AL strategies (Mahalanobis distance in coefficient space for MvG-SINDyC, error-based for others) may introduce selection bias by over-representing certain dynamics. The manuscript should provide quantitative evidence, such as state-space coverage metrics or out-of-distribution generalization errors on unseen operating conditions, to confirm that the selected trajectories do not degrade surrogate performance on the full TEDS envelope.

    Authors: To demonstrate that the query strategies do not introduce harmful bias, the revised manuscript will include two new quantitative checks. First, we will report state-space coverage via the normalized convex-hull volume and the fraction of the full TEDS operating envelope covered by the AL-selected trajectories versus random sampling. Second, we will evaluate out-of-distribution generalization by holding out a separate test set of trajectories from extreme operating conditions (e.g., mass-flow rates and temperatures outside the training distribution) and report the resulting prediction errors for all four surrogates. These results will appear in a new subsection of Section 4.4 and will confirm that AL-selected data maintain or improve coverage and OOD performance relative to random sampling.
    revision: yes
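
The normalized convex-hull coverage check mentioned in the second response could be computed along the following lines. This is a sketch under strong assumptions: the operating conditions are projected onto two dimensions, the operating envelope is taken to be an axis-aligned rectangle, and all function names are hypothetical. Higher-dimensional envelopes would need a general convex-hull routine.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set((float(x), float(y)) for x, y in points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; zero for fewer than three vertices."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def normalized_hull_coverage(points, x_range, y_range):
    """Hull area of the sampled conditions divided by the envelope area."""
    envelope = (x_range[1] - x_range[0]) * (y_range[1] - y_range[0])
    return polygon_area(convex_hull(points)) / envelope


# Hypothetical sampled operating points inside a 2 x 1 rectangular envelope.
samples = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
print(normalized_hull_coverage(samples, x_range=(0, 2), y_range=(0, 1)))
```

Computing the same ratio for AL-selected and randomly selected trajectories gives a direct comparison, although hull area alone says nothing about how densely the interior is sampled.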

Circularity Check

0 steps flagged

No circularity: empirical AL vs. random sampling comparison on held-out trajectories

full rationale

The paper's central claims rest on running system-level Modelica simulations, training four surrogate classes (SINDyC, MvG-SINDyC, FNN, GRU) under both active-learning query strategies and random sampling, then measuring predictive accuracy on independent held-out GHX trajectories for m_GHX and Q_GHX. The reported 1/5 data-efficiency result is obtained by direct numerical comparison of test-set errors, not by any equation that reduces the output to a fitted parameter or self-citation by construction. No self-definitional loops, uniqueness theorems, or ansatz smuggling appear in the methodology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the framework implicitly assumes standard surrogate modeling assumptions and that active learning query strategies generalize without bias.

pith-pipeline@v0.9.0 · 5612 in / 1114 out tokens · 32427 ms · 2026-05-11T01:54:10.459491+00:00 · methodology
