arxiv: 2604.20707 · v1 · submitted 2026-04-22 · 💻 cs.LG · cs.SY· eess.SY

Recognition: unknown

Generative Flow Networks for Model Adaptation in Digital Twins of Natural Systems

Pascal Archambault , Houari Sahraoui , Eugene Syriani

Authors on Pith no claims yet

Pith reviewed 2026-05-10 01:15 UTC · model grok-4.3

classification 💻 cs.LG cs.SYeess.SY

keywords GFlowNetdigital twinsmodel adaptationsimulation-based inferenceparameter calibrationuncertainty quantificationmechanistic simulatorsgenerative modeling

0 comments

The pith

GFlowNets sample simulator parameters for digital twins proportional to their agreement with observations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how Generative Flow Networks can handle model adaptation when digital twins must track natural systems that change over time and are only partially observed. Multiple parameter sets often fit the same limited data, so the task becomes one of generating complete configurations rather than locating a single best fit. The approach trains a policy to produce parameter sets with probability matching a reward score computed from how closely the simulator outputs match real observations. In a tomato growth model example, the resulting samples cover the main regions where good fits exist and keep several viable options active instead of collapsing to one.

Core claim

We formulate adaptation as a generative modeling problem over complete simulator configurations, so that plausible parameterizations can be sampled with probability proportional to a reward derived from agreement between simulated and observed behavior. Using a controlled environment agriculture case study based on a mechanistic tomato model, we show that the learned policy recovers dominant regions of the adaptation landscape, retrieves strong calibration hypotheses, and preserves multiple plausible configurations under uncertainty.

What carries the argument

The GFlowNet, a generative model that learns a policy to produce complete simulator configurations with probability proportional to a reward measuring agreement between simulated and observed outputs.

If this is right

The sampling process identifies multiple strong calibration hypotheses consistent with sparse observations.
Dominant regions of the adaptation landscape are recovered without exhaustive search.
Plausible configurations remain available even when observations leave the calibration underdetermined.
The method works directly with mechanistic simulators whose internal parameters cannot be measured.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same generative sampling could support ensemble-style predictions in digital twins by running forward simulations from many sampled configurations at once.
It offers an alternative to traditional optimization methods in any simulation-based inference setting where the posterior has multiple modes.
Extending the reward to include forward prediction accuracy on held-out data could test whether the sampled set improves long-term tracking of the physical system.

Load-bearing premise

A reward function based solely on agreement between simulated and observed behavior is sufficient to guide the GFlowNet to cover all relevant modes without mode collapse or prohibitive sampling cost.

What would settle it

On synthetic data generated from several known distinct parameter sets that all produce identical observation matches, the GFlowNet samples must include all those sets at rates proportional to their reward scores.

read the original abstract

Digital twins of natural systems must remain aligned with physical systems that evolve over time, are only partially observed, and are typically modeled by mechanistic simulators whose parameters cannot be measured directly. In such settings, model adaptation is naturally posed as a simulation-based inference problem. However, sparse and indirect observations often fail to identify a unique and optimal calibration, leaving several simulator parameterizations compatible with the available evidence. This article presents a GFlowNet-based approach to model adaptation for digital twins of natural systems. We formulate adaptation as a generative modeling problem over complete simulator configurations, so that plausible parameterizations can be sampled with probability proportional to a reward derived from agreement between simulated and observed behavior. Using a controlled environment agriculture case study based on a mechanistic tomato model, we show that the learned policy recovers dominant regions of the adaptation landscape, retrieves strong calibration hypotheses, and preserves multiple plausible configurations under uncertainty.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper applies GFlowNets to sample multiple plausible parameter sets for digital-twin simulator adaptation, which is a clean new framing, but the abstract gives no experimental details to check whether it actually avoids mode collapse.

read the letter

The core contribution is treating model adaptation as sampling complete simulator configurations from a GFlowNet whose probability is proportional to a reward based on simulated-versus-observed agreement. They demonstrate the idea on a mechanistic tomato model in controlled-environment agriculture, claiming the policy finds dominant regions of the adaptation landscape while keeping several compatible parameterizations alive under uncertainty. That framing is new enough to stand out from standard single-point calibration or basic MCMC approaches in the digital-twin literature. The problem setup itself is handled cleanly: partial observability, evolving systems, and the need for multiple hypotheses are stated without overclaiming uniqueness of the solution. The reward is defined externally from simulator-observation match rather than being circular, which keeps the formulation honest. The main weakness is that none of the supporting evidence is visible. The abstract asserts recovery of dominant regions and preservation of multiple configurations, yet supplies no reward-function form, parameter dimensionality, number of trajectories, temperature schedule, or any diagnostic such as effective sample size, mode coverage metric, or comparison against SMC or MCMC. In a continuous, plausibly high-dimensional configuration space, GFlowNets are known to require careful flow-matching and exploration to avoid collapse, so the central claim rests on an untested assumption. Without those details or ablations it is impossible to tell whether the reported behavior is real or an artifact of the chosen landscape. This paper is aimed at researchers working on simulation-based inference or digital twins for natural systems who already know GFlowNets. A reader looking for a fresh application framing will find it useful, but anyone needing reproducible evidence or practical guidance will have to wait for the full experiments. I would send it to peer review because the idea is coherent and the target problem is real; referees can usefully press on the missing implementation checks and help turn the claim into something verifiable.

Referee Report

2 major / 2 minor

Summary. The paper proposes a GFlowNet-based generative modeling approach to model adaptation for digital twins of natural systems. Adaptation is cast as sampling complete simulator parameter configurations with probability proportional to a reward derived from agreement between simulated outputs and sparse observations. In a controlled-environment agriculture case study using a mechanistic tomato growth model, the authors report that the learned policy recovers dominant regions of the adaptation landscape, retrieves strong calibration hypotheses, and preserves multiple plausible configurations under uncertainty.

Significance. If the empirical claims hold, the work provides a practical method for sampling from multi-modal posteriors in simulation-based inference for digital twins, where unique calibration is often impossible due to partial observability. GFlowNets' ability to sample proportionally to unnormalized rewards without requiring normalized densities is a natural fit here, and the use of a real mechanistic simulator strengthens relevance to natural systems modeling. This could enable more robust ensemble predictions in applications like agriculture or ecology by maintaining diversity in plausible parameterizations.

major comments (2)

[§4] §4 (Case Study and Experiments): The central claim that the learned policy recovers dominant regions and preserves multiple plausible configurations requires evidence that the empirical distribution of samples approximates p(config) ∝ R(agreement(sim(config), obs)). However, the manuscript provides no details on the dimensionality of the parameter vector, the precise functional form of the reward R (e.g., likelihood, distance metric, or annealing schedule), the number of trajectories used in training, or any diagnostic such as effective sample size, mode-counting metric, or direct comparison to MCMC/SMC baselines. This leaves the key assumption that the GFlowNet training dynamics avoid mode collapse on this landscape unverified.
[§3.1] §3.1 (GFlowNet Formulation): While the proportionality guarantee holds in theory when the flow-matching objective is satisfied, the manuscript does not discuss or report any exploration bonuses, temperature scheduling, or flow-matching diagnostics applied to this continuous, plausibly high-dimensional configuration space. Without these, it is unclear whether the reported recovery of multiple configurations is robust or an artifact of the specific reward landscape in the tomato model.

minor comments (2)

[Abstract] The abstract and introduction could more explicitly reference the specific equations defining the reward and the GFlowNet policy to improve traceability for readers.
[§2] Notation for simulator configurations and the reward function is introduced without a dedicated table or equation block, which could be clarified for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We are pleased that the significance of the work for simulation-based inference in digital twins is recognized. We address the major comments below and will revise the manuscript accordingly to provide the requested details and clarifications.

read point-by-point responses

Referee: [§4] §4 (Case Study and Experiments): The central claim that the learned policy recovers dominant regions and preserves multiple plausible configurations requires evidence that the empirical distribution of samples approximates p(config) ∝ R(agreement(sim(config), obs)). However, the manuscript provides no details on the dimensionality of the parameter vector, the precise functional form of the reward R (e.g., likelihood, distance metric, or annealing schedule), the number of trajectories used in training, or any diagnostic such as effective sample size, mode-counting metric, or direct comparison to MCMC/SMC baselines. This leaves the key assumption that the GFlowNet training dynamics avoid mode collapse on this landscape unverified.

Authors: We agree with the referee that these implementation and validation details are necessary to fully support the central claims. The revised manuscript will include the dimensionality of the parameter vector, the precise functional form of the reward R including any annealing schedule, the number of trajectories used in training, and additional diagnostics such as effective sample size, mode-counting metrics, and comparisons to MCMC/SMC baselines. These additions will help verify that the GFlowNet training avoids mode collapse on the adaptation landscape. revision: yes
Referee: [§3.1] §3.1 (GFlowNet Formulation): While the proportionality guarantee holds in theory when the flow-matching objective is satisfied, the manuscript does not discuss or report any exploration bonuses, temperature scheduling, or flow-matching diagnostics applied to this continuous, plausibly high-dimensional configuration space. Without these, it is unclear whether the reported recovery of multiple configurations is robust or an artifact of the specific reward landscape in the tomato model.

Authors: We thank the referee for this observation. In the revised version, we will expand the discussion in §3.1 to cover the exploration bonuses and temperature scheduling employed during training, as well as report flow-matching diagnostics for the continuous parameter space. While the empirical results in the tomato case study support the robustness of the approach, these additions will provide further evidence that the recovery of multiple configurations is not an artifact of the specific landscape. revision: yes

Circularity Check

0 steps flagged

No circularity in GFlowNet-based adaptation formulation

full rationale

The paper poses model adaptation as sampling simulator configurations with probability proportional to an externally supplied reward R derived from simulator-observation agreement. This reward is treated as an independent input rather than being defined in terms of the learned policy or GFlowNet outputs. The central claims about recovering dominant regions and preserving multiple configurations are presented as outcomes of the case study experiments, not as identities or fitted predictions that reduce to the training objective by construction. No self-citation chains, uniqueness theorems imported from prior author work, or ansatz smuggling appear in the derivation. The formulation remains self-contained against the stated assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the standard mathematical property that GFlowNets can be trained to sample trajectories with probability proportional to a given reward; no free parameters or invented entities are introduced in the abstract.

axioms (1)

standard math GFlowNets can be trained to sample from a distribution proportional to a given reward function over complete configurations.
This is the core property invoked when the paper formulates adaptation as generative modeling with a reward derived from simulator-observation agreement.

pith-pipeline@v0.9.0 · 5456 in / 1247 out tokens · 31313 ms · 2026-05-10T01:15:10.714443+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

40 extracted references · 4 canonical work pages

[1]

Zhou, J., Zhang, S. & Gu, M. Revisiting digital twins: Origins, fundamentals, and practices.Frontiers of Engineering Management9, 668–676 (2022)

2022
[2]

& Suman, R

Javaid, M., Haleem, A. & Suman, R. Digital twin applications toward industry 4.0: A review.Cognitive Robotics3, 71–92 (2023)

2023
[3]

In: 2023 ACM/IEEE 26th International Conference on Model Driven Engineering Languages and Systems (MODELS)

David, I.et al.Digital twins for cyber-biophysical systems: Challenges and lessons learned. 2023 ACM/IEEE 26th International Conference on Model Driven Engineering Languages and Systems (MODELS) (2023). Pp. 1–12. IEEE. doi: 10.1109/MODELS58315.2023.00014

work page doi:10.1109/models58315.2023.00014 2023
[4]

Katsoulakis, E.et al.Digital twins for health: a scoping review.NPJ digital medicine7, 77 (2024)

2024
[5]

& Dinh, N

Lin, L., Bao, H. & Dinh, N. Uncertainty quantification and software risk analysis for digital twins in the nearly autonomous management and control systems: A review.Annals of Nuclear Energy160, 108362 (2021). 34

2021
[6]

& Russo, G

Gallo, L., Frasca, M., Latora, V. & Russo, G. Lack of practical identifiability may hamper reliable predictions in covid-19 epidemic models.Science advances 8, eabg5234 (2022)

2022
[7]

Bengio, Y.et al.Gflownet foundations.Journal of Machine Learning Research 24, 1–55 (2023)

2023
[8]

& Van Henten, E

Vanthoor, B., De Visser, P., Stanghellini, C. & Van Henten, E. J. A methodology for model-based greenhouse design: Part 2, description and validation of a tomato yield model.Biosystems engineering110, 378–395 (2011)

2011
[9]

G., Pretorius, J

Kapteyn, M. G., Pretorius, J. V. R. & Willcox, K. E. A probabilistic graph- ical model foundation for enabling predictive digital twins at scale.Nature Computational Science1, 337–347 (2021)

2021
[10]

Michael, J.et al.Model-driven engineering for digital twins: Opportunities and challenges.Systems Engineering28, 659–670 (2025)

2025
[11]

de Koning, K.et al.Digital twins: dynamic model-data fusion for ecology.Trends in ecology & evolution38, 916–926 (2023)

2023
[12]

& Cockrell, C

An, G. & Cockrell, C. Drug development digital twins for drug discovery, test- ing and repurposing: a schema for requirements and development.Frontiers in systems biology2, 928387 (2022)

2022
[13]

& Athanasiadis, I

Pylianidis, C., Osinga, S. & Athanasiadis, I. N. Introducing digital twins to agriculture.Computers and Electronics in Agriculture184, 105942 (2021)

2021
[14]

& Neubauer, T

Purcell, W. & Neubauer, T. Digital twins in agriculture: A state-of-the-art review. Smart Agricultural Technology3, 100094 (2023)

2023
[15]

Structural and multidisciplinary optimization66, 1 (2023)

Thelen, A.et al.A comprehensive review of digital twin—part 2: roles of uncer- tainty quantification and optimization, a battery digital twin, and perspectives. Structural and multidisciplinary optimization66, 1 (2023)

2023
[16]

& Bengio, Y

Bengio, E., Jain, M., Korablyov, M., Precup, D. & Bengio, Y. Flow network based generative models for non-iterative diverse candidate generation.Advances in neural information processing systems34, 27381–27394 (2021)

2021
[17]

Proceedings of the 40th Interna- tional Conference on Machine Learning (ICML) (2023)

Jain, M.et al.Multi-objective GFlowNets. Proceedings of the 40th Interna- tional Conference on Machine Learning (ICML) (2023). Proceedings of Machine Learning Research, Vol. 202, pp. 14631–14653. PMLR

2023
[18]

Uncertainty in Artificial Intelligence (2022)

Deleu, T.et al.Bayesian structure learning with generative flow networks. Uncertainty in Artificial Intelligence (2022). Pp. 518–528. PMLR

2022
[19]

J.et al.Amortizing intractable inference in large language models.arXiv preprint arXiv:2310.04363(2023)

Hu, E. J.et al.Amortizing intractable inference in large language models.arXiv preprint arXiv:2310.04363(2023). 35

work page arXiv 2023
[20]

Blair, G. S. Digital twins of the natural environment.Patterns2(2021)

2021
[21]

& Louppe, G

Cranmer, K., Brehmer, J. & Louppe, G. The frontier of simulation-based inference.Proceedings of the National Academy of Sciences117, 30055–30062 (2020)

2020
[22]

T., Uryasev, S.et al.Optimization of conditional value-at-risk

Rockafellar, R. T., Uryasev, S.et al.Optimization of conditional value-at-risk. Journal of risk2, 21–42 (2000)

2000
[23]

& Bengio, Y

Malkin, N., Jain, M., Bengio, E., Sun, C. & Bengio, Y. Trajectory balance: Improved credit assignment in gflownets.Advances in Neural Information Processing Systems35, 5955–5967 (2022)

2022
[24]

Eramo, R.et al.Conceptualizing digital twins.IEEE Software39, 39–46 (2022)

2022
[25]

& Chauhan, N

Subeesh, A. & Chauhan, N. Agricultural digital twin for smart farming: A review. Green Technologies and Sustainability100299 (2025)

2025
[26]

& Righini, I

Hemming, S., de Zwart, F., Elings, A., Petropoulou, A. & Righini, I. Autonomous greenhouse challenge, second edition (2019) (2020). URL https://doi.org/10. 4121/uuid:88d22c60-21b3-4ea8-90db-20249a5be2a7

2019
[27]

arXiv preprint arXiv:2304.11127 , year=

Watanabe, S. Tree-structured parzen estimator: Understanding its algorithm components and their roles for better empirical performance.arXiv preprint arXiv:2304.11127(2023)

work page arXiv 2023
[28]

& Koyama, M

Akiba, T., Sano, S., Yanase, T., Ohta, T. & Koyama, M. Optuna: A next- generation hyperparameter optimization framework. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2019). Pp. 2623–2631

2019
[29]

& Wimmer, M

Bordeleau, F., Combemale, B., Eramo, R., van den Brand, M. & Wimmer, M. Towards model-driven digital twin engineering: Current opportunities and future challenges. Systems Modelling and Management (ICSMM 2020) (2020). Commu- nications in Computer and Information Science, Vol. 1262, pp. 43–54. Springer. doi: 10.1007/978-3-030-58167-1 4

work page doi:10.1007/978-3-030-58167-1 2020
[30]

Lehner, D.et al.Model-driven engineering for digital twins: a systematic mapping study.Software and Systems Modeling24, 1339–1377 (2025)

2025
[31]

Dalibor, M.et al.A cross-domain systematic mapping study on software engineering for digital twins.Journal of Systems and Software193, 111361 (2022)

2022
[32]

Kamburjan, E., Bencomo, N., Tapia Tarifa, S. L. & Johnsen, E. B. Declara- tive lifecycle management in digital twins. Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems 36 (2024). Pp. 353–363

2024
[33]

& Haugen, Ø

Mertens, J., Klikovits, S., Bordeleau, F., Denil, J. & Haugen, Ø. Continuous evolution of digital twins using the dartwin notation.Software and Systems Modeling(2024)

2024
[34]

Proceedings of the 19th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (2024)

Kamburjan, E.et al.GreenhouseDT: An exemplar for digital twins. Proceedings of the 19th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (2024). Pp. 175–181

2024
[35]

& Wolfert, S

Verdouw, C., Tekinerdogan, B., Beulens, A. & Wolfert, S. Digital twins in smart farming.Agricultural Systems189, 103046 (2021)

2021
[36]

A., Fan, Y

Sisson, S. A., Fan, Y. & Beaumont, M. A. (eds)Handbook of Approximate Bayesian Computation(Chapman and Hall/CRC, 2018)

2018
[37]

S., Nonnenmacher, M

Greenberg, D. S., Nonnenmacher, M. & Macke, J. H. Automatic posterior trans- formation for likelihood-free inference. International Conference on Machine Learning (ICML) (2019). PMLR, pp. 2404–2414

2019
[38]

International Workshop on Digital Twin for Healthcare (2025)

Hoang, T.-D.et al.A real-time digital twin for type 1 diabetes using simulation- based inference. International Workshop on Digital Twin for Healthcare (2025). Pp. 35–46. Springer

2025
[39]

& Hafver, A

Agrell, C., Rognlien Dahl, K. & Hafver, A. Optimal sequential decision making with probabilistic digital twins: Theoretical foundations.SN Applied Sciences5, 114 (2023)

2023
[40]

Jain, M.et al.Gflownets for ai-driven scientific discovery.Digital Discovery2, 557–577 (2023). 37

2023