Interventional Flow Matching: Prospective Dose-Response Forecasting with Velocity-Field Jacobian Regularization

Amirreza Dolatpour Fathkouhi; Heman Shakeri; Justin Lee

arxiv: 2606.29386 · v1 · pith:4TXKAOW2new · submitted 2026-06-28 · 💻 cs.LG

Pith reviewed 2026-06-30 08:10 UTC · model grok-4.3

keywords: interventional forecasting, flow matching, glucose management, type 1 diabetes, velocity field, Jacobian regularization, prospective prediction, dose response

The pith

Interventional Flow Matching forecasts glucose trajectories under planned insulin and carbohydrate sequences by regularizing the velocity field's Jacobian to enforce signed dose-bounded sensitivities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses prospective forecasting of physiological responses to planned treatments, which differs from standard time-series prediction because future drivers like insulin doses depend on clinical decisions. It introduces Interventional Flow Matching, a flow-matching model in latent glucose space conditioned on patient history and planned future drivers. Instead of embedding mechanistic ODEs or using rollout simulations, it adds a solver-free penalty on the Jacobian of the instantaneous velocity field with respect to smoothed treatment drivers. This penalty directly imposes that insulin lowers glucose and carbohydrates raise it within plausible bounds. On a simulated type 1 diabetes cohort, the approach shows the strongest balance between fitting observed data and producing correct directional and ranked responses to interventions.
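As a rough illustration of the flow-matching ingredient described above, the sketch below computes a conditional flow-matching regression loss on a straight-line interpolation path, which is one common choice in the flow-matching literature. The paper's actual path, latent encoding, and network are not given on this page, so `toy_velocity`, its weights, the scalar latent, and the batch values are hypothetical stand-ins.

```python
import random

def toy_velocity(z_t, t, history_mean, insulin, carbs, w):
    """Hypothetical linear velocity model (stand-in for a learned network)."""
    return (w[0] * z_t + w[1] * t + w[2] * history_mean
            + w[3] * insulin + w[4] * carbs + w[5])

def flow_matching_loss(batch, w, rng):
    """Mean squared error between model velocity and the straight-line target.

    For each sample, draw t ~ U(0, 1) and noise z0 ~ N(0, 1), form the point
    z_t = (1 - t) * z0 + t * z1 on the linear path, and regress the velocity
    onto the path derivative z1 - z0.
    """
    total = 0.0
    for z1, history_mean, insulin, carbs in batch:
        t = rng.random()
        z0 = rng.gauss(0.0, 1.0)
        z_t = (1.0 - t) * z0 + t * z1
        target = z1 - z0
        pred = toy_velocity(z_t, t, history_mean, insulin, carbs, w)
        total += (pred - target) ** 2
    return total / len(batch)

rng = random.Random(0)
# Assumed tuple layout: (latent glucose target, history summary,
# smoothed insulin, smoothed carbs). Values are made up.
batch = [(0.2, 0.1, 0.5, 0.0), (-0.3, 0.0, 0.0, 1.0), (0.1, 0.2, 0.3, 0.4)]
weights = [0.0, 0.0, 0.0, -0.1, 0.1, 0.0]
loss = flow_matching_loss(batch, weights, rng)
```

The loss only needs pointwise evaluations of the velocity field, which is why a pointwise penalty on that same field can be added without running an ODE solver during training.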

Core claim

IFM conditions a continuous-time flow-matching velocity field on patient history and planned future drivers in bounded latent glucose space; penalizing the Jacobian of this velocity field with respect to smoothed treatment drivers imposes signed, dose-bounded local sensitivities so that the learned dynamics produce physiologically correct prospective forecasts without explicit glucose-insulin ODEs or causality-enforcing rollouts.

What carries the argument

Jacobian regularization of the instantaneous velocity field with respect to smoothed treatment drivers, which imposes signed, dose-bounded local sensitivities directly on the learned dynamics.
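A minimal sketch of what a signed, dose-bounded sensitivity penalty could look like, assuming a scalar latent and central finite differences in place of automatic differentiation. The hinge form, the bound values, and `toy_velocity` are illustrative assumptions and not the paper's exact regularizer.

```python
def toy_velocity(z, t, insulin, carbs, w_ins, w_carb):
    """Hypothetical velocity field: decay toward 0 plus linear driver terms."""
    return -0.5 * z + w_ins * insulin + w_carb * carbs

def driver_sensitivities(velocity, z, t, insulin, carbs, eps=1e-5):
    """Central finite-difference estimates of dv/d(insulin) and dv/d(carbs)."""
    d_ins = (velocity(z, t, insulin + eps, carbs)
             - velocity(z, t, insulin - eps, carbs)) / (2 * eps)
    d_carb = (velocity(z, t, insulin, carbs + eps)
              - velocity(z, t, insulin, carbs - eps)) / (2 * eps)
    return d_ins, d_carb

def relu(x):
    return x if x > 0.0 else 0.0

def signed_bounded_penalty(d_ins, d_carb, ins_bound, carb_bound):
    """Squared hinge penalty.

    Zero when -ins_bound <= d_ins <= 0 and 0 <= d_carb <= carb_bound,
    positive otherwise.
    """
    return (relu(d_ins) ** 2 + relu(-ins_bound - d_ins) ** 2
            + relu(-d_carb) ** 2 + relu(d_carb - carb_bound) ** 2)

# A field with the expected signs: insulin lowers, carbs raise.
good = lambda z, t, i, c: toy_velocity(z, t, i, c, w_ins=-0.3, w_carb=0.2)
# A field with the wrong insulin sign.
bad = lambda z, t, i, c: toy_velocity(z, t, i, c, w_ins=0.3, w_carb=0.2)

good_penalty = signed_bounded_penalty(
    *driver_sensitivities(good, 0.1, 0.5, 1.0, 1.0), ins_bound=1.0, carb_bound=1.0)
bad_penalty = signed_bounded_penalty(
    *driver_sensitivities(bad, 0.1, 0.5, 1.0, 1.0), ins_bound=1.0, carb_bound=1.0)
```

In this toy setup the correctly signed field incurs no penalty, while the wrongly signed one is penalized by the square of its insulin sensitivity. The penalty is evaluated at a single point of the velocity field, with no trajectory rollout.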

If this is right

Forecasts respond correctly in sign and magnitude to both insulin (lowering) and carbohydrate (raising) drivers.
The model maintains high directional and ranking consistency across different planned driver levels.
Observational RMSE remains competitive while interventional response metrics improve.
No explicit mechanistic equations or rollout-based causality enforcement are required.
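The page does not define the directional and ranking consistency metrics, so the sketch below shows only one plausible reading: directional consistency as the fraction of adjacent dose steps that move the forecast in the expected direction, and ranking consistency as the fraction of concordant dose pairs. Both function names and the example numbers are hypothetical.

```python
def directional_consistency(responses, expected_sign):
    """Fraction of adjacent dose steps whose response change has the expected sign.

    responses: predicted glucose summaries ordered by increasing dose.
    expected_sign: -1 for insulin (more dose, lower glucose), +1 for carbs.
    """
    steps = list(zip(responses, responses[1:]))
    correct = sum(1 for lo, hi in steps if (hi - lo) * expected_sign > 0)
    return correct / len(steps)

def ranking_consistency(doses, responses, expected_sign):
    """Fraction of dose pairs whose responses are ordered as expected."""
    concordant = 0
    total = 0
    for i in range(len(doses)):
        for j in range(i + 1, len(doses)):
            if doses[i] == doses[j]:
                continue  # ties in dose carry no ordering information
            total += 1
            if (doses[j] - doses[i]) * (responses[j] - responses[i]) * expected_sign > 0:
                concordant += 1
    return concordant / total

# Made-up insulin sweep: the last dose level breaks monotonicity.
insulin_doses = [0.0, 1.0, 2.0, 4.0]
predicted_glucose = [180.0, 165.0, 150.0, 155.0]

dir_score = directional_consistency(predicted_glucose, expected_sign=-1)
rank_score = ranking_consistency(insulin_doses, predicted_glucose, expected_sign=-1)
```

Here two of the three adjacent steps and five of the six dose pairs have the expected ordering, so the pairwise score is more forgiving of a single local reversal than the step-wise score.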

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The regularization approach could transfer to other settings where future drivers are policy-dependent, such as dosing in other chronic conditions.
Local sensitivity penalties might substitute for some forms of explicit causal structure in continuous-time generative models.
Real clinical datasets with documented intervention outcomes would provide a direct test of whether the enforced sensitivities align with observed physiology.

Load-bearing premise

Penalizing the Jacobian of the velocity field with respect to smoothed treatment drivers will impose signed, dose-bounded local sensitivities that match real physiology.

What would settle it

A controlled test in which an insulin dose is applied and the model's predicted glucose change has the opposite sign or exceeds known physiological bounds while still fitting observational data.

Figures

Figures reproduced from arXiv: 2606.29386 by Amirreza Dolatpour Fathkouhi, Heman Shakeri, Justin Lee.

**Figure 2.** Physiological smoothing of a single unit impulse at ...

**Figure 3.** Carbohydrate-bound sweep with insulin bounds fixed at ...

**Figure 4.** Insulin-bound sweep with carbohydrate bounds fixed at ...

**Figure 5.** Effect of the skew parameter κ. Increasing κ changes the latent glucose encoding and the decoder-slope scaling of the Jacobian regularizer, leading to lower observed-driver RMSE and higher intervention sensitivity magnitudes. Strict directional and ranking consistency remain stable across the sweep.

**Figure 6.** Example 24-step (2-hour) interventional forecasts from IFM under different planned insulin and ...

Original abstract

Predicting a patient's physiological trajectory under a planned treatment sequence is a prospective interventional problem, not standard time-series extrapolation. We study this problem in glucose management, where insulin and carbohydrate records are policy-dependent: future drivers are coupled to patient state, behavior, and clinical decision rules, so observational forecasting accuracy alone does not guarantee correct responses to planned interventions. We introduce Interventional Flow Matching (IFM), a continuous-time generative framework for physiologically constrained prospective forecasting. IFM conditions a flow-matching velocity field on patient history and planned future drivers in a bounded latent glucose space. Rather than embedding strict mechanistic glucose-insulin ODE equations or enforcing causality through rollout-based simulations, IFM uses a solver-free regularization: it penalizes the Jacobian of the instantaneous velocity field with respect to smoothed treatment drivers. This imposes signed, dose-bounded local sensitivities directly on the learned dynamics: insulin lowers glucose, carbohydrates raise it, and both responses remain within plausible ranges. On a simulated UVA/Padova type 1 diabetes cohort, IFM achieves the strongest balance between observed-driver RMSE and interventional response metrics. Across experiments, it consistently produces physiologically correct responses to both insulin and carbohydrate drivers while maintaining high directional and ranking consistency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

IFM's Jacobian regularization on the velocity field is a clean way to add interventional constraints without ODEs or rollouts, but the abstract gives no evidence that local penalties produce correct integrated trajectories over multi-hour horizons.

The core idea is to condition a flow-matching model on planned future insulin and carb inputs, then add a solver-free penalty on the Jacobian of the velocity field with respect to those drivers. This is meant to force the learned dynamics to respect signed, bounded sensitivities (insulin down, carbs up) directly in the continuous-time generative process. That combination is new in this setting and sidesteps both embedding full glucose-insulin ODEs and running expensive rollout simulations.

The paper does a reasonable job stating the distinction between observational forecasting and prospective interventional prediction, and the UVA/Padova simulator experiments are the right testbed for this kind of claim. The regularization is lightweight and avoids some of the usual machinery in causal time-series work.

The main weakness is the one the stress-test note flags. A first-order Jacobian penalty at each instant does not automatically control the cumulative effect of the flow ODE over several hours, especially once the drivers become policy-dependent and are no longer smoothed. The abstract claims physiologically correct responses and good directional consistency, but it supplies no numbers, no ablation of the Jacobian term, and no explicit verification that the generated trajectories match the simulator's integrated response rather than just local correlations. Without those checks it is impossible to know whether the regularization is doing the claimed work or whether the model is simply fitting the simulator in other ways.
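The kind of integrated-response check asked for above can be sketched by comparing integrated samples under two planned doses while holding the starting noise fixed. This is a toy only: it assumes a scalar latent, a hypothetical linear `toy_velocity`, a constant driver, and forward Euler over flow time, none of which come from the paper.

```python
def toy_velocity(z, t, insulin, w_ins=-0.3):
    """Hypothetical velocity field with a negative insulin sensitivity."""
    return -0.5 * z + w_ins * insulin

def integrate_flow(z0, insulin, n_steps=100):
    """Forward-Euler integration of dz/dt = v(z, t, insulin) from t=0 to t=1."""
    h = 1.0 / n_steps
    z = z0
    for k in range(n_steps):
        z += h * toy_velocity(z, k * h, insulin)
    return z

z0 = 0.4  # shared starting sample, so only the driver differs between runs
z_no_dose = integrate_flow(z0, insulin=0.0)
z_dose = integrate_flow(z0, insulin=1.0)

# Integrated (not instantaneous) effect of the dose on the final latent state.
integrated_effect = z_dose - z_no_dose
```

In this linear scalar toy the negative local sensitivity carries through to a negative integrated effect. For a learned, higher-dimensional, nonlinear field, the same paired comparison would have to be run empirically against simulator trajectories.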

This is for people working on generative models for clinical time series who want to move past pure observational accuracy. A reader already familiar with flow matching or continuous normalizing flows will see the technical move clearly. The work is coherent on its own terms and engages the right literature, so it is worth sending to referees even though the current evidence is thin. Ask for the missing quantitative results and an explicit integrated-response check before deciding on acceptance.

Referee Report

3 major / 2 minor

Summary. The paper introduces Interventional Flow Matching (IFM), a continuous-time generative framework for prospective dose-response forecasting in glucose management. It conditions a flow-matching velocity field on patient history and planned future drivers in a bounded latent space, using a solver-free Jacobian penalty on the instantaneous velocity field w.r.t. smoothed treatment drivers to enforce signed, dose-bounded local sensitivities (insulin lowers glucose, carbohydrates raise it) without explicit mechanistic ODEs or rollout simulations. On a simulated UVA/Padova type 1 diabetes cohort, IFM is reported to achieve the strongest balance between observed-driver RMSE and interventional response metrics while producing physiologically correct responses with high directional and ranking consistency.

Significance. If the central claim holds, the work offers a practical alternative to mechanistic modeling or simulation-based causality enforcement for interventional forecasting tasks where future drivers are policy-dependent. The solver-free regularization approach could generalize to other domains requiring constrained generative dynamics, and the emphasis on prospective rather than observational metrics addresses a key gap in time-series modeling for healthcare.

major comments (3)

[Abstract and §3 (Method)] The abstract and method description claim that the Jacobian regularization 'imposes signed, dose-bounded local sensitivities directly on the learned dynamics' whose integrated trajectories match simulator physiology, but no explicit verification is provided that local first-order constraints control cumulative effects over multi-hour forecast horizons (as opposed to only instantaneous correlations). This is load-bearing for the prospective interventional claim.
[§4 (Experiments)] The reported experimental results on the UVA/Padova cohort state superior balance on interventional metrics and physiologically correct responses, yet the provided description contains no quantitative values, error bars, ablation details on the Jacobian term, or direct comparison of integrated trajectories under planned interventions versus simulator ground truth. Without these, the central empirical claim cannot be assessed.
[§3.2 (Regularization) and §4.3 (Interventional evaluation)] The regularization is applied to smoothed drivers, but the paper does not address how the learned dynamics behave when drivers are policy-dependent and non-smoothed during actual prospective use; this gap risks drift in the integrated flow-matching ODE over horizons where the assumption of smoothing no longer holds.
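For readers unfamiliar with driver smoothing, the sketch below shows one way a single unit dose could be spread over later time steps with a causal kernel. The exponential kernel, its time constant, and the 24-step window are assumptions for illustration; the page does not state which smoothing the paper uses.

```python
import math

def exponential_kernel(tau_steps, length):
    """Normalized causal kernel with k[i] proportional to exp(-i / tau_steps)."""
    raw = [math.exp(-i / tau_steps) for i in range(length)]
    total = sum(raw)
    return [r / total for r in raw]

def smooth_driver(driver, kernel):
    """Causal convolution: each output uses only the current and past doses."""
    out = []
    for n in range(len(driver)):
        acc = 0.0
        for i, k in enumerate(kernel):
            if n - i >= 0:
                acc += k * driver[n - i]
        out.append(acc)
    return out

impulse = [0.0] * 24
impulse[3] = 1.0  # single unit dose at step 3 (hypothetical)
smoothed = smooth_driver(impulse, exponential_kernel(tau_steps=4.0, length=12))
```

Because the kernel is normalized and fits inside the window, the smoothed driver keeps the total dose while replacing the spike with a decaying profile, which gives a derivative-based penalty a better-behaved input than a raw impulse.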

minor comments (2)

[Abstract] Notation for the velocity field and Jacobian is introduced without an explicit equation reference in the abstract; adding a pointer to the defining equation would improve readability.
[Abstract and §4] The description of 'high directional and ranking consistency' would benefit from a precise definition or reference to the metric used.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We respond to each major comment below and indicate the revisions we will make to strengthen the presentation of the interventional claims.

Referee: [Abstract and §3 (Method)] The abstract and method description claim that the Jacobian regularization 'imposes signed, dose-bounded local sensitivities directly on the learned dynamics' whose integrated trajectories match simulator physiology, but no explicit verification is provided that local first-order constraints control cumulative effects over multi-hour forecast horizons (as opposed to only instantaneous correlations). This is load-bearing for the prospective interventional claim.

Authors: We agree that an explicit link between the instantaneous Jacobian constraints and cumulative multi-hour effects would strengthen the central claim. While the experiments already evaluate integrated trajectories under interventions, we will add a new subsection in the revision that directly verifies propagation of the local signed sensitivities to cumulative glucose changes over 4-6 hour horizons, with quantitative comparisons to simulator ground truth.

revision: yes

Referee: [§4 (Experiments)] The reported experimental results on the UVA/Padova cohort state superior balance on interventional metrics and physiologically correct responses, yet the provided description contains no quantitative values, error bars, ablation details on the Jacobian term, or direct comparison of integrated trajectories under planned interventions versus simulator ground truth. Without these, the central empirical claim cannot be assessed.

Authors: The full §4 contains tables and figures reporting the quantitative interventional metrics, RMSE values, directional consistency scores, and error bars across the cohort. To make these immediately assessable, we will expand the text in the revision to quote the key numerical results, include an explicit Jacobian ablation table, and add trajectory comparison plots against simulator ground truth for planned interventions.

revision: yes

Referee: [§3.2 (Regularization) and §4.3 (Interventional evaluation)] The regularization is applied to smoothed drivers, but the paper does not address how the learned dynamics behave when drivers are policy-dependent and non-smoothed during actual prospective use; this gap risks drift in the integrated flow-matching ODE over horizons where the assumption of smoothing no longer holds.

Authors: Smoothing is used solely during training to obtain reliable Jacobian estimates. At inference the velocity field is integrated directly with the (non-smoothed) planned drivers. We will add a paragraph in §3.2 and supporting experiments in §4.3 showing that the enforced local constraints remain effective and limit drift for non-smoothed inputs over the forecast horizons used in the study.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained against simulator benchmarks

The paper defines IFM via a Jacobian penalty on the velocity field w.r.t. smoothed drivers to impose local signed sensitivities, then validates prospective interventional forecasts against independent UVA/Padova simulator ground truth using RMSE, directional, and ranking metrics. No equation or claim reduces the reported interventional response metrics to the regularization parameters by construction; the evaluation uses external simulator trajectories rather than quantities defined from the Jacobian term itself. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling appear in the abstract or description. The central claim therefore retains independent empirical content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the domain assumption that the Jacobian penalty produces physiologically correct signed responses; no free parameters or invented entities are mentioned in the abstract.

axioms (1)

domain assumption: Penalizing the Jacobian of the velocity field with respect to smoothed treatment drivers imposes signed, dose-bounded local sensitivities matching real physiology.
Directly stated as the mechanism that replaces explicit ODEs or rollout simulations.

[1] [1]

Graph network simulators can learn discontinuous, rigid contact dynamics

Kelsey R Allen, Tatiana Lopez Guevara, Yulia Rubanova, Kim Stachenfeld, Alvaro Sanchez- Gonzalez, Peter Battaglia, and Tobias Pfaff. Graph network simulators can learn discontinuous, rigid contact dynamics. InConference on Robot Learning, pages 1157–1167. PMLR, 2023

2023

[2] [2]

Causal Regularization

Mohammad Taha Bahadori, Krzysztof Chalupka, Edward Choi, Robert Chen, Walter F Stewart, and Jimeng Sun. Causal regularization.arXiv preprint arXiv:1702.02604, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[3] [3]

Alaa, James Jordon, and Mihaela van der Schaar

Ioana Bica, Ahmed M. Alaa, James Jordon, and Mihaela van der Schaar. Estimating coun- terfactual treatment outcomes over time through adversarially balanced representations. In International Conference on Learning Representations, 2020

2020

[4] [4]

Physical activity into the meal glucose—insulin model of type 1 diabetes: In silico studies, 2009

Chiara Dalla Man, Marc D Breton, and Claudio Cobelli. Physical activity into the meal glucose—insulin model of type 1 diabetes: In silico studies, 2009

2009

[5] [5]

Hongwei Dong, Fangzhou Han, Lingyu Si, Wenwen Qiang, Ruiheng Zhang, and Lamei Zhang. Background debiased SAR automatic target recognition via a novel causal interventional regularizer.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 17:16993–17006, 2024

2024

[6] [6]

The stationarity bias: Stratified stress-testing for time-series imputation in regulated dynamical systems.arXiv preprint arXiv:2602.15637, 2026

Amirreza Dolatpour Fathkouhi, Alireza Namazi, and Heman Shakeri. The stationarity bias: Stratified stress-testing for time-series imputation in regulated dynamical systems.arXiv preprint arXiv:2602.15637, 2026

work page arXiv 2026

[7] [7]

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023

[8] [8]

On the parameterization and initialization of diagonal state space models.Advances in Neural Information Processing Systems, 35:35971–35983, 2022

Albert Gu, Karan Goel, Ankit Gupta, and Christopher Ré. On the parameterization and initialization of diagonal state space models.Advances in Neural Information Processing Systems, 35:35971–35983, 2022

2022

[9] [9]

Nonlinear model predictive control of glucose concentration in subjects with type 1 diabetes.Physiological Measurement, 25(4):905–920, 2004

Roman Hovorka, Valentina Canonico, Ludovic J Chassin, Ulrich Haueter, Massimo Massi- Benedetti, Marco Orsini Federici, Thomas R Pieber, Helga C Schaller, Lukas Schaupp, Thomas Vering, et al. Nonlinear model predictive control of glucose concentration in subjects with type 1 diabetes.Physiological Measurement, 25(4):905–920, 2004

2004

[10] [10]

Temporal convolutional networks: A unified approach to action segmentation

Colin Lea, Rene Vidal, Austin Reiter, and Gregory D Hager. Temporal convolutional networks: A unified approach to action segmentation. InEuropean Conference on Computer Vision, pages 47–54. Springer, 2016. 10

2016

[11] [11]

Short- comings in the evaluation of blood glucose forecasting.IEEE Transactions on Biomedical Engineering, 71(12):3424–3431, 2024

Jung Min Lee, Rodica Pop-Busui, Joyce M Lee, Jesper Fleischer, and Jenna Wiens. Short- comings in the evaluation of blood glucose forecasting.IEEE Transactions on Biomedical Engineering, 71(12):3424–3431, 2024

2024

[12] [12]

Sow, Piyush Madan, Jun Li, Mohamed Ghalwash, Zach Shahn, and Li-wei Lehman

Rui Li, Stephanie Hu, Mingyu Lu, Yuria Utsumi, Prithwish Chakraborty, Daby M. Sow, Piyush Madan, Jun Li, Mohamed Ghalwash, Zach Shahn, and Li-wei Lehman. G-net: A recurrent network approach to G-computation for counterfactual prediction under a dynamic treatment regime. In Subhrajit Roy, Stephen Pfohl, Emma Rocheteau, Girmaw Abebe Tadesse, Luis Oala, Fabi...

2021

[13] Zijie Li, Kazem Meidani, Prakarsh Yadav, and Amir Barati Farimani. Graph neural networks accelerated molecular dynamics. The Journal of Chemical Physics, 156(14), 2022

[14] Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. In 11th International Conference on Learning Representations, ICLR 2023, 2023

[15] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations, 2017

[16] Chiara Dalla Man, Francesco Micheletto, Dayu Lv, Marc Breton, Boris Kovatchev, and Claudio Cobelli. The UVA/PADOVA type 1 diabetes simulator: new features. Journal of Diabetes Science and Technology, 8(1):26–34, 2014

[17] Valentyn Melnychuk, Dennis Frauen, and Stefan Feuerriegel. Causal transformer for estimating counterfactual outcomes. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato, editors, Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, pages 1529...

2022

[18] Andrew C Miller, Nicholas J Foti, and Emily Fox. Learning insulin-glucose dynamics in the wild. In Machine Learning for Healthcare Conference, pages 172–197. PMLR, 2020

[19] Lizhen Nie, Mao Ye, Qiang Liu, and Dan Nicolae. VCNet and functional targeted regularization for learning causal effects of continuous treatments. In International Conference on Learning Representations, 2021

[20] Judea Pearl. Causality. Cambridge University Press, 2009

[21] Tobias Pfaff, Meire Fortunato, Alvaro Sanchez-Gonzalez, and Peter W Battaglia. Learning mesh-based simulation with graph networks. In International Conference on Learning Representations, 2021

[22] Davide Scassola, Sebastiano Saccani, and Luca Bortolussi. Graph-conditional flow matching for relational data generation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 25209–25217, 2026

[23] Heman Shakeri. The driver-blindness phenomenon: Why deep sequence models default to autocorrelation in blood glucose forecasting. arXiv preprint arXiv:2511.20601, 2025

[24] Ella Tamir, Najwa Laabid, Markus Heinonen, Vikas Garg, and Arno Solin. Conditional flow matching for time series modelling. In ICML 2024 Workshop on Structured Probabilistic Inference & Generative Modeling, 2024

[25] Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, and Ren Ng. Fourier features let networks learn high frequency functions in low dimensional domains. Advances in Neural Information Processing Systems, 33:7537–7547, 2020

[26] Dongze Wu, Feng Qiu, and Yao Xie. Doflow: Flow-based generative models for interventional and counterfactual forecasting on time series. In The 14th International Conference on Learning Representations, 2025

[27] Tailin Wu, Qinchen Wang, Yinan Zhang, Rex Ying, Kaidi Cao, Rok Sosic, Ridwan Jalali, Hassan Hamam, Marko Maucec, and Jure Leskovec. Learning large-scale subsurface simulations with a hybrid graph network simulator. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 4184–4194, 2022

[28] Kevin Muyuan Xia, Yushu Pan, and Elias Bareinboim. Neural causal models for counterfactual identification and estimation. In 11th International Conference on Learning Representations, ICLR 2023, 2023

[29] Biao Zhang and Rico Sennrich. Root mean square layer normalization. Advances in Neural Information Processing Systems, 32, 2019

[30] Fan Zhang, Yi Wei, Naye Ji, Jingmei Wu, Zhaohan Wang, Fuxing Gao, Liuqing Zhang, Zhenqing Ye, Leyao Yan, Lanxin Dai, et al. Dim-gestor: Co-speech gesture generation with adaptive layer normalization mamba-2. In 2025 International Conference on Computer Vision, Image Processing and Computational Photography (CVIP), pages 01–13. IEEE, 2025

[31] Bob Junyi Zou, Matthew E Levine, Dessi P Zaharieva, Ramesh Johari, and Emily Fox. Hybrid2 neural ODE causal modeling and an application to glycemic response. In International Conference on Machine Learning, pages 62934–62963. PMLR, 2024

[32] Bob Junyi Zou and Lu Tian. Automatic and structure-aware sparsification of hybrid neural ODEs with application to glucose prediction. In The 14th International Conference on Learning Representations, 2026

A Architecture Details

A.1 Encoder Architecture

The encoder receives $X_{1:L} \in \mathbb{R}^{B \times L \times d_{\mathrm{in}}}$ and projects each time step to the model dimension: $R^{(0)}_{1:L} =$ Line...

Justification: The paper does not involve crowdsourcing or research with human subjects; experiments use simulated glucose-management data

Institutional review board (IRB) approvals or equivalent for research with human subjects

Question: Does the paper describe potential risks incurred by study participants, whether such risks were disclosed to the subjects, and whether Institutional Review Board (IRB) approvals (or an equivalent approval/review based on the requirements of your country ...