pith. machine review for the scientific record.

arxiv: 2605.07485 · v1 · submitted 2026-05-08 · 💻 cs.LG · cs.AI

Recognition: 2 theorem links

· Lean Theorem

Excluding the Target Domain Improves Extrapolation: Deconfounded Hierarchical Physics Constraints

Tsuyoshi Okita

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 01:47 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI
keywords extrapolation · physics-constrained models · deconfounding · hierarchical constraints · out-of-distribution generalization · battery temperature · Fourier neural operators

The pith

Excluding target-domain data from pretraining improves extrapolation by 39 percent in physics-constrained models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to solve poor extrapolation in physics-constrained generative models when test conditions differ from training data. It introduces a gate that detects and removes temperature confounding at successive levels of physical rules before enforcing those rules from coarse to fine. A central result is that withholding the target temperature range during pretraining lets the model learn more general physical patterns, cutting error compared with including that data. This matters for tasks such as forecasting battery temperatures across wide environmental ranges where new conditions appear at deployment. If the approach holds, models could maintain accuracy when physical laws interact with shifting external variables without retraining on every new scenario.

Core claim

The Deconfounded Hierarchical Gate identifies when temperature confounding affects each physical constraint level through counterfactual estimation with the do-operator and backdoor adjustment, then enforces constraints progressively from coarse to fine. Pretraining without target-domain data yields RMSE of 0.224 versus 0.324 when target data is included, a 39 percent gain in extrapolation; on the lithium-ion battery benchmark trained at 24 degrees Celsius and tested at 4 to 43 degrees Celsius the method reaches RMSE 0.215, a 46 percent improvement over the unconstrained baseline of 0.397.
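As a quick sanity check (values copied from the abstract; nothing here is recomputed from data), the 46 percent figure is the standard relative-improvement arithmetic:

```python
# Reported RMSE scores from the paper's abstract (not recomputed here).
baseline_rmse = 0.397   # unconstrained Pure CFM
method_rmse = 0.215     # DHG-constrained model

# Relative improvement over the baseline, as a fraction of baseline RMSE.
improvement = (baseline_rmse - method_rmse) / baseline_rmse
print(f"relative improvement: {improvement:.1%}")  # → 45.8%, reported as 46%
```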

What carries the argument

The Deconfounded Hierarchical Gate (DHG), a mechanism that combines do-operator counterfactual estimation and backdoor adjustment to isolate intrinsic physical inconsistency from temperature confounding before applying hierarchical constraints progressively.
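No code accompanies the material reviewed here, so the following is a schematic reading of the gate, with invented names (`adjusted_mean_residual`, `gate`) and a single binned confounder: the observed physics residual at each constraint level is compared with its backdoor-adjusted counterpart, and only the part that survives adjustment drives the gate.

```python
import numpy as np

def adjusted_mean_residual(residuals, levels, temps, level, temp_bins):
    """Backdoor estimate of E[residual | do(level)]: average the
    temperature-conditional residual means at this constraint level,
    weighted by the marginal temperature distribution. Schematic only;
    temperature is discretized into bins."""
    bin_ids = np.digitize(temps, temp_bins)
    adjusted = 0.0
    for b in range(len(temp_bins) + 1):
        p_b = float(np.mean(bin_ids == b))           # marginal P(T in bin b)
        mask = (levels == level) & (bin_ids == b)
        if mask.any():                               # P(b) * E[R | level, b]
            adjusted += p_b * float(residuals[mask].mean())
    return adjusted

def gate(observed, adjusted, eps=1e-8):
    """Gate opens toward 1 when the inconsistency survives
    deconfounding, i.e. is not explained away by temperature."""
    return adjusted / (observed + eps)
```

With temperatures balanced across levels the adjusted and observed residuals coincide and the gate stays near 1; it is only under confounded sampling that the two diverge.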

If this is right

  • Hierarchical constraints applied progressively outperform a single static regularization term across the generation process.
  • Fourier Neural Operators capture domain-agnostic physical patterns more effectively when target-domain examples are withheld from pretraining.
  • Backdoor adjustment at each constraint level isolates genuine physical violations from spurious temperature effects.
  • The method delivers RMSE of 0.215 on the battery temperature extrapolation task versus 0.397 for the unconstrained baseline.
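The first bullet, progressive rather than static constraint weighting, can be sketched as a schedule (a minimal illustration with an assumed linear ramp and three levels; the paper's actual schedule is not given in the abstract):

```python
def constraint_weights(step, total_steps, n_levels=3):
    """Coarse-to-fine weighting over the generation process: level 0
    (the coarsest physical law) ramps in first, finer levels follow.
    Returns one weight in [0, 1] per level."""
    t = step / total_steps
    # level k switches on at t = k / n_levels and ramps linearly to 1
    return [min(1.0, max(0.0, (t - k / n_levels) * n_levels))
            for k in range(n_levels)]

# The per-step physics loss would then be
#   sum(w * residual(level) for w, level in zip(weights, levels))
# replacing a single static regularization term.
```

By the end of generation all levels are fully enforced; a static regularizer corresponds to holding every weight constant throughout.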

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same exclusion strategy during pretraining could apply to other generative models facing distribution shifts driven by measurable external variables.
  • Deconfounding at multiple hierarchy levels might prove useful in non-battery physics domains where similar confounding structures appear.
  • Testing the gate on tasks without an obvious single confounder would clarify how much the temperature-specific adjustment contributes to the overall gain.

Load-bearing premise

Temperature is the main confounder that can be removed via do-operator and backdoor adjustment without introducing new inconsistencies into the enforcement of physical laws.
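The premise invokes the standard backdoor formula; with temperature T as the sole adjusted confounder, the deconfounded effect of an intervention on a constraint-level variable X on the inconsistency signal Y would read (the abstract does not state the actual adjustment set, so taking T alone is this review's assumption):

```latex
P(Y \mid \mathrm{do}(X = x)) \;=\; \sum_{t} P(Y \mid X = x,\, T = t)\, P(T = t)
```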

What would settle it

An experiment that includes target-domain temperature data in pretraining and obtains equal or lower extrapolation RMSE than the version that excludes it would contradict the reported pretraining benefit.
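That test is a two-arm pretraining ablation. As a sketch (the `train_fn` and `eval_rmse` callables are placeholders, not the paper's code):

```python
def pretraining_ablation(train_fn, eval_rmse, source_data, target_data):
    """Train once withholding the target domain and once including it,
    then compare extrapolation RMSE on the same held-out target test set.
    The reported benefit is contradicted if the 'included' arm matches
    or beats the 'excluded' arm."""
    model_excluded = train_fn(source_data)
    model_included = train_fn(source_data + target_data)
    rmse_excluded = eval_rmse(model_excluded)
    rmse_included = eval_rmse(model_included)
    return {
        "excluded": rmse_excluded,
        "included": rmse_included,
        "claim_contradicted": rmse_included <= rmse_excluded,
    }
```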

Figures

Figures reproduced from arXiv: 2605.07485 by Tsuyoshi Okita.

Figure 1. Overview of the HPC-FNO-CFM framework. Left: FNO(1) pretrained on multi-condition battery data (Stage 1) using spectral convolution to learn condition-dependent physical patterns; parameters are frozen after Stage 1. Center: Condition-conditioned CFM generation network (Stage 2). The frozen FNO(1) provides physical guidance to the CFM velocity field through an Integration Layer, analogous to PDE guidance a… view at source ↗
read the original abstract

Extrapolation to out-of-distribution conditions is a fundamental challenge for physics-constrained deep generative models. Existing methods apply physical constraints as a single static regularization term uniformly across the generation process, and address neither the hierarchical structure of physical laws nor the confounding variable problem. We propose the Deconfounded Hierarchical Gate (DHG), which serves as a diagnostic and control mechanism: it identifies when and how strongly temperature confounding contaminates each constraint level, so that hierarchical gates reflect intrinsic physical inconsistency rather than spurious temperature effects. DHG combines counterfactual estimation via the do-operator with backdoor adjustment to remove confounding, then applies Coarse-to-Fine physical constraints progressively. We report a counter-intuitive finding in pretraining: excluding the target-domain data from pretraining outperforms including it by 39% in extrapolation performance (RMSE 0.224 vs. 0.324). This occurs because FNO learns domain-agnostic physical patterns that transfer more effectively when the target domain is withheld. On a lithium-ion battery temperature extrapolation benchmark (trained at 24 degrees Celsius, evaluated at 4.0--43.0 degrees Celsius), our method achieves RMSE = 0.215, a 46% improvement over the unconstrained baseline (Pure CFM: 0.397).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes the Deconfounded Hierarchical Gate (DHG) for physics-constrained deep generative models. DHG uses the do-operator and backdoor adjustment to identify and remove temperature confounding at each level of a hierarchy of physical constraints, enabling better extrapolation. A key empirical claim is that excluding target-domain data from pretraining improves extrapolation performance by 39% (RMSE 0.224 vs. 0.324). On a lithium-ion battery temperature extrapolation benchmark (train at 24°C, evaluate at 4–43°C), DHG achieves RMSE 0.215, a 46% improvement over the unconstrained Pure CFM baseline (0.397).

Significance. If the causal graph is correctly specified and the reported gains are robust to alternative adjustment sets and data splits, the work could meaningfully advance physics-informed generative modeling by separating intrinsic physical violations from confounding effects. The counter-intuitive pretraining result, if reproducible, would also challenge standard practice in domain-adaptive scientific ML.

major comments (3)
  1. [Abstract] Abstract: the 39% and 46% RMSE gains are stated without error bars, statistical tests, ablation tables, or any description of how the causal graph or adjustment set for temperature was chosen and validated. Because the entire deconfounding claim rests on backdoor adjustment being valid, the absence of this evidence is load-bearing for the central performance claims.
  2. [Method] Method description: backdoor adjustment is invoked to isolate temperature confounding at each hierarchical constraint level, yet no causal graph, no list of observed covariates (current, SOC, voltage, etc.), and no sensitivity check to alternative graphs are provided. If the graph is misspecified, the adjustment can leave residual confounding or introduce new bias, directly undermining the assertion that the gates reflect 'intrinsic physical inconsistency rather than spurious temperature effects.'
  3. [Results] Results section: the claim that 'FNO learns domain-agnostic physical patterns' when target data are withheld is presented as an explanation for the 39% gain, but no supporting analysis (e.g., feature visualizations, domain-invariance metrics, or controlled ablations that isolate the exclusion effect from the DHG component) is referenced.
minor comments (2)
  1. Notation for the hierarchical gates and the progressive Coarse-to-Fine loss terms should be introduced with explicit equations rather than high-level prose.
  2. [Abstract] The abstract would be clearer if it briefly named the other baselines beyond 'Pure CFM' and stated the number of random seeds or cross-validation folds used for the reported RMSE values.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which have helped clarify several aspects of our work. We address each major comment below and will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the 39% and 46% RMSE gains are stated without error bars, statistical tests, ablation tables, or any description of how the causal graph or adjustment set for temperature was chosen and validated. Because the entire deconfounding claim rests on backdoor adjustment being valid, the absence of this evidence is load-bearing for the central performance claims.

    Authors: We agree that the abstract would be strengthened by including statistical context for the reported gains. In the revised manuscript, we will add error bars to the RMSE figures and reference the statistical tests from the results section. We will also include a brief note on the adjustment set (current, SOC, voltage) selected from domain knowledge of battery thermal dynamics, with full details and ablations directed to the method and supplementary sections. revision: yes

  2. Referee: [Method] Method description: backdoor adjustment is invoked to isolate temperature confounding at each hierarchical constraint level, yet no causal graph, no list of observed covariates (current, SOC, voltage, etc.), and no sensitivity check to alternative graphs are provided. If the graph is misspecified, the adjustment can leave residual confounding or introduce new bias, directly undermining the assertion that the gates reflect 'intrinsic physical inconsistency rather than spurious temperature effects.'

    Authors: We thank the referee for this observation. The method section describes the do-operator and backdoor adjustment but lacks an explicit causal graph and covariate list. We will add a figure showing the causal graph with temperature as confounder and observed variables (current, SOC, voltage). A sensitivity analysis to alternative adjustment sets will be added to the supplementary material to demonstrate robustness and confirm that the gates primarily capture intrinsic physical inconsistencies. revision: yes

  3. Referee: [Results] Results section: the claim that 'FNO learns domain-agnostic physical patterns' when target data are withheld is presented as an explanation for the 39% gain, but no supporting analysis (e.g., feature visualizations, domain-invariance metrics, or controlled ablations that isolate the exclusion effect from the DHG component) is referenced.

    Authors: We acknowledge that additional analysis would better support the explanation for the pretraining result. In the revised results section, we will include feature visualizations of FNO representations and domain-invariance metrics (e.g., MMD) comparing pretraining regimes. Controlled ablations isolating the data-exclusion effect from DHG will also be added to substantiate the claim that withholding target data enables better domain-agnostic pattern learning. revision: yes
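The promised domain-invariance check could use, for example, an RBF-kernel maximum mean discrepancy between FNO feature samples from the two pretraining regimes; a generic sketch follows (the rebuttal names MMD but specifies no implementation):

```python
import numpy as np

def mmd_rbf(x, y, gamma=1.0):
    """Squared maximum mean discrepancy between sample sets x and y
    (rows are samples) under an RBF kernel. Values near zero indicate
    the two feature distributions are hard to tell apart."""
    def kernel(a, b):
        sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq_dists)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()
```

Applied to FNO features extracted under each pretraining regime, a lower MMD against held-out target-domain features would support the domain-agnostic-pattern explanation.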

Circularity Check

0 steps flagged

No circularity: empirical claims rest on benchmark comparisons, not self-referential derivations

full rationale

The provided text (abstract and description) introduces DHG as a combination of standard causal tools (do-operator, backdoor adjustment) with hierarchical constraints and reports numerical improvements on a lithium-ion battery extrapolation task. No equations, fitted parameters renamed as predictions, or self-citations are visible that would reduce any claimed result to its own inputs by construction. The counter-intuitive pretraining finding is stated as an observed outcome rather than a derived tautology, and the method description relies on external causal inference concepts without load-bearing self-references or ansatzes smuggled via prior author work. The derivation chain is therefore self-contained against the reported external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are detailed beyond the high-level description of DHG and causal operators.

pith-pipeline@v0.9.0 · 5521 in / 1086 out tokens · 32079 ms · 2026-05-11T01:47:32.853207+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 3 internal anchors

  1. [1]

    Shortest-path flow matching with mixture-conditioned bases for OOD generalization

    Alejandro Almodóvar et al. Shortest-path flow matching with mixture-conditioned bases for OOD generalization. arXiv preprint arXiv:2601.11827, 2026

  2. [2]

    DeCaFlow: A deconfounding causal generative model

    Alejandro Almodóvar, Adrián Javaloy, Juan Parras, Santiago Zazo, and Isabel Valera. DeCaFlow: A deconfounding causal generative model. In Advances in Neural Information Processing Systems (NeurIPS), 2025

  3. [3]

    Rademacher and Gaussian complexities: Risk bounds and structural results

    Peter L. Bartlett and Shahar Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463–482, 2002

  4. [4]

    Physics-informed diffusion models

    Jan-Hendrik Bastek, WaiChing Sun, and Dennis M. Kochmann. Physics-informed diffusion models. In Proceedings of the 12th International Conference on Learning Representations (ICLR), 2024

  5. [5]

    Dynaformer: A deep learning model for ageing-aware battery discharge prediction

    Luca Biggio, Tommaso Bendinelli, Chetan Kulkarni, and Olga Fink. Dynaformer: A deep learning model for ageing-aware battery discharge prediction. arXiv preprint arXiv:2206.02555, 2022

  6. [6]

    Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems

    Tianping Chen and Hong Chen. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks, 6(4):911–917, 1995

  7. [7]

    A simple framework for contrastive learning of visual representations

    Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In Proceedings of the 37th International Conference on Machine Learning (ICML), pages 1597–1607, 2020

  8. [8]

    Theory of Ordinary Differential Equations

    Earl A. Coddington and Norman Levinson. Theory of Ordinary Differential Equations. McGraw-Hill, 1955

  9. [9]

    Variational physics-informed neural operator (VINO) for solving partial differential equations

    Mehmet Serhat Eshaghi, Cosmin Anitescu, Manish Thombre, Yizheng Wang, Xiaoying Zhuang, and Timon Rabczuk. Variational physics-informed neural operator (VINO) for solving partial differential equations. Computer Methods in Applied Mechanics and Engineering, 437:117785, 2025

  10. [10]

    Lawrence C. Evans. Partial Differential Equations . American Mathematical Society, 2nd edition, 2010

  11. [11]

    Accelerated battery life testing dataset

    Kai Fricke, Rafael Nascimento, Marco Corbetta, Chetan Kulkarni, and Felipe Viana. Accelerated battery life testing dataset. NASA Prognostics Data Repository, 2023

  12. [12]

    Note on the derivatives with respect to a parameter of the solutions of a system of differential equations

    Thomas H. Gronwall. Note on the derivatives with respect to a parameter of the solutions of a system of differential equations. Annals of Mathematics, 20(4):292–296, 1919

  13. [13]

    GANs trained by a two time-scale update rule converge to a local Nash equilibrium

    Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems, volume 30, 2017

  14. [14]

    Overcoming catastrophic forgetting in neural networks

    James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, and Raia Hadsell. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(1...

  15. [15]

    Fourier Neural Operator for Parametric Partial Differential Equations

    Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020

  16. [16]

    Neural Operator: Graph Kernel Network for Partial Differential Equations

    Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Neural operator: Graph kernel network for partial differential equations. arXiv preprint arXiv:2003.03485, 2020

  17. [17]

    Physics-informed neural operator for learning partial differential equations

    Zongyi Li, Hongkai Zheng, Nikola Kovachki, David Jin, Haoxuan Chen, Burigede Liu, Kamyar Azizzadenesheli, and Anima Anandkumar. Physics-informed neural operator for learning partial differential equations. ACM/IMS Journal of Data Science, 1(3):1–27, 2024

  18. [18]

    Flow Matching for Generative Modeling

    Yaron Lipman, Ricky T.Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. arXiv preprint arXiv:2210.02747, 2022

  19. [19]

    Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators

    Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3:218–229, 2021

  20. [20]

    About the constants in Talagrand's concentration inequalities for empirical processes

    Pascal Massart. About the constants in Talagrand's concentration inequalities for empirical processes. The Annals of Probability, 28(2):863–884, 2000

  21. [21]

    Michael McCloskey and Neal J. Cohen. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, volume 24, pages 109–165. Academic Press, 1989

  22. [22]

    Foundations of Machine Learning

    Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. MIT Press, 2nd edition, 2018

  23. [23]

    Causality: Models, Reasoning, and Inference

    Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2000

  24. [24]

    Causal inference by using invariant prediction: identification and confidence intervals

    Jonas Peters, Peter Bühlmann, and Nicolai Meinshausen. Causal inference by using invariant prediction: identification and confidence intervals. Journal of the Royal Statistical Society: Series B, 78(5):947–1012, 2016

  25. [25]

    Mémoire sur la théorie des équations différentielles

    Émile Picard. Mémoire sur la théorie des équations différentielles. Journal de Mathématiques Pures et Appliquées, 6:145–210, 1890

  26. [26]

    Machine learning pipeline for battery state-of-health estimation

    Diego Roman, Saurabh Saxena, Valentin Robu, Michael Pecht, and David Flynn. Machine learning pipeline for battery state-of-health estimation. Nature Machine Intelligence, 3(5):447–456, 2021

  27. [27]

    Battery data set

    Bhaskar Saha and Kai Goebel. Battery data set. In NASA AMES Prognostics Data Repository, 2008

  28. [28]

    Edward H. Simpson. The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society, Series B, 13(2):238–241, 1951

  29. [29]

    Physics-integrated variational autoencoders for robust and interpretable generative modeling

    Naoya Takeishi and Alexandros Kalousis. Physics-integrated variational autoencoders for robust and interpretable generative modeling. In Advances in Neural Information Processing Systems (NeurIPS), 2021

  30. [30]

    BatteryLife: A comprehensive dataset and benchmark for battery life prediction

    Ruifeng Tan, Jiayuan Hong, Kai Wang, Jia Zhang, Jia Li, et al. BatteryLife: A comprehensive dataset and benchmark for battery life prediction. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2025. arXiv:2502.18807

  31. [31]

    Wavelet neural operator for solving parametric partial differential equations in computational mechanics problems

    Tapas Tripura and Souvik Chakraborty. Wavelet neural operator for solving parametric partial differential equations in computational mechanics problems. Computer Methods in Applied Mechanics and Engineering, 404:115783, 2023

  32. [32]

    Respecting causality for training physics-informed neural networks

    Sifan Wang, Shyam Sankaran, and Paris Perdikaris. Respecting causality for training physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering, 421:116813, 2022

  33. [33]

    Gege Wen, Zongyi Li, Kamyar Azizzadenesheli, Anima Anandkumar, and Sally M. Benson. U-FNO: An enhanced Fourier neural operator-based deep-learning model for multiphase flow. Advances in Water Resources, 163:104180, 2022

  34. [34]

    Physics-informed temporal alignment for auto-regressive PDE foundation models

    Chenxi Zhu, Xiao Xu, Jiawei Han, and Jintai Chen. Physics-informed temporal alignment for auto-regressive PDE foundation models. In Proceedings of the 42nd International Conference on Machine Learning (ICML), 2025