Toward a foundational thermal model for residential buildings
Pith reviewed 2026-05-09 14:08 UTC · model grok-4.3
The pith
A physics-informed transformer achieves accurate building temperature predictions and transfers to new buildings and climates with data from just two structures.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a physics-informed transformer architecture that embeds domain knowledge, e.g., derivative enrichment and Euler-based numerical integration, into a decoder-only framework. We incorporate static building features extracted from simulation models and employ Rotary Position Embedding attention to capture temporal dependencies. Evaluated on the CityLearn dataset spanning 247 residential buildings across three climate zones, our model achieves one-step prediction accuracy (RMSE of 0.30°C in Texas, 0.29°C in Vermont) while outperforming both traditional baselines and fine-tuned Time-Series Foundation Models. We also demonstrate zero-shot transferability: models trained on as few as two buildings generalize to unseen buildings and climate zones without fine-tuning.
What carries the argument
Physics-informed decoder-only transformer that adds derivative enrichment, Euler numerical integration, static building features, and Rotary Position Embeddings
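The summary does not say how derivative enrichment or the Euler step is implemented, so the following is only a minimal sketch of one plausible reading. It assumes the enrichment is a backward finite difference appended to each input feature and that the network outputs a temperature rate that is then integrated with one explicit Euler step. The function names, the array layout, and the sample values are all hypothetical.

```python
import numpy as np

def enrich_with_derivatives(series, dt=1.0):
    """Append a backward finite-difference column for every feature.

    series has shape (T, F); the result has shape (T, 2F).
    The first row's derivative is zero because no earlier sample exists.
    """
    series = np.asarray(series, dtype=float)
    deriv = np.zeros_like(series)
    deriv[1:] = (series[1:] - series[:-1]) / dt
    return np.concatenate([series, deriv], axis=1)

def euler_step(temp_now, predicted_rate, dt=1.0):
    """Explicit Euler update: T[t+1] = T[t] + dt * dT/dt."""
    return temp_now + dt * predicted_rate

# Hypothetical hourly samples: columns are indoor and outdoor temperature (°C).
obs = np.array([[21.0, 30.0], [21.5, 31.0], [21.75, 31.0]])
enriched = enrich_with_derivatives(obs)

# Stand-in for the network output: reuse the last observed indoor rate.
next_temp = euler_step(temp_now=obs[-1, 0], predicted_rate=enriched[-1, 2])
```

Predicting a rate and integrating it, rather than predicting the next temperature directly, keeps the output anchored to the last observed state, which may be part of why such a design transfers across buildings.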
If this is right
- A single model can serve many buildings without per-building recalibration.
- Training data from only two buildings is enough to reach new climate zones.
- Embedding physics improves results over both standard baselines and fine-tuned foundation models.
- This design supplies architectural principles that could support broader foundational building models.
Where Pith is reading between the lines
- If the simulation-to-reality gap proves small, pretrained models could be dropped into new constructions with almost no local data collection.
- The same embedding strategy might later support multi-step forecasts or direct optimization of heating and cooling setpoints.
- Real sensor streams could be used to fine-tune the simulation-trained weights, tightening the loop between model and physical building.
Load-bearing premise
The CityLearn simulated residential buildings capture the thermal behavior of real buildings and the added physics rules reflect universal principles rather than simulation-specific artifacts.
What would settle it
Apply the model to measured indoor temperature time series from actual occupied homes in Texas and Vermont and test whether one-step RMSE remains near 0.30°C.
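The check proposed above reduces to computing a one-step RMSE between measured and predicted indoor temperatures. A minimal sketch follows; the function name is hypothetical and the readings are made-up values chosen so that every error is 0.3°C, not data from the paper.

```python
import math

def one_step_rmse(measured, predicted):
    """Root-mean-square error between paired one-step-ahead values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))

# Made-up indoor temperatures (°C); each prediction is off by exactly 0.3.
measured = [21.0, 21.4, 21.9, 22.1]
predicted = [21.3, 21.1, 22.2, 21.8]
rmse = one_step_rmse(measured, predicted)
```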
Original abstract
The building energy community lacks a foundational thermal model, i.e., a single pretrained model capable of generalizing across diverse buildings, climates, and control strategies without building-specific calibration. Achieving this vision requires architectural principles that capture universal thermal dynamics rather than memorizing building-specific patterns. We take a step toward this goal by presenting a physics-informed transformer architecture that embeds domain knowledge, e.g., derivative enrichment and Euler-based numerical integration, into a decoder-only framework. We incorporate static building features extracted from simulation models and employ Rotary Position Embedding attention to capture temporal dependencies. Evaluated on the CityLearn dataset spanning 247 residential buildings across three climate zones, our model achieves one-step prediction accuracy (RMSE of 0.30°C in Texas, 0.29°C in Vermont) while outperforming both traditional baselines and fine-tuned Time-Series Foundation Models. We also demonstrate zero-shot transferability: models trained on as few as two buildings generalize to unseen buildings and climate zones without fine-tuning. Despite the limitation of simulated residential buildings, our results establish physics-informed architectural principles as a promising foundation for universal building thermal models.
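The abstract names Rotary Position Embedding attention without giving its configuration. As a rough illustration, the sketch below applies the standard RoFormer-style rotation to query and key vectors; the base of 10000, the interleaved pairing of dimensions, and the function name are assumptions rather than details from the paper.

```python
import numpy as np

def apply_rope(x, positions, base=10000.0):
    """Rotate consecutive feature pairs of x by position-dependent angles.

    x has shape (T, d) with d even; positions has shape (T,).
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[-1]
    if d % 2 != 0:
        raise ValueError("feature dimension must be even")
    inv_freq = base ** (-np.arange(0, d, 2) / d)            # shape (d/2,)
    angles = np.asarray(positions, dtype=float)[:, None] * inv_freq[None, :]
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

# The attention score depends only on the offset between the two positions.
rng = np.random.default_rng(0)
q = rng.normal(size=(1, 8))
k = rng.normal(size=(1, 8))
score_a = (apply_rope(q, [3]) @ apply_rope(k, [1]).T).item()
score_b = (apply_rope(q, [7]) @ apply_rope(k, [5]).T).item()
```

Because the score depends on relative rather than absolute time steps, the same learned attention pattern can apply at any point in a temperature history, which is the property that makes this embedding a natural fit for time series.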
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to take a step toward a foundational thermal model for residential buildings by proposing a physics-informed decoder-only transformer architecture. This model incorporates derivative enrichment, Euler-based numerical integration, static building features from simulations, and Rotary Position Embeddings. Evaluated on the CityLearn dataset with 247 residential buildings across three climate zones, it reports one-step prediction RMSE values of 0.30°C in Texas and 0.29°C in Vermont, outperforming traditional baselines and fine-tuned Time-Series Foundation Models. Additionally, it demonstrates zero-shot transferability, where models trained on as few as two buildings generalize to unseen buildings and climate zones without fine-tuning. The work acknowledges the use of simulated data as a limitation.
Significance. If the results hold, this would represent a meaningful contribution to the development of generalizable models in building energy management. The embedding of physics knowledge into the transformer architecture and the demonstration of strong zero-shot performance with minimal training data are strengths that could advance the field toward universal thermal models. The specific quantitative results provide a clear benchmark for future work. However, the significance is constrained by the exclusive reliance on a single simulation environment, which may not fully capture real-world complexities such as sensor noise or unmodeled dynamics.
major comments (2)
- [Abstract and Methods] The abstract provides specific RMSE numbers and zero-shot results, but the manuscript lacks details on the training procedure, baseline implementations, data splits, and statistical significance. This is critical as these elements are load-bearing for validating the central claims of outperforming baselines and achieving generalization.
- [Discussion or Limitations] The paper notes the limitation of simulated buildings but does not quantify the sim-to-real transfer gap or test robustness to different simulation assumptions. Since the physics components are aligned with the CityLearn engine, this risks the observed generalization being simulator-specific rather than universal, which is central to the foundational model claim.
minor comments (1)
- [Abstract] Consider specifying the exact number of buildings and climate zones used in the zero-shot experiments for clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The comments highlight important areas for strengthening reproducibility and the discussion of limitations. We address each point below and will revise the manuscript accordingly.
- Referee: [Abstract and Methods] The abstract provides specific RMSE numbers and zero-shot results, but the manuscript lacks details on the training procedure, baseline implementations, data splits, and statistical significance. This is critical as these elements are load-bearing for validating the central claims of outperforming baselines and achieving generalization.
  Authors: We agree that the manuscript would benefit from expanded methodological details to support reproducibility and the central claims. In the revised version, we will add a dedicated Experimental Setup subsection that specifies the full training procedure (including optimizer, learning rate schedule, batch size, number of epochs, and regularization), complete baseline implementations (with exact configurations for ARIMA, LSTM, and the fine-tuned Time-Series Foundation Models, including any hyperparameter search), explicit data split protocols (detailing the selection of the minimal training sets of two buildings and the zero-shot evaluation across the remaining buildings and climate zones), and statistical significance testing (such as standard deviations over multiple random seeds and paired statistical tests on RMSE differences). These additions will directly address the load-bearing elements of the claims.
  revision: yes
- Referee: [Discussion or Limitations] The paper notes the limitation of simulated buildings but does not quantify the sim-to-real transfer gap or test robustness to different simulation assumptions. Since the physics components are aligned with the CityLearn engine, this risks the observed generalization being simulator-specific rather than universal, which is central to the foundational model claim.
  Authors: This is a valid concern that directly impacts the strength of the foundational model claim. Because the work relies exclusively on CityLearn simulations, we cannot quantify the sim-to-real gap or perform cross-simulator robustness tests without new data sources. In the revision, we will expand the Limitations and Discussion sections to explicitly discuss the alignment of our physics-informed components with general thermal dynamics (rather than CityLearn-specific assumptions), acknowledge the risk of simulator-specific generalization, and outline concrete directions for future validation on real-world building data or alternative simulators. The zero-shot transfer results across climate zones within the current environment remain encouraging evidence of broader applicability, but we agree that additional robustness analysis is required.
  revision: partial
Circularity Check
No significant circularity; empirical results on held-out simulator data
The paper trains a decoder-only transformer with embedded physics components (derivative enrichment, Euler integration, static features from the simulator, RoPE) on CityLearn data and reports one-step RMSE plus zero-shot transfer on held-out buildings and climate zones. These are measured prediction errors on independent test splits, not quantities that reduce to the training inputs or fitted parameters by construction. The architecture is an explicit design choice rather than a self-definition that forces the reported accuracy. No load-bearing self-citations, uniqueness theorems, or renamings of known results appear in the abstract or claims. The sim-to-real limitation is acknowledged but does not create definitional circularity within the reported experiments.
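The non-circularity argument rests on test buildings being disjoint from training buildings. The page does not describe the actual split protocol, so the sketch below only illustrates the idea of a building-level holdout with two training buildings; the identifiers, the seed, and the random selection rule are hypothetical.

```python
import random

def split_by_building(building_ids, n_train=2, seed=0):
    """Pick n_train buildings for training and hold out all the others."""
    unique_ids = sorted(set(building_ids))
    if n_train >= len(unique_ids):
        raise ValueError("need at least one held-out building")
    rng = random.Random(seed)
    train = set(rng.sample(unique_ids, n_train))
    test = [b for b in unique_ids if b not in train]
    return sorted(train), test

# Hypothetical identifiers for the 247 simulated buildings.
ids = [f"bldg_{i:03d}" for i in range(247)]
train_ids, test_ids = split_by_building(ids)
```

Splitting by building rather than by time step is what makes the reported transfer zero-shot: no sample from a test building is seen during training.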
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Thermal dynamics of residential buildings can be usefully approximated by embedding derivative terms and Euler numerical integration inside a neural network.
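One way to see why this assumption is plausible is the textbook first-order RC model of a single thermal zone, which is not taken from the paper. Here C is the zone's thermal capacitance, R the envelope resistance, Q the net heating or cooling power, and Δt the sampling interval; all symbols are illustrative assumptions.

```latex
C \frac{dT_{\mathrm{in}}}{dt} = \frac{T_{\mathrm{out}} - T_{\mathrm{in}}}{R} + Q
\quad\Longrightarrow\quad
T_{\mathrm{in}}[t+1] \approx T_{\mathrm{in}}[t]
  + \frac{\Delta t}{C}\left(\frac{T_{\mathrm{out}}[t] - T_{\mathrm{in}}[t]}{R} + Q[t]\right)
```

Under this reading, a network that predicts the bracketed rate term and applies the same Euler update inherits the structure of the physical model while learning the building-specific coefficients from data.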
Reference graph
Works this paper leans on
[1] Tianqi Chen. 2016. XGBoost: A Scalable Tree Boosting System. Cornell University (2016).
[2] Drury B Crawley, Linda K Lawrie, Frederick C Winkelmann, Walter F Buhl, Y Joe Huang, Curtis O Pedersen, Richard K Strand, Richard J Liesen, Daniel E Fisher, Michael J Witte, et al. 2001. EnergyPlus: creating a new-generation building energy simulation program. Energy and Buildings 33, 4 (2001), 319–331.
[3] Abhimanyu Das, Weihao Kong, Rajat Sen, and Yichen Zhou. 2024. A decoder-only foundation model for time-series forecasting. In Forty-first International Conference on Machine Learning.
Toward a foundational thermal model for residential buildings. BUILDSYS’26, June 22–25, 2026, Banff, Alberta, Canada.
[Figure 3: Monthly comparison between different regions an...]
[4] Davide Deltetto. 2020. Data-driven coordinated building cluster energy management to enhance energy efficiency, comfort and grid stability. Ph.D. Dissertation. Politecnico di Torino.
[5] Jihoon Jang, Jinmog Han, and Seung-Bok Leigh. 2022. Prediction of heating energy consumption with operation pattern variables for non-residential buildings using LSTM networks. Energy and Buildings 255 (2022), 111647.
[6] Byung-Ki Jeon and Eui-Jong Kim. 2021. LSTM-based model predictive control for optimal temperature set-point planning. Sustainability 13, 2 (2021), 894.
[7] Diederik P Kingma. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[8] Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. 2023. Neural operator: Learning maps between function spaces with applications to PDEs. Journal of Machine Learning Research 24, 89 (2023), 1–97.
[9] Bryan Lim, Sercan Ö Arık, Nicolas Loeff, and Tomas Pfister. 2021. Temporal fusion transformers for interpretable multi-horizon time series forecasting. International Journal of Forecasting 37, 4 (2021), 1748–1764.
[10] Huiming Lu, Jiazheng Wu, Yingjun Ruan, Fanyue Qian, Hua Meng, Yuan Gao, and Tingting Xu. 2023. A multi-source transfer learning model based on LSTM and domain adaptation for building energy prediction. International Journal of Electrical Power & Energy Systems 149 (2023), 109024.
[11] Morteza Mardani, Noah Brenowitz, Yair Cohen, Jaideep Pathak, Chieh-Yu Chen, Cheng-Chin Liu, Arash Vahdat, Mohammad Amin Nabian, Tao Ge, Akshay Subramaniam, et al. 2025. Residual corrective diffusion modeling for km-scale atmospheric downscaling. Communications Earth & Environment 6, 1 (2025), 124.
BUILDSYS’26, June 22–25, 2026, Banff, Alberta, Canada. Ting-Y...
[12] Ozan Baris Mulayim, Pengrui Quan, Liying Han, Xiaomin Ouyang, Dezhi Hong, Mario Bergés, and Mani Srivastava. 2024. Are Time Series Foundation Models Ready to Revolutionize Predictive Building Analytics? In Proceedings of the 11th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation. 169–173.
[13] Kingsley Nweye, Kathryn Kaspar, Giacomo Buscemi, Tiago Fonseca, Giuseppe Pinto, Dipanjan Ghose, Satvik Duddukuru, Pavani Pratapa, Han Li, Javad Mohammadi, et al. 2025. CityLearn v2: energy-flexible, resilient, occupant-centric, and carbon-aware management of grid-interactive communities. Journal of Building Performance Simulation 18, 1 (2025), 17–38.
[14] Young Jin Park, Francois Germain, Jing Liu, Ye Wang, Toshiaki Koike-Akino, Gordon Wichern, Navid Azizan, Christopher R Laughman, and Ankush Chakrabarty
[15]
[16] Elaina Present, Philip R White, Chioke Harris, Rajendra Adhikari, Yingli Lou, Lixi Liu, Anthony Fontanini, Christopher Moreno, Joseph Robertson, and Jeff Maguire. 2024. ResStock Dataset 2024.1 Documentation. Technical Report. National Renewable Energy Laboratory (NREL), Golden, CO (United States).
[17] Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, Tom R Andersson, Andrew El-Kadi, Dominic Masters, Timo Ewalds, Jacklynn Stott, Shakir Mohamed, Peter Battaglia, et al. 2025. Probabilistic weather forecasting with machine learning. Nature 637, 8044 (2025), 84–90.
[18] Jianlin Su, Murtadha Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. 2024. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing 568 (2024), 127063.
[19] Lingfeng Tang, Haipeng Xie, Xiaoyang Wang, and Zhaohong Bie. 2023. Privacy-preserving knowledge sharing for few-shot building energy prediction: A federated learning approach. Applied Energy 337 (2023), 120860.
[20] United Nations Environment Programme and Global Alliance for Buildings and Construction. 2025-03. Not just another brick in the wall: The solutions exist - Scaling them will build on progress and cut emissions fast. Global Status Report for Buildings and Construction 2024/2025. https://wedocs.unep.org/20.500.11822/47214
[21] Florian Wiesner, Matthias Wessling, and Stephen Baek. 2025. Towards a physics foundation model. arXiv preprint arXiv:2509.13805 (2025).
[Figure 5: Cross region comparison between the multi-region trained model and single region model.]