
arxiv: 2605.29849 · v1 · pith:MYUIJTG4new · submitted 2026-05-28 · 📡 eess.SY · cs.LG · cs.SY

BuilDyn: Excitation-Driven Data Generation for Building Thermal Dynamics Modeling and Control

Pith reviewed 2026-06-29 05:42 UTC · model grok-4.3

classification 📡 eess.SY · cs.LG · cs.SY
keywords building thermal dynamics · data generation · excitation strategies · machine learning · control modeling · simulation · generalization

The pith

BuilDyn generates excitation-rich simulation data, and ML models trained on it outperform models trained on stationary data for thermal dynamics modeling and control on the one building tested.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents BuilDyn as a package that adds customizable excitation strategies to building simulation environments for generating training data. Existing datasets mostly capture buildings under fixed control policies, which leaves the system state space poorly explored and limits how well machine learning models generalize to new conditions. By actively varying inputs to cover more operating states, the generated data produces models that perform better on downstream tasks like thermal prediction and energy-efficient control. This approach is demonstrated by direct comparison on one building, showing gains from excited versus stationary data.

Core claim

BuilDyn enables sampling from representative building distributions and applies excitation strategies during data generation, and the resulting datasets train machine learning models that outperform those trained on non-excited stationary data for thermal dynamics modeling and control on the tested building.

What carries the argument

BuilDyn package providing customizable excitation strategies for control-oriented data generation from building simulations.
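
To make the idea of an excitation strategy concrete, here is a minimal sketch. It is not BuilDyn's API. It assumes a hypothetical first-order (1R1C) room model with illustrative parameter values, a hypothetical fixed thermostat policy, and a hypothetical random-step excitation. It counts how many distinct (indoor temperature, heating power) grid cells each strategy visits.

```python
import random

# Hypothetical 1R1C room model (illustrative values, not from the paper).
R = 0.005       # thermal resistance in K/W
C = 2.0e7       # thermal capacitance in J/K
DT = 900.0      # time step in s (15 minutes)
P_MAX = 5000.0  # maximum heating power in W


def step(t_in, t_out, power):
    """Advance the indoor temperature by one explicit Euler step."""
    return t_in + DT / C * ((t_out - t_in) / R + power)


def thermostat(t_in, _k, _rng, setpoint=21.0):
    """Fixed policy: heat at full power below the setpoint, else off."""
    return P_MAX if t_in < setpoint else 0.0


def make_random_steps(hold=8):
    """Excitation: draw a new random power level every `hold` steps."""
    state = {"power": 0.0}

    def policy(_t_in, k, rng):
        if k % hold == 0:
            state["power"] = rng.uniform(0.0, P_MAX)
        return state["power"]

    return policy


def simulate(policy, n_steps=2000, t_out=5.0, seed=0):
    """Return the visited (indoor temperature, heating power) pairs."""
    rng = random.Random(seed)
    t_in, visited = 20.0, []
    for k in range(n_steps):
        power = policy(t_in, k, rng)
        visited.append((t_in, power))
        t_in = step(t_in, t_out, power)
    return visited


def coverage(visited, t_bin=0.5, p_bin=500.0):
    """Count distinct grid cells in the (temperature, power) plane."""
    return len({(int(t // t_bin), int(p // p_bin)) for t, p in visited})


fixed_cells = coverage(simulate(thermostat))
excited_cells = coverage(simulate(make_random_steps()))
print("cells visited, fixed policy:", fixed_cells)
print("cells visited, random steps:", excited_cells)
```

Under these assumptions the thermostat keeps the state close to its setpoint, while the random steps spread it over more cells. Whether wider coverage yields better models is what the paper's comparison is meant to test.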

If this is right

  • Data-driven models achieve better robustness to unseen operating conditions in building control.
  • Control-oriented modeling becomes more scalable across different buildings and weather patterns.
  • Transfer learning and building-specific foundation models become more feasible with richer training data.
  • Fault detection and energy optimization tasks benefit from models that capture a wider range of dynamics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The excitation approach could be adapted to other simulation domains where passive data collection leaves state spaces underexplored.
  • Real deployments might combine these strategies with actual building sensor streams to reduce reliance on purely simulated data.
  • Data collection standards in building research may shift toward deliberate excitation protocols instead of long-term stationary logging.

Load-bearing premise

That performance gains observed on one building will hold for other buildings and that the added excitations do not create unrealistic dynamics or simulation artifacts.

What would settle it

A comparison in which ML models trained on BuilDyn excited data from several buildings show no accuracy or control improvement over stationary-data models when tested on a new building under real operating conditions.

Figures

Figures reproduced from arXiv: 2605.29849 by Benjamin Schäfer, Benjamin Tischler, Fabian Raisch, Felix Koch, Thomas Krug.

Figure 1
Figure 1: Structure of the BuilDyn Python package.
[…] to dynamically manipulate control inputs and generate excited trajectories for control-oriented modeling. For an in-depth description of BuilDa and its configurable FMU, we refer to [16]; additional details on the relationship between BuilDa and BuilDyn are provided in the appendix.
2.1 Architecture. The BuilDyn package consists of two main components visualized in…
Figure 2
Figure 2: State-Exploration of BuilDyn with different excitation/control strategies
Figure 3
Figure 3: Top row: ARC of ML-Models with differently excited/controlled training data. Green background indicates correct…
Figure 4
Figure 4: Heating load and indoor temperature distribution
Figure 5
Figure 5: Sampling envelope parameter variations in…
Figure 6
Figure 6: One day of excitation strategies that were used to create training data visualized for multiple seasons.
Figure 7
Figure 7: Complementary Absolute Error and Action-Response Correctness Heatmaps for other excitation techniques. The "All…
Original abstract

Machine learning (ML) is increasingly used for data-driven modeling of buildings to enable downstream tasks such as fault detection and diagnosis, and energy-efficient control. While recent work improves generalization across building characteristics, weather, and occupancy, generalization also depends on sufficient exploration of the control-driven system state space. Existing real-world datasets and simulation environments predominantly reflect stationary operation under fixed control policies, resulting in limited excitation and reduced robustness to unseen operating conditions. This paper introduces BuilDyn, a package based on BuilDa that enables customizable excitation strategies for control-oriented data generation. BuilDyn further supports sampling from representative building distributions and provides a Python interface for easy integration into machine learning pipelines. We demonstrate the benefits of BuilDyn by comparing the performance of data-driven ML models trained on non-excited and excited data for one building. With BuilDyn, we hope to advance scalable control-oriented modeling and support future directions such as transfer learning and building-specific foundation models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces BuilDyn, a Python package extending BuilDa that enables customizable excitation strategies for generating control-oriented training data for building thermal dynamics. It supports sampling from representative building distributions and provides an interface for ML pipelines. The central claim is that ML models trained on data generated with these excitation strategies outperform models trained on non-excited stationary data, demonstrated via a comparison performed on a single building.

Significance. A well-validated tool for producing excited datasets could improve robustness of data-driven models for building control by better covering the relevant state space. The practical contribution of an open package with a Python interface is clear, but the single-building demonstration provides limited evidence that the reported performance lift is attributable to the excitation method rather than building-specific factors.

major comments (2)
  1. [Demonstration] The empirical demonstration is performed on only one building. Because the package is designed to sample from building distributions and the abstract emphasizes scalable modeling across characteristics, weather, and occupancy, a single-instance result does not establish that the performance improvement generalizes or is due to state-space coverage rather than simulation choices specific to that building.
  2. [Abstract] The abstract states that a comparison was performed but supplies no quantitative results, error metrics, or details on the excitation strategies used. Without these, it is impossible to judge whether the generated data actually supports the outperformance claim for modeling and control tasks.
minor comments (1)
  1. [Methods] Notation for the excitation strategies and the interface to BuilDa could be clarified with a short pseudocode or diagram in the methods section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address the two major comments point by point below, indicating planned revisions where appropriate.

Point-by-point responses
  1. Referee: [Demonstration] The empirical demonstration is performed on only one building. Because the package is designed to sample from building distributions and the abstract emphasizes scalable modeling across characteristics, weather, and occupancy, a single-instance result does not establish that the performance improvement generalizes or is due to state-space coverage rather than simulation choices specific to that building.

    Authors: We agree that a demonstration on a single building provides limited evidence for generalization across the sampled building distribution. The manuscript's primary contribution is the BuilDyn package and its customizable excitation strategies; the single-building comparison is presented as an illustrative case rather than a comprehensive validation of scalability. To address this, we will revise the experimental section to include results from at least three additional buildings drawn from the representative distribution, allowing a clearer assessment of whether the observed performance lift is attributable to the excitation method. revision: yes

  2. Referee: [Abstract] The abstract states that a comparison was performed but supplies no quantitative results, error metrics, or details on the excitation strategies used. Without these, it is impossible to judge whether the generated data actually supports the outperformance claim for modeling and control tasks.

    Authors: We acknowledge that the abstract omits specific quantitative results and excitation details, which are instead reported in the body of the manuscript. While abstracts are constrained by length, we will revise the abstract to incorporate the key error metrics (e.g., the relative improvement in prediction error for the excited versus non-excited datasets) and a brief characterization of the excitation strategies employed, thereby making the outperformance claim more self-contained. revision: yes

Circularity Check

0 steps flagged

No circularity: tool introduction with empirical demo, no derived claims

full rationale

The paper presents BuilDyn as a data-generation package extending BuilDa, with an empirical comparison of ML model performance on excited vs. non-excited data for a single building. No equations, fitted parameters, predictions, or uniqueness theorems are described that could reduce to inputs by construction. The contribution is a software tool and demonstration rather than a closed derivation chain, so none of the enumerated circularity patterns apply. The single-building scope is a limitation on generalization but does not constitute circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is a software tool contribution; it rests on the domain assumption that building thermal dynamics are amenable to data-driven modeling and that simulation environments can be excited in a representative way.

axioms (1)
  • [domain assumption] Building thermal dynamics can be adequately captured by data-driven models when sufficient state-space excitation is present.
    Stated in the motivation for needing excited data rather than stationary operation.
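
Why excitation matters for this assumption can be illustrated with a small identification example. This is a hedged sketch, not the paper's method: it assumes a hypothetical noise-free 1R1C model, T[k+1] = T[k] + a·(T_out − T[k]) + b·P[k], with illustrative parameters, and fits (a, b) by ordinary least squares. The stationary case is an idealized extreme (constant input at steady state); real fixed control policies usually provide some, but limited, excitation.

```python
import random

# Hypothetical 1R1C parameters (illustrative, not taken from the paper).
R, C, DT = 0.005, 2.0e7, 900.0          # K/W, J/K, s
A_TRUE, B_TRUE = DT / (R * C), DT / C   # discrete-time coefficients


def simulate(powers, t_in=20.0, t_out=5.0):
    """Simulate the model and return regression samples (x1, x2, y)."""
    rows = []
    for p in powers:
        t_next = t_in + A_TRUE * (t_out - t_in) + B_TRUE * p
        rows.append((t_out - t_in, p, t_next - t_in))
        t_in = t_next
    return rows


def fit(rows, tol=1e-9):
    """Least-squares estimate of (a, b); None if the data is not exciting."""
    s11 = sum(x1 * x1 for x1, _, _ in rows)
    s12 = sum(x1 * x2 for x1, x2, _ in rows)
    s22 = sum(x2 * x2 for _, x2, _ in rows)
    r1 = sum(x1 * y for x1, _, y in rows)
    r2 = sum(x2 * y for _, x2, y in rows)
    det = s11 * s22 - s12 * s12
    if det <= tol * s11 * s22:  # regressors are (nearly) collinear
        return None
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    return a, b


# Idealized stationary data: constant input that holds the steady state.
stationary = simulate([3000.0] * 2000)

# Excited data: a new random power level every 8 steps (2 hours).
rng = random.Random(0)
excited_powers = []
for k in range(2000):
    if k % 8 == 0:
        level = rng.uniform(0.0, 5000.0)
    excited_powers.append(level)
excited = simulate(excited_powers)

print("stationary fit:", fit(stationary))
print("excited fit:   ", fit(excited), "true:", (A_TRUE, B_TRUE))
```

In the stationary case both regressors are constant, so the normal equations are singular and the two parameters cannot be separated. With varied inputs the same estimator recovers them in this noise-free setting.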

pith-pipeline@v0.9.1-grok · 5713 in / 1234 out tokens · 24962 ms · 2026-06-29T05:42:11.063790+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Real-world and simulated thermal data from 960 residential multi-zone buildings in Central Europe

    cs.DB 2026-06 unverdicted novelty 6.0

    The ThermBuild dataset supplies real and simulated 15-minute thermal data from 960 residential buildings for data-driven modeling of heating systems and indoor climate.

Reference graph

Works this paper leans on

34 extracted references · 13 canonical work pages · cited by 1 Pith paper · 3 internal anchors

  1. [1]

    Peder Bacher and Henrik Madsen. 2010. Experiments and Data for Building Energy Performance Analysis: Financed by The Danish Electricity Saving Trust. Technical University of Denmark, DTU Informatics, Building 321

  2. [2]

    Peder Bacher and Henrik Madsen. 2011. Identifying suitable models for the heat dynamics of buildings. Energy and Buildings 43, 7 (July 2011), 1511–1522. doi:10.1016/j.enbuild.2011.02.005

  3. [3]

    Anaïs Berkes, Yoshua Bengio, David Rolnick, and Donna Vakalis. 2025. A HOT Dataset: 150,000 Buildings for HVAC Operations Transfer Research. In Proceedings of the 12th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation. 171–180

  4. [4]

    David Blum, Javier Arroyo, Sen Huang, Ján Drgoňa, Filip Jorissen, Harald Taxt Walnum, Yan Chen, Kyle Benne, Draguna Vrabie, Michael Wetter, et al. 2021. Building optimization testing framework (BOPTEST) for simulation-based benchmarking of control strategies in buildings. Journal of Building Performance Simulation 14, 5 (2021), 586–610

  5. [5]

    Wonjun Choi and Sangwon Lee. 2023. Performance evaluation of deep learning architectures for load and temperature forecasting under dataset size constraints and seasonality. Energy and Buildings 288 (2023), 113027

  6. [6]

    Drury B Crawley, Linda K Lawrie, Frederick C Winkelmann, Walter F Buhl, Y Joe Huang, Curtis O Pedersen, Richard K Strand, Richard J Liesen, Daniel E Fisher, Michael J Witte, et al. 2001. EnergyPlus: creating a new-generation building energy simulation program. Energy and Buildings 33, 4 (2001), 319–331

  7. [7]

    Jan Marco Ruiz de Vargas, Fabian Raisch, Zoltan Nagy, Pierre Pinson, and Christoph Goebel. 2026. Counter-Dyna: Data-Efficient RL-Based HVAC Control using Counterfactual Building Models. arXiv preprint arXiv:2605.04555 (2026)

  8. [8]

    Hongwen Dou and Kun Zhang. 2025. Transfer learning for cross-building forecasting of building energy and indoor air temperature in model predictive control applications. Journal of Building Engineering 111 (2025), 113341

  9. [9]

    Hassan Harb, Neven Boyanov, Luis Hernandez, Rita Streblow, and Dirk Müller. 2016. Development and validation of grey-box models for forecasting the thermal response of occupied buildings. Energy and Buildings 117 (2016), 199–207. doi:10.1016/j.enbuild.2016.02.021

  11. [11]

    IEA. 2023. Tracking Clean Energy Progress 2023. Technical Report. International Energy Agency. https://www.iea.org/reports/tracking-clean-energy-progress-2023

  12. [12]

    Javier Jiménez-Raboso, Alejandro Campoy-Nieves, Antonio Manjavacas-Lucas, Juan Gómez-Romero, and Miguel Molina-Solana. 2021. Sinergym: a building simulation and control framework for training reinforcement learning agents. In Proceedings of the 8th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (Coimbra, ...

  13. [13]

    Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  14. [14]

    S. A. Klein, W. A. Beckman, J. W. Mitchell, J. A. Duffie, T. L. Freeman, J. C. Mitchell, J. E. Braun, B. L. Evans, J. P. Kummer, R. E. Urban, A. Fiksel, J. W. Thornton, N. J. Blair, J. A. Beckman, and S. J. Klein. 2017. TRNSYS 18: A Transient System Simulation Program. Solar Energy Laboratory, University of Wisconsin, Madison, USA. http://sel.me.wisc.edu/trnsys

  15. [15]

    Michael Dahl Knudsen, Laurent Georges, Kristian Stenerud Skeie, and Steffen Petersen. 2021. Experimental test of a black-box economic model predictive control for residential space heating. Applied Energy 298 (2021), 117227. doi:10.1016/j.apenergy.2021.117227

  16. [16]

    Felix Koch, Fabian Raisch, and Benjamin Tischler. 2026. Thermal-GEMs: Generalized Models for Building Thermal Dynamics. In The 13th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation. doi:10.1145/3744256.3812565

  17. [17]

    Thomas Krug, Fabian Raisch, Dominik Aimer, Markus Wirnsberger, Ferdinand Sigg, Felix Koch, Benjamin Schäfer, and Benjamin Tischler. 2025. A Highly Configurable Framework for Large-Scale Thermal Building Data Generation to drive Machine Learning Research. arXiv preprint arXiv:2512.00483 (2025)

  18. [18]

    Thomas Krug, Fabian Raisch, Dominik Aimer, Markus Wirnsberger, Ferdinand Sigg, Benjamin Schäfer, and Benjamin Tischler. 2025. Builda: A thermal building data generation framework for transfer learning. In 2025 Annual Modeling and Simulation Conference (ANNSIM). IEEE, 1–13

  19. [19]

    Tobias Loga, Nikolaus Diefenbach, and Rolf Born. 2011. Deutsche Gebäudetypologie. Wohnen und Umwelt, Darmstadt, Germany

  20. [20]

    Na Luo and Tianzhen Hong. 2022. Ecobee donate your data 1,000 homes in 2017. Technical Report. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

  21. [21]

    Henrik Madsen and JM Schultz. 1993. Short time determination of the heat dynamics of buildings. (1993)

  22. [22]

    Ozan Baris Mulayim, Pengrui Quan, Liying Han, Xiaomin Ouyang, Dezhi Hong, Mario Bergés, and Mani Srivastava. 2024. Are Time Series Foundation Models Ready to Revolutionize Predictive Building Analytics? In Proceedings of the 11th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation. 169–173

  23. [23]

    Giuseppe Pinto, Riccardo Messina, Han Li, Tianzhen Hong, Marco Savino Piscitelli, and Alfonso Capozzoli. 2022. Sharing is caring: An extensive analysis of parameter-based transfer learning for the prediction of building thermal dynamics. Energy and Buildings 276 (2022), 112530

  24. [24]

    Martin Pullinger, Jonathan Kilgour, Nigel Goddard, Niklas Berliner, Lynda Webb, Myroslava Dzikovska, Heather Lovell, Janek Mann, Charles Sutton, Janette Webb, et al. 2021. The IDEAL household energy dataset, electricity, gas, contextual sensor data and survey data for 255 UK homes. Scientific Data 8, 1 (2021), 146

  25. [25]

    Fabian Raisch, Timo Germann, J. Nathan Kutz, Christoph Goebel, and Benjamin Tischler. 2026. Transfer Learning for Neural Parameter Estimation applied to Building RC Models. arXiv:2604.05904 [eess.SY] https://arxiv.org/abs/2604.05904

  26. [26]

    Fabian Raisch, Thomas Krug, Christoph Goebel, and Benjamin Tischler. 2025. GenTL: A General Transfer Learning Model for Building Thermal Dynamics. In Proceedings of the 16th ACM International Conference on Future and Sustainable Energy Systems (E-Energy ’25). Association for Computing Machinery, New York, NY, USA, 322–333. doi:10.1145/3679240.3734589

  27. [27]

    Fabian Raisch, Max Langtry, Felix Koch, Ruchi Choudhary, Christoph Goebel, and Benjamin Tischler. 2026. Adapting to change: A comparison of continual and transfer learning for modeling building thermal dynamics under concept drifts. Energy and Buildings 354 (2026), 116868. doi:10.1016/j.enbuild.2025.116868

  28. [28]

    Igor Sartori, Harald Taxt Walnum, Kristian S. Skeie, Laurent Georges, Michael D. Knudsen, Peder Bacher, José Candanedo, Anna-Maria Sigounis, Anand Krishnan Prakash, Marco Pritoni, Jessica Granderson, Shiyu Yang, and Man Pun Wan. 2023. Sub-hourly measurement datasets from 6 real buildings: Energy use and indoor climate. Data in Brief (2023). doi:10.1016/j.di...

  29. [29]

    Rawisha Serasinghe, Nicholas Long, and Jordan D. Clark. 2024. Parameter identification methods for low-order gray box building energy models: A critical review. Energy and Buildings 311 (2024), 114123. doi:10.1016/j.enbuild.2024.114123

  30. [30]

    Dassault Systèmes. 2023. FMPy. https://github.com/CATIA-Systems/FMPy. Accessed: 2025-11-26

  31. [31]

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems 30 (2017)

  32. [32]

    Michael Wetter, Wangda Zuo, Thierry S. Nouidui, and Xiufeng Pang. 2014. Modelica Buildings library. Journal of Building Performance Simulation 7, 4 (2014), 253–270. doi:10.1080/19401493.2013.765506

  33. [33]

    Shiyu Yang, Man Pun Wan, Wanyu Chen, Bing Feng Ng, and Swapnil Dubey. 2021. Experiment study of machine-learning-based approximate model predictive control for energy-efficient building control. Applied Energy 288 (2021), 116648.

ACM Sustainability Week Companion ’26, June 22–25, 2026, Banff, AB, Canada. Koch et al.

A Extended BuilDyn Description

BuilDyn is currently designed as a wrapper around BuilDa [16], also employing the Buil...