pith. machine review for the scientific record.

arxiv: 2605.01362 · v1 · submitted 2026-05-02 · 📡 eess.SY · cs.SY

Recognition: unknown

Coordination Architecture Shapes Continuous Demand Response Outcomes in Building Districts

Ava Mohammadi, Rick Kramer, Zoltan Nagy

Pith reviewed 2026-05-09 18:25 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords demand response · building districts · model predictive control · reinforcement learning · hybrid control · load tracking · thermal comfort · spatial variability

The pith

Hybrid control architecture balances load tracking, comfort, and equity better than centralized or decentralized alternatives in building districts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares four coordination architectures for grid-integrated building districts that must track aggregated load profiles while preserving occupant comfort and equitable control distribution. It tests centralized model predictive control, decentralized reinforcement learning, multi-agent reinforcement learning, and a hybrid controller that assigns district-level battery decisions to MPC while leaving building-level HVAC to SAC, against a rule-based baseline. In a simulated 25-building residential district the hybrid method records the strongest combined results: 4.8 percent normalized mean bias error for tracking, 16.8 percent thermal comfort exceedance, and the lowest spatial variability of control actions. Architecture choice therefore sets the feasible trade-off surface rather than any single method dominating all metrics. Readers care because building clusters are increasingly required to deliver continuous demand response; knowing which coordination structure minimizes unwanted side effects helps designers avoid hidden comfort or fairness costs.

Core claim

Architecture choice determines the trade-off structure between tracking and comfort. Centralized MPC achieves low tracking bias of 8.8 percent NMBE yet concentrates actuation on a subset of buildings, producing 24.8 percent comfort exceedance and high spatial imbalance. Decentralized RL distributes control effort evenly but cannot sustain accurate tracking. The hybrid MPC-SAC controller, which separates district-level battery optimization from building-level HVAC regulation, achieves accurate tracking at 4.8 percent NMBE, moderate comfort impact at 16.8 percent exceedance, and the lowest spatial variability.
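
The tracking and comfort figures above depend on how the two metrics are defined, and this page does not reproduce the paper's formulas. The following is a minimal sketch under assumed definitions: NMBE as the mean signed tracking error divided by the mean reference load, and comfort exceedance as the share of timesteps outside a temperature band. The function names, the sign convention, and the sample numbers are illustrative assumptions, not taken from the paper.

```python
from statistics import fmean


def nmbe_percent(reference, measured):
    """Signed normalized mean bias error, in percent of the mean reference load.

    Assumed definition: mean(measured - reference) / mean(reference) * 100.
    """
    bias = fmean(m - r for m, r in zip(measured, reference))
    return 100.0 * bias / fmean(reference)


def comfort_exceedance_percent(temps_c, lower_c, upper_c):
    """Percent of timesteps with indoor temperature outside [lower_c, upper_c]."""
    outside = [t < lower_c or t > upper_c for t in temps_c]
    return 100.0 * sum(outside) / len(outside)


# Illustrative numbers only, not data from the paper.
reference_kw = [10.0, 12.0, 14.0, 12.0]
measured_kw = [11.0, 12.0, 15.0, 12.0]
indoor_c = [20.5, 21.0, 23.4, 19.2]

print(round(nmbe_percent(reference_kw, measured_kw), 2))  # 4.17
print(comfort_exceedance_percent(indoor_c, 20.0, 23.0))   # 50.0
```

Because the bias is signed, over- and under-tracking can cancel out; a study that reports only positive percentages may be using an absolute value or a different normalization.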

What carries the argument

The hybrid MPC-SAC controller that separates district-level battery optimization from building-level HVAC regulation.
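
The separation of concerns can be sketched as a single control step in which each building chooses its own HVAC action and a district-level routine then dispatches the shared battery against the remaining tracking gap. This is a structural illustration only: the greedy one-step dispatch stands in for the paper's MPC, the proportional heating rule stands in for a trained SAC actor, and every name, limit, and number below is a hypothetical assumption.

```python
from dataclasses import dataclass


@dataclass
class Battery:
    capacity_kwh: float
    max_power_kw: float
    soc_kwh: float


def district_battery_step(battery, reference_kw, forecast_load_kw, dt_h=1.0):
    """Stand-in for the district-level MPC: greedy one-step dispatch.

    Positive power charges the battery (adds load), negative discharges.
    Losses are ignored in this sketch.
    """
    desired_kw = reference_kw - forecast_load_kw
    # Respect the inverter power limit.
    power_kw = max(-battery.max_power_kw, min(battery.max_power_kw, desired_kw))
    # Respect the energy headroom in both directions.
    max_charge_kw = (battery.capacity_kwh - battery.soc_kwh) / dt_h
    max_discharge_kw = battery.soc_kwh / dt_h
    power_kw = max(-max_discharge_kw, min(max_charge_kw, power_kw))
    battery.soc_kwh += power_kw * dt_h
    return power_kw


def local_hvac_policy(indoor_c, setpoint_c, gain=0.5):
    """Placeholder for a trained SAC actor: heating action clipped to [0, 1]."""
    return max(0.0, min(1.0, gain * (setpoint_c - indoor_c)))


def hybrid_step(battery, reference_kw, indoor_temps_c, setpoint_c, hvac_rated_kw):
    """One step: local HVAC decisions first, then district battery dispatch."""
    actions = [local_hvac_policy(t, setpoint_c) for t in indoor_temps_c]
    hvac_load_kw = sum(a * hvac_rated_kw for a in actions)
    battery_kw = district_battery_step(battery, reference_kw, hvac_load_kw)
    return actions, battery_kw, hvac_load_kw + battery_kw


battery = Battery(capacity_kwh=10.0, max_power_kw=5.0, soc_kwh=5.0)
actions, battery_kw, net_kw = hybrid_step(battery, 3.0, [19.0, 20.0, 21.0], 21.0, 3.0)
print(actions, battery_kw, net_kw)  # [1.0, 0.5, 0.0] -1.5 3.0
```

In this toy step the battery absorbs the whole tracking gap, so no building has to deviate from its local policy. That is one plausible reading of why the hybrid design shows low spatial variability, though the paper's controllers are considerably richer than this sketch.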

If this is right

  • Centralized MPC tends to overload a few buildings, raising local comfort violations and spatial imbalance.
  • Pure decentralized RL spreads actions evenly but loses global tracking accuracy.
  • Hybrid separation enables district-scale decisions to remain precise while local dynamics are handled by learning.
  • The overall trade-off surface between tracking accuracy, comfort, and equity is architecture-dependent rather than fixed.
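
The contrast between concentrated and evenly spread actuation can be made concrete with a small calculation. The sketch below assumes spatial variability is the standard deviation of control actions across buildings within each hour; the paper's exact definition is not reproduced on this page, so treat both the metric and the sample actions as illustrative assumptions.

```python
from statistics import pstdev


def hourly_spatial_variability(actions_by_hour):
    """Population standard deviation across buildings, one value per hour.

    actions_by_hour: list of hours, each a list with one action per building.
    """
    return [pstdev(hour) for hour in actions_by_hour]


# Two hypothetical hours delivering the same total actuation (2.0):
concentrated = [[1.0, 1.0, 0.0, 0.0]]  # two buildings carry everything
even = [[0.5, 0.5, 0.5, 0.5]]          # effort shared equally

print(hourly_spatial_variability(concentrated))  # [0.5]
print(hourly_spatial_variability(even))          # [0.0]
```

Both hours provide the same aggregate response, which is why an aggregate tracking metric alone cannot reveal the imbalance.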

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Designers of larger urban districts may therefore prefer hybrid structures to scale coordination without requiring full central models or perfect local information.
  • The separation of concerns could be tested on districts that include commercial buildings to check whether the same balance holds when thermal dynamics differ.
  • If the pattern persists, grid operators could prioritize hybrid pilots when both tracking precision and occupant acceptance matter.

Load-bearing premise

The simulation models of building thermal dynamics, HVAC systems, occupant behavior, and battery operation accurately represent real-world conditions and the chosen metrics fully capture relevant performance trade-offs.

What would settle it

Deploying the hybrid controller on a physical 25-building district and observing normalized mean bias error above 10 percent or comfort exceedance above 25 percent would falsify the reported performance advantage.
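
The criterion above can be written as a simple check on field measurements. The thresholds come from the sentence above; the function name and the use of strict inequalities are assumptions made for this sketch.

```python
def falsifies_reported_advantage(nmbe_pct, exceedance_pct,
                                 nmbe_limit=10.0, exceedance_limit=25.0):
    """True if either measured value exceeds its stated threshold."""
    return nmbe_pct > nmbe_limit or exceedance_pct > exceedance_limit


# The simulated hybrid results sit below both thresholds.
print(falsifies_reported_advantage(4.8, 16.8))   # False
# A hypothetical field result with degraded tracking would not.
print(falsifies_reported_advantage(11.0, 16.8))  # True
```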

Figures

Figures reproduced from arXiv: 2605.01362 by Ava Mohammadi, Rick Kramer, Zoltan Nagy.

Figure 1: Coordination architectures in this study. The aggre…
Figure 2: District net electricity load during a representative 3-day test period (February). The dashed line shows the reference…
Figure 3: Probability density of building-level comfort ex…
Figure 4: Net electricity consumption (top) and indoor temperature (bottom) for four representative buildings during a 24-hour…
Figure 5: Trade-off between load tracking and thermal com…
Figure 6: Mean net electricity load per building under each…
Figure 8: Distribution of hourly spatial control variability,…
Figure 9: Building-level changes in net electricity consumption relative to RBC,…
Figure 10: Hourly spatial control variability measured as…

Original abstract

Grid-integrated building districts must provide energy flexibility while preserving occupant comfort and equitable distribution of control burden. We study how coordination architecture influences the ability of building clusters to track aggregated load profiles, comparing four paradigms: centralized model predictive control (MPC), decentralized independent reinforcement learning (SAC), centralized-training-decentralized-execution multi-agent RL (MAPPO), and a hybrid MPC-SAC controller that separates district-level battery optimization from building-level HVAC regulation. A rule-based controller serves as a baseline. We evaluate a 25-building residential district across three metrics: aggregate load tracking, thermal comfort, and spatial variability of control actions. We find that architecture choice determines the trade-off structure. Centralized MPC achieves low tracking bias (8.8% NMBE) but concentrates actuation on a subset of buildings, causing elevated comfort violations (24.8% exceedance) and spatial imbalance. Decentralized RL distributes control effort more evenly but fails to sustain accurate tracking. The hybrid architecture achieves the best balance: accurate tracking (4.8% NMBE), moderate comfort impact (16.8% exceedance), and the lowest spatial variability. These findings demonstrate that architecture choice determines the trade-off structure between tracking and comfort.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper conducts a simulation-based comparative study of coordination architectures for demand response in a 25-building residential district. It evaluates centralized MPC, decentralized SAC reinforcement learning, centralized-training MAPPO, a hybrid MPC-SAC controller (district-level battery optimization with building-level HVAC), and a rule-based baseline. Performance is assessed via aggregate load tracking (NMBE), thermal comfort (temperature exceedance percentage), and spatial variability of control actions. The central claim is that architecture choice determines the trade-off structure, with the hybrid achieving the strongest balance (4.8% NMBE, 16.8% comfort exceedance, lowest spatial variability), while centralized MPC shows good tracking but high comfort violations and imbalance, and decentralized RL distributes effort evenly but tracks poorly.

Significance. If the simulation results prove robust, the work would contribute to building energy management and demand response by demonstrating that coordination architecture is not neutral but actively shapes the achievable trade-offs between grid tracking, occupant comfort, and equitable actuation. The explicit multi-paradigm comparison with quantitative metrics provides actionable guidance for district-scale controllers. The hybrid design's separation of timescales is a concrete, implementable insight.

major comments (1)
  1. [Simulation Setup and Evaluation] The central claim that architecture choice determines the trade-off structure rests on performance numbers from the 25-building simulation (e.g., hybrid 4.8% NMBE and 16.8% exceedance). However, the manuscript provides no validation of the underlying thermal dynamics, HVAC, occupant behavior, or battery models against real data, nor sensitivity analysis to parameter uncertainty or unmodeled disturbances. This assumption is load-bearing: if model mismatch alters relative controller performance, the reported ordering and conclusion may not generalize. (Simulation Setup and Evaluation sections)
minor comments (2)
  1. [Abstract] 'NMBE' and 'spatial variability' are used without definition or formula; both should be defined on first use to improve accessibility.
  2. [Methods] The manuscript would benefit from a table summarizing all controller hyperparameters, building parameters, and metric definitions to support reproducibility.

Simulated Authors' Rebuttal

1 response · 1 unresolved

We thank the referee for their constructive comments and for recognizing the potential value of our comparative study on coordination architectures. We address the major comment point by point below, with proposed revisions where feasible.

point-by-point responses
  1. Referee: [Simulation Setup and Evaluation] The central claim that architecture choice determines the trade-off structure rests on performance numbers from the 25-building simulation (e.g., hybrid 4.8% NMBE and 16.8% exceedance). However, the manuscript provides no validation of the underlying thermal dynamics, HVAC, occupant behavior, or battery models against real data, nor sensitivity analysis to parameter uncertainty or unmodeled disturbances. This assumption is load-bearing: if model mismatch alters relative controller performance, the reported ordering and conclusion may not generalize. (Simulation Setup and Evaluation sections)

    Authors: We agree that the manuscript does not include direct validation of the models against real data from the 25-building district, as this is a simulation-based comparative study. The thermal dynamics use standard first-order RC equivalent circuit models with parameters drawn from typical residential building literature; HVAC and battery dynamics follow simplified linear and efficiency-based representations common in demand-response simulations; occupant behavior employs standard stochastic occupancy and setpoint schedules. All five controllers (centralized MPC, decentralized SAC, MAPPO, hybrid MPC-SAC, and rule-based) are evaluated under identical modeling assumptions, so the reported differences isolate architecture effects rather than model-specific artifacts. We acknowledge, however, that the absence of sensitivity analysis leaves open the possibility that parameter uncertainty could alter relative rankings. In the revised manuscript we will: (1) expand the Simulation Setup section with explicit discussion of model assumptions, their grounding in prior validated literature, and stated limitations; (2) add a sensitivity analysis subsection that perturbs key parameters (thermal capacitance, resistance, disturbance magnitudes, battery round-trip efficiency) across plausible ranges and reports the resulting changes in NMBE, comfort exceedance, and spatial variability. These additions will strengthen the claim that architecture shapes trade-offs while transparently qualifying the simulation scope. revision: partial
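
The kind of sensitivity analysis proposed in the response can be sketched as follows, assuming a forward-Euler discretization of a first-order RC model of the form C dT/dt = (T_out - T)/R + Q. The parameter values, units, perturbation scales, and the constant heating input are illustrative assumptions, not the paper's setup; a real study would rerun each controller under every perturbed model rather than apply a fixed input.

```python
def simulate_indoor_temp(r_k_per_kw, c_kwh_per_k, heat_kw, outdoor_c, t0_c, dt_h=1.0):
    """Forward-Euler rollout of a first-order RC model; returns all temperatures."""
    temps = [t0_c]
    for q_kw, t_out in zip(heat_kw, outdoor_c):
        t_in = temps[-1]
        d_temp = dt_h / c_kwh_per_k * ((t_out - t_in) / r_k_per_kw + q_kw)
        temps.append(t_in + d_temp)
    return temps


def sweep_rc(base_r, base_c, scales, heat_kw, outdoor_c, t0_c, lower_c):
    """Percent of steps below the lower comfort bound for each (R, C) scaling."""
    results = {}
    for s_r in scales:
        for s_c in scales:
            temps = simulate_indoor_temp(
                base_r * s_r, base_c * s_c, heat_kw, outdoor_c, t0_c
            )[1:]  # drop the initial condition
            results[(s_r, s_c)] = 100.0 * sum(t < lower_c for t in temps) / len(temps)
    return results


# Hypothetical six-hour winter scenario with constant heating input.
heat_kw = [8.0] * 6
outdoor_c = [0.0] * 6
table = sweep_rc(2.0, 2.0, [0.8, 1.0, 1.2], heat_kw, outdoor_c, t0_c=20.0, lower_c=19.0)
for (s_r, s_c), pct in sorted(table.items()):
    print(f"R x{s_r}, C x{s_c}: {pct:.1f}% of steps below the comfort bound")
```

Even in this toy case a 20 percent change in thermal resistance moves the exceedance from none of the steps to all of them, which illustrates why the referee treats model parameters as load-bearing for the reported ranking.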

standing simulated objections not resolved
  • Empirical validation of the specific thermal, HVAC, occupant, and battery models against measured data from the 25-building district (no such real-world dataset was collected or available for this simulation study).

Circularity Check

0 steps flagged

No circularity: results follow from explicit controller implementations and simulation runs

full rationale

This is an empirical comparative simulation study. The central claims rest on running four explicitly defined controllers (centralized MPC, decentralized SAC, MAPPO, hybrid MPC-SAC) plus a rule-based baseline on a fixed 25-building thermal/HVAC/occupant/battery model, then computing three post-hoc metrics (NMBE, comfort exceedance percentage, spatial variability). No equations derive one performance number from another by algebraic identity or by fitting a parameter to a subset and relabeling the output as a prediction. No uniqueness theorem or ansatz is imported via self-citation to force the architecture ranking. The reported ordering (hybrid best balance, MPC low bias but high spatial imbalance, etc.) is an observed outcome of the simulation, not a definitional tautology. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper is a simulation-based empirical comparison relying on standard building energy modeling assumptions rather than new derivations or entities.

axioms (1)
  • domain assumption: Standard differential equation models for building thermal dynamics and HVAC systems are sufficiently accurate for controller comparison.
    Invoked implicitly when using MPC and RL controllers in the simulation.

pith-pipeline@v0.9.0 · 5512 in / 1223 out tokens · 36355 ms · 2026-05-09T18:25:54.196455+00:00 · methodology
