pith. machine review for the scientific record.

arxiv: 2605.04074 · v1 · submitted 2026-04-14 · 💻 cs.LG · cs.AI · cs.CE · cs.DC · cs.ET · cs.OS

Recognition: unknown

A Physics-Aware Framework for Short-Term GPU Power Forecasting of AI Data Centers

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 15:22 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CE · cs.DC · cs.ET · cs.OS
keywords physics-informed forecasting · time-series prediction · GPU power forecasting · AI data centers · thermal RC network · Newton's law of cooling · DLinear model · short-term forecasting

The pith

A physics-informed DLinear model forecasts AI data center GPU power 5-80 minutes ahead while respecting thermal dynamics and outperforming prior methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PI-DLinear, the first DLinear-based time-series model that embeds physics from a multi-node lumped thermal resistance-capacitance network consistent with Newton's law of cooling. It derives time-dependent ODEs that explicitly link GPU power consumption to compute utilization, memory utilization, and temperature. On real AI data center traces, the model produces short-term forecasts that are more accurate than transformer and non-transformer baselines while remaining consistent with physical behavior during throttling and load changes. A reader would care because large, rapid swings in data-center power can destabilize the electricity grid, and forecasts that both reduce error and obey known thermal laws offer a practical way to anticipate those swings.

Core claim

We derive time-dependent ordinary differential equations from a multi-node lumped thermal resistance-capacitance network based on Newton's law of cooling; these ODEs interlink GPU power consumption, compute and memory utilization, and temperature. We incorporate the resulting physics constraints into a DLinear architecture to obtain PI-DLinear. When trained and evaluated on real AI data center measurements, PI-DLinear yields short-term forecasts (5-80 minutes) whose accuracy exceeds that of tested state-of-the-art models and whose profiles remain physically consistent under power throttling and load transients.
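The abstract does not reproduce the governing equations, but the Figure 7 caption (heat input P split by a factor α between GPU and memory nodes, thermal capacitances and resistances, Kirchhoff's Current Law applied at each node) suggests a coupled two-node form roughly like the following; the symbols C_g, C_m, R_g, R_m, R_gm, and α are our illustrative notation, not necessarily the paper's:

```latex
C_g \frac{dT_g}{dt} = \alpha P(t) - \frac{T_g - T_{\mathrm{amb}}}{R_g} - \frac{T_g - T_m}{R_{gm}},
\qquad
C_m \frac{dT_m}{dt} = (1-\alpha)\, P(t) - \frac{T_m - T_{\mathrm{amb}}}{R_m} + \frac{T_g - T_m}{R_{gm}}
```

Each node balances stored heat (C dT/dt) against injected power and Newton's-law losses through the resistances; the paper's derivation then links P(t) itself to compute and memory utilization.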

What carries the argument

PI-DLinear, formed by embedding newly derived time-dependent ODEs from a multi-node lumped thermal RC network into the DLinear time-series model so that power, utilization, and temperature predictions must satisfy the thermal dynamics.
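For readers unfamiliar with the backbone, DLinear is small enough to sketch. The version below (NumPy, single channel, moving-average trend plus seasonal remainder, each with its own linear head) follows the decomposition described in the Figure 2 caption; the kernel size and the untrained weights here are placeholders, not the paper's configuration:

```python
import numpy as np

def dlinear_forecast(x, W_t, W_s, kernel=25):
    """Minimal DLinear-style forecast for one channel.

    x        : (L,) look-back window
    W_t, W_s : (H, L) linear heads for the trend and seasonal parts
    kernel   : moving-average window for the trend decomposition
    """
    L = len(x)
    # Trend = centred moving average, edges padded by replication.
    pad = kernel // 2
    xp = np.concatenate([np.full(pad, x[0]), x, np.full(pad, x[-1])])
    trend = np.convolve(xp, np.ones(kernel) / kernel, mode="valid")[:L]
    seasonal = x - trend
    # Each component is projected to the horizon by its own linear layer,
    # then the two projections are summed.
    return W_t @ trend + W_s @ seasonal

# Untrained illustration: L=8 look-back, H=3 horizon.
rng = np.random.default_rng(0)
x = rng.normal(size=8)
W_t = np.ones((3, 8)) / 8    # every horizon step sees the mean trend
W_s = np.zeros((3, 8))
y = dlinear_forecast(x, W_t, W_s, kernel=3)
```

PI-DLinear's contribution is not this backbone but the physics constraints layered on top of it, which act on the power channel extracted from the multivariate forecast.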

If this is right

  • Forecasts remain physically consistent during power throttling and load transient events.
  • Averaged across look-back and prediction windows, accuracy improves by 0.782%-39.08% in MSE, 0.993%-51.82% in MAE, and 0.370%-22.28% in RMSE relative to tested SOTA models.
  • The approach supports reliable short-term power-demand predictions over horizons of 5 to 80 minutes on real AI data center data.
  • Better anticipation of power fluctuations helps mitigate risks to electricity-grid stability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same ODE-based physics injection could be tested inside other linear or attention-based forecasters to see whether accuracy and consistency gains transfer.
  • Grid operators could feed these forecasts into real-time balancing algorithms to reduce the impact of AI-facility peaks.
  • Applying the thermal RC network to different hardware generations or cooling setups would test whether the derived equations remain valid beyond the original dataset.
  • Extending the model to include additional variables such as cooling-fan speed or ambient conditions might capture finer transient behavior.

Load-bearing premise

The multi-node lumped thermal resistance-capacitance network based on Newton's law of cooling correctly captures how GPU power, compute and memory utilization, and temperature evolve together through the derived ODEs.

What would settle it

If measured GPU power, utilization, and temperature traces during load transients show that the model's forecasts violate the temperature-power relationships required by the derived ODEs, the claim that the physics is respected would be falsified.
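One concrete way to run that test, sketched under an assumed single-node simplification (the constants R, C, T_amb, and the step size below are placeholders, not values from the paper): finite-difference the forecast temperature trace and measure how far it departs from the Newton's-law energy balance.

```python
import numpy as np

def ode_residual(T, P, R, C, T_amb, dt):
    """Residual of C*dT/dt = P - (T - T_amb)/R along a forecast trace.

    T, P : (H,) forecast temperature and power sampled every dt seconds
    Returns (H-1,) residuals; near-zero means the trace obeys the ODE.
    """
    dT_dt = np.diff(T) / dt
    rhs = P[:-1] - (T[:-1] - T_amb) / R
    return C * dT_dt - rhs

# Synthetic trace built to follow the ODE exactly (forward-Euler rollout),
# so its residuals should vanish up to floating-point error.
R, C, T_amb, dt = 0.5, 2000.0, 25.0, 300.0   # illustrative units, 5-min steps
P = np.full(16, 400.0)                        # constant 400 W load
T = np.empty(16)
T[0] = 30.0
for k in range(15):
    T[k + 1] = T[k] + dt / C * (P[k] - (T[k] - T_amb) / R)
res = ode_residual(T, P, R, C, T_amb, dt)
```

A forecast whose residuals stay near zero through a throttling event supports the physical-consistency claim; persistently large residuals during load transients would falsify it.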

Figures

Figures reproduced from arXiv: 2605.04074 by Ali Ghrayeb, Haitham Abu-Rub, Mohammad AlShaikh Saleh, Sanjay Chawla, Sertac Bayhan.

Figure 1
Figure 1: Comparison of DLinear and PI-DLinear across power throttling, transient recovery, and post-event stability regimes. PI-DLinear consistently achieves lower prediction error, enabling accurate throttling characterization, faster recovery from sudden AI load changes, and stable forecasting behavior. view at source ↗
Figure 2
Figure 2: PI-DLinear Architecture. The base DLinear model (top) decomposes the input look-back window X ∈ ℝ^{L×C} into seasonal/remainder (X_s) and trend (X_t) components, which are independently projected to the forecast horizon via linear layers H_s and H_t, then summed to produce the full multivariate forecast Ŷ ∈ ℝ^{H×C}, from which the power channel ŷ ∈ ℝ^H is extracted. Our physics-informed extension introduces thr… view at source ↗
Figure 3
Figure 3: Relationships observed in the MIT Supercloud dataset: power vs. utilization (left), power vs. temperature… view at source ↗
Figure 4
Figure 4: Computational job distribution across the AI workloads present in the MIT Supercloud dataset, namely, vision… view at source ↗
Figure 5
Figure 5: PI-DLinear forecasting results shown as heatmaps for input sequence length… view at source ↗
Figure 6
Figure 6: Bubble chart showing model efficiency comparison under input-240-predict-80 for the top 4 models. PI… view at source ↗
Figure 7
Figure 7: Equivalent RC thermal circuit for coupled GPU-Memory system. Current sources represent heat input from electrical power dissipation (P split by factor α). Capacitors represent thermal mass (ability to store thermal energy). Resistors represent thermal resistance to heat flow. Applying Kirchhoff's Current Law at each node yields the governing ODEs. view at source ↗
Figure 8
Figure 8: Learned linear projection weights of PI-DLinear over the look-back window for each forecast step. The… view at source ↗
read the original abstract

AI data centers experience rapid fluctuations in power demand due to the heterogeneity of computational tasks that they have to support. For example, the power profile of inference and training of large language models (LLMs) is quite distinct and big divergences can result in the instability of the underlying electricity grid. In this paper we propose, to the best of our knowledge, the first physics-informed DLinear time-series model that can accurately forecast power utilization of an AI data center 5-80 minutes (short-term forecasting) into the future. The physics, based on a multi-node lumped thermal resistance-capacitance (RC) network consistent with Newton's law of cooling, is captured using newly derived time-dependent ordinary differential equations (ODE) that separately models and interlinks power consumption with the GPU compute and memory utilization and temperature. The resulting model, that we refer to as PI-DLinear, trained and evaluated on a real AI data center dataset and is not only more accurate than the state-of-the-art (SOTA) models tested, but the forecast profile respects the underlying physics under power throttling and load transient events. Relative to the SOTA transformer-based and non-transformer-based models, improvements in forecasting accuracy (averaged across all look-back and prediction windows) range from 0.782%-39.08% for MSE, 0.993%-51.82% for MAE, and 0.370%-22.28% for RMSE.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The paper claims to introduce the first physics-informed DLinear (PI-DLinear) model for short-term (5-80 min) forecasting of GPU power utilization in AI data centers. It derives time-dependent ODEs from a multi-node lumped RC thermal network based on Newton's law of cooling to interlink power consumption, GPU compute/memory utilization, and temperature. The model is trained and tested on real AI data center data, showing improved accuracy over SOTA transformer and non-transformer models with average gains of 0.782%-39.08% in MSE, 0.993%-51.82% in MAE, and 0.370%-22.28% in RMSE across look-back and prediction windows, while ensuring forecasts respect physical constraints during power throttling and load transient events.

Significance. If the physics-informed component successfully enforces consistency without sacrificing accuracy, this framework could have significant impact on power management and grid stability for AI data centers handling variable workloads like LLM training and inference. The combination of a simple, stable DLinear backbone with physics constraints via ODEs offers an efficient alternative to complex transformers, and the use of real-world data strengthens the practical relevance. Credit is due for the explicit derivation of the ODEs and the focus on physically plausible outputs.

minor comments (1)
  1. [Abstract] Grammatical issue in the sentence describing the model: 'The resulting model, that we refer to as PI-DLinear, trained and evaluated on a real AI data center dataset and is not only more accurate...' should be rephrased for clarity, e.g., 'The resulting model, referred to as PI-DLinear, is trained and evaluated on a real AI data center dataset and is not only more accurate...'.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work on the PI-DLinear model and for raising only a single minor comment. The recognition of the novelty in deriving time-dependent ODEs from the lumped RC thermal network, the efficiency of the DLinear backbone, and the practical value of real AI data center data is appreciated. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; derivation grounded in independent physics and data-driven training

full rationale

The paper derives time-dependent ODEs from Newton's law of cooling and a standard multi-node lumped RC network, which are external physical principles not defined in terms of the target forecasts. These ODEs interlink power, utilization, and temperature as an approximation whose parameters are fitted or taken from hardware specs. The PI-DLinear model embeds this physics-informed structure into the DLinear backbone and trains it on real data-center traces; reported accuracy gains are measured against external SOTA baselines using MSE/MAE/RMSE on held-out windows. No equation reduces the forecast output to a re-expression of the input data or a self-citation chain. The central claim therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the validity of the derived ODEs from a multi-node lumped RC thermal model. Because only the abstract is available, the exact number and values of any fitted parameters inside the DLinear component or RC network remain unknown.

free parameters (1)
  • RC network parameters
    Thermal resistance and capacitance values in the multi-node model are likely either derived from hardware specifications or fitted from data; the abstract does not specify.
axioms (2)
  • domain assumption Newton's law of cooling governs heat transfer in the GPU system.
    Basis for constructing the lumped RC network and deriving the time-dependent ODEs.
  • domain assumption A multi-node lumped-parameter approximation sufficiently captures GPU thermal dynamics for short-term forecasting.
    Invoked to interlink power, utilization, and temperature in the ODEs.

pith-pipeline@v0.9.0 · 5591 in / 1542 out tokens · 48727 ms · 2026-05-10T15:22:10.372606+00:00 · methodology


Reference graph

Works this paper leans on

30 extracted references · 25 canonical work pages
