
arxiv: 2606.10642 · v2 · pith:HXILC6Y2new · submitted 2026-06-09 · 💻 cs.LG · physics.ao-ph

PhysMetrics.Weather: An Evaluation Framework for Physical Consistency in ML Weather Models

Pith reviewed 2026-06-27 13:51 UTC · model grok-4.3

classification 💻 cs.LG physics.ao-ph
keywords machine learning weather prediction · physical consistency · evaluation framework · conservation metrics · spectral metrics · dynamical metrics · MLWP models

The pith

PhysMetrics.Weather supplies conservation, spectral, and dynamical metrics to test physical consistency in ML weather models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Machine learning weather prediction models deliver strong numerical accuracy at low cost yet rely on data-driven training and pixel-level error scores that provide no guarantee of adherence to physical laws. The paper introduces PhysMetrics.Weather, an evaluation framework that applies three families of checks—conservation of quantities such as mass and energy, spectral distribution of variance across scales, and dynamical evolution rules—to quantify how physically realistic the model outputs are. This measurement allows developers to move beyond RMSE alone when judging whether forecasts remain consistent with known physics. If the framework works as intended, it can both steer the design of physics-informed model architectures and help decide which models are safe for operational forecasting.

Core claim

The paper claims that PhysMetrics.Weather quantifies the physical realism of MLWP models by scoring them on conservation metrics that verify preservation of physical quantities, spectral metrics that assess energy distribution across wavenumbers, and dynamical metrics that check consistency with expected time evolution, thereby providing a tool to guide physics-informed architecture development and to evaluate operational reliability beyond standard error metrics.
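To make the spectral family concrete, here is a minimal sketch of one way such a check could work. It is not taken from the PhysMetrics.Weather repository. It assumes a scalar field sampled along a single latitude circle, a naive 1-D discrete Fourier transform in place of a full kinetic-energy spectrum, and an "effective wavenumber" defined as the last wavenumber before the forecast retains less than a chosen fraction of the reference power for several wavenumbers in a row. That definition is one reading of the "energy retention thresholds" and "consecutive wavenumber requirement" named in the Figure 9 and Figure 10 captions. The function names, the 0.5 threshold, and the run length of 3 are all hypothetical.

```python
import cmath
import math


def zonal_power_spectrum(values):
    """Power per zonal wavenumber k = 0..n//2 via a naive O(n^2) DFT."""
    n = len(values)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * idx / n)
            for idx, v in enumerate(values)
        ) / n
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def effective_wavenumber(forecast_spec, reference_spec, retention=0.5, consecutive=3):
    """Last wavenumber before the forecast keeps less than `retention` of the
    reference power for `consecutive` wavenumbers in a row (None if never)."""
    run = 0
    for k in range(1, len(reference_spec)):
        if reference_spec[k] <= 0.0:
            run = 0  # no reference power here, so the ratio is undefined
            continue
        if forecast_spec[k] / reference_spec[k] < retention:
            run += 1
            if run == consecutive:
                return k - consecutive
        else:
            run = 0
    return None


def synthetic_signal(n, damp_above=None, damping=0.1):
    """Sum of cosines with amplitude k**-1.5 (hypothetical test field).
    Wavenumbers above `damp_above` are damped to mimic over-smoothing."""
    out = []
    for idx in range(n):
        lam = 2 * math.pi * idx / n
        total = 0.0
        for k in range(1, n // 2 + 1):
            amp = k ** -1.5
            if damp_above is not None and k > damp_above:
                amp *= damping
            total += amp * math.cos(k * lam)
        out.append(total)
    return out


reference = zonal_power_spectrum(synthetic_signal(64))
smoothed = zonal_power_spectrum(synthetic_signal(64, damp_above=8))
print(effective_wavenumber(smoothed, reference))
```

In this synthetic case the damped field loses power above wavenumber 8, so the sketch reports 8 as the effective wavenumber. A field identical to the reference returns None because no wavenumber falls below the threshold.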

What carries the argument

The PhysMetrics.Weather framework, which scores model forecasts using three metric families—conservation, spectral, and dynamical—to quantify physical realism.
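As an illustration of the dynamical family, the sketch below checks hydrostatic consistency between two pressure levels using the standard hypsometric equation, Δz = (R_d · T_v / g) · ln(p_lower / p_upper). It is an assumption about how such a check might look, not the framework's implementation. The function names are hypothetical, the layer-mean virtual temperature is approximated crudely by averaging the two level values, and the virtual temperature uses the common approximation T_v ≈ T(1 + 0.608 q). The optional humidity arguments mirror the dry-versus-virtual-temperature comparison mentioned in the Figure 15 caption.

```python
import math

R_DRY = 287.05   # gas constant of dry air, J kg^-1 K^-1
G0 = 9.80665     # standard gravity, m s^-2


def virtual_temperature(temp_k, q):
    """Approximate virtual temperature from temperature (K) and specific humidity (kg/kg)."""
    return temp_k * (1.0 + 0.608 * q)


def hypsometric_thickness(p_lower_pa, p_upper_pa, mean_tv_k):
    """Layer thickness (m) implied by hydrostatic balance for a mean virtual temperature."""
    return R_DRY * mean_tv_k / G0 * math.log(p_lower_pa / p_upper_pa)


def hydrostatic_residual(z_lower_m, z_upper_m, p_lower_pa, p_upper_pa,
                         t_lower_k, t_upper_k, q_lower=0.0, q_upper=0.0):
    """Forecast thickness minus hypsometric thickness (m); zero means balanced.
    The layer mean is the average of the two level values (a crude assumption)."""
    tv_mean = 0.5 * (virtual_temperature(t_lower_k, q_lower)
                     + virtual_temperature(t_upper_k, q_upper))
    expected = hypsometric_thickness(p_lower_pa, p_upper_pa, tv_mean)
    return (z_upper_m - z_lower_m) - expected


# Example: a 1000-500 hPa layer at 273 K whose thickness exactly matches the dry formula.
dz = hypsometric_thickness(100000.0, 50000.0, 273.0)
print(hydrostatic_residual(0.0, dz, 100000.0, 50000.0, 273.0, 273.0))
```

Heights here are geopotential heights in metres. A model that outputs geopotential in m² s⁻² would need its values divided by g first.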

If this is right

  • Models can be compared and selected using physical consistency scores in addition to accuracy metrics such as RMSE.
  • Architecture choices for new MLWP models can be guided by iterative feedback from the conservation, spectral, and dynamical scores.
  • Operational deployment of ML weather forecasts can incorporate pass/fail thresholds from the framework to reduce risk of unphysical outputs.
  • Standardized reporting of these metrics enables direct comparison of physical fidelity across different MLWP approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The metrics could be folded into training objectives so that physical consistency becomes an optimization target rather than only a post-hoc check.
  • The same three-family structure might transfer to other Earth-system or fluid-dynamics ML applications where realism beyond statistical fit matters.
  • Community extensions of the open framework could add further physical constraints, such as thermodynamic or radiative balance tests.

Load-bearing premise

That the three metric families together provide a sufficient and reliable test of whether an MLWP model is physically consistent enough for operational use.

What would settle it

An MLWP model that passes all three metric families at high levels yet produces forecasts that violate independent physical constraints in controlled real-world test cases would falsify the claim that the framework suffices for operational evaluation.

Figures

Figures reproduced from arXiv: 2606.10642 by Ana Lucic, Axel Lauer, Emma Kasteleyn, Pierre Gentine, Timo Maier, Veronika Eyring.

Figure 1. Overview of PhysMetrics.Weather. PhysMetrics.Weather evaluates models using nine metrics across three metric types: conservation of mass and energy (blue), spectral energy distribution (beige), and adherence to dynamical balance (green). Metric definitions are detailed in …
Figure 2. Global mass and energy conservation metrics calculated over a 240-hour prediction …
Figure 3. KE spectra at 500 hPa. Top: Spectral metrics summary. Bottom: Spectra at 12h, 120h, and 240h lead times. Native resolutions: 0.25° (111.5 km), except NeuralGCM (0.7°, 315.2 km). Over extended forecasts, data-driven MLWPs exhibit spatial smoothing or artificial noise, whereas hybrid architectures (e.g., NeuralGCM) preserve realistic energy distributions.
Figure 4. Adherence to geostrophic wind balance, hydrostatic equilibrium and realistic atmo…
Figure 5. Geometric derivation of the grid-cell area (A_{i,j}) on a spherical Earth. Because the physical width of a cell (R_E Δλ · cos ϕ) shrinks as it approaches the poles, a simple rectangular approximation overestimates polar areas. The true physical area is computed by integrating the differential element across the exact latitudinal bounds [ϕ_south, ϕ_north].
Figure 6. Global budgets and time series for the conservation metrics over a 240-hour forecast …
Figure 7. KE spectra at 500 hPa with an IFS HRES analysis as reference. Top: …
Figure 8. Adherence to geostrophic wind balance, hydrostatic equilibrium and realistic at…
Figure 9. Sensitivity of effective resolution to energy retention thresholds with ERA5 reference.
Figure 10. Sensitivity of effective resolution to the consecutive wavenumber requirement with …
Figure 11. KE spectra at 850 hPa with ERA5 reference. Top …
Figure 12. KE spectra at the 500 hPa Q spectrum with ERA5 reference. Top …
Figure 13. Coefficient of determination (R²) and Spearman's rank correlation (ρ) between standard predictive error (RMSE) and physical consistency metrics. The correlations are computed using the daily predictions of Pangu-Weather over the year 2020. Results are shown for 12-h, 120-h, and 240-h forecast lead times, comparing the physical metrics against the standard RMSE for Geopotential at 500 hPa (Z500) and Tempe…
Figure 14. Ablation study of global mass and energy conservation metrics using different …
Figure 15. Ablation study of hydrostatic balance using dry versus virtual temperature over a …
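
The geometry described in the Figure 5 caption can be sketched in a few lines. Integrating the area element R_E² cos ϕ dϕ dλ between the latitude bounds gives A = R_E² · Δλ · (sin ϕ_north − sin ϕ_south). The code below is an illustrative sketch, not the repository's implementation. The function names and the Earth radius value are assumptions, and the "rectangular" variant (width taken at the cell's central latitude) is only one possible form of the approximation the caption refers to.

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumed mean Earth radius in metres


def cell_area_exact(lat_south_deg, lat_north_deg, dlon_deg, radius=EARTH_RADIUS_M):
    """Exact area (m^2) of a latitude-longitude cell on a sphere."""
    phi_s = math.radians(lat_south_deg)
    phi_n = math.radians(lat_north_deg)
    dlam = math.radians(dlon_deg)
    return radius ** 2 * dlam * (math.sin(phi_n) - math.sin(phi_s))


def cell_area_rectangular(lat_south_deg, lat_north_deg, dlon_deg, radius=EARTH_RADIUS_M):
    """Rectangular approximation: width at the central latitude times height."""
    phi_c = math.radians(0.5 * (lat_south_deg + lat_north_deg))
    dphi = math.radians(lat_north_deg - lat_south_deg)
    dlam = math.radians(dlon_deg)
    return radius ** 2 * dlam * dphi * math.cos(phi_c)


def global_area(dlat_deg=1.0, dlon_deg=1.0):
    """Sum of exact cell areas over a regular global grid (m^2)."""
    n_lat = round(180 / dlat_deg)
    n_lon = round(360 / dlon_deg)
    total = 0.0
    for i in range(n_lat):
        south = -90.0 + i * dlat_deg
        total += n_lon * cell_area_exact(south, south + dlat_deg, dlon_deg)
    return total


# The exact cell areas should tile the whole sphere, 4 * pi * R^2.
print(global_area() / (4 * math.pi * EARTH_RADIUS_M ** 2))
```

Summing the exact areas over a global grid and comparing against 4πR² is a cheap sanity check before the weights are used in any global integral.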
Original abstract

Machine learning weather prediction (MLWP) models have achieved impressive forecasting performance at a small fraction of the computational costs required for traditional physics-based methods. However, they are primarily (1) data-driven and (2) evaluated using pixel-wide error metrics (e.g., RMSE), so there are no guarantees that their forecasts are consistent with known physical laws. We introduce PhysMetrics.Weather, an evaluation framework that assesses the physical realism of MLWP models across three types of metrics: conservation, spectral, and dynamical. By quantifying physical realism, this tool guides the development of physics-informed architectures and helps evaluate whether MLWP models are reliable for operational use. Our framework is available on GitHub at https://github.com/Emmakast/PhysMetrics.Weather.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper introduces PhysMetrics.Weather, an evaluation framework for machine learning weather prediction (MLWP) models. It defines three families of metrics—conservation, spectral, and dynamical—to quantify physical realism beyond standard pixel-wise error measures such as RMSE, with the goal of guiding physics-informed model development and assessing operational reliability. The associated code is released on GitHub.

Significance. A well-validated, open-source framework that systematically measures physical consistency in MLWP forecasts would be a useful contribution to the rapidly growing ML weather modeling literature. The absence of any reported implementation details, metric definitions, or validation experiments in the manuscript, however, prevents assessment of whether the proposed metrics actually deliver on that promise.

major comments (1)
  1. [Abstract] The central claim that the three metric families together 'provide a sufficient and reliable test' of physical consistency for operational use (abstract) is not supported by any derivation, pseudocode, or empirical validation within the manuscript. Without concrete definitions or tests against known physical violations, it is impossible to judge whether the framework is load-bearing for its stated purpose.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review and for identifying the need for stronger evidentiary support in the manuscript. We address the major comment below and commit to revisions that add the requested concrete elements.

Point-by-point responses
  1. Referee: [Abstract] The central claim that the three metric families together 'provide a sufficient and reliable test' of physical consistency for operational use (abstract) is not supported by any derivation, pseudocode, or empirical validation within the manuscript. Without concrete definitions or tests against known physical violations, it is impossible to judge whether the framework is load-bearing for its stated purpose.

    Authors: We agree that the abstract phrasing risks implying sufficiency without accompanying detail, and that the current manuscript text (which focuses on high-level motivation and metric families) does not include explicit derivations, pseudocode, or validation experiments. This is a valid observation. We will revise the abstract to remove any suggestion of a complete or sufficient test and will add a new section that (1) provides formal definitions and pseudocode for each metric family, (2) reports implementation details, and (3) includes empirical validation on controlled cases with known physical violations (e.g., injected mass non-conservation or spectral artifacts). The GitHub repository already contains these components; the revision will surface them directly in the paper.

    revision: yes
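
The kind of controlled test the rebuttal describes (injecting a known mass non-conservation) can be sketched as follows. This is a hedged illustration, not code from the repository. It relies on the standard hydrostatic relation that column air mass per unit area equals surface pressure divided by g, so the area-weighted mean surface pressure tracks total (moist) atmospheric mass. The synthetic pressure field, the leak rate, and every function name are hypothetical.

```python
import math

G0 = 9.80665  # standard gravity, m s^-2


def band_weights(n_lat):
    """Exact area fraction of each equal-angle latitude band (sums to 1)."""
    edges = [-math.pi / 2 + i * math.pi / n_lat for i in range(n_lat + 1)]
    return [(math.sin(edges[i + 1]) - math.sin(edges[i])) / 2 for i in range(n_lat)]


def global_mass_per_area(ps_field, weights):
    """Area-weighted mean column mass (kg m^-2) from surface pressure (Pa).
    ps_field is a list of latitude rows, each a list of values along longitude."""
    total = 0.0
    for row, w in zip(ps_field, weights):
        total += w * sum(row) / len(row)
    return total / G0


def relative_mass_drift(forecast, weights):
    """Relative change in global mass at each lead time versus the first step."""
    m0 = global_mass_per_area(forecast[0], weights)
    return [(global_mass_per_area(ps, weights) - m0) / m0 for ps in forecast]


def make_forecast(n_steps, n_lat=18, n_lon=36, leak_pa_per_step=0.0):
    """Synthetic surface-pressure forecast with an optional uniform mass leak."""
    forecast = []
    for step in range(n_steps):
        field = []
        for i in range(n_lat):
            lat = -math.pi / 2 + (i + 0.5) * math.pi / n_lat
            row = [
                101325.0
                + 500.0 * math.cos(lat) * math.sin(2 * math.pi * j / n_lon)
                - leak_pa_per_step * step
                for j in range(n_lon)
            ]
            field.append(row)
        forecast.append(field)
    return forecast


weights = band_weights(18)
clean = relative_mass_drift(make_forecast(11), weights)
leaky = relative_mass_drift(make_forecast(11, leak_pa_per_step=10.0), weights)
print(max(abs(d) for d in clean), leaky[-1])
```

A metric that cannot distinguish the clean run from the leaky one would fail this kind of validation. A realistic test would use actual model output and a tolerance justified by reanalysis variability.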

Circularity Check

0 steps flagged

No significant circularity in framework definition

Full rationale

The paper defines PhysMetrics.Weather as an evaluation framework consisting of conservation, spectral, and dynamical metrics to assess physical realism in ML weather prediction models. No derivation chain, fitted parameters, predictions, or load-bearing self-citations are present; the contribution is the introduction of the tool itself rather than any claim that reduces to its own inputs by construction. The work is self-contained as a definitional contribution with no equations or uniqueness theorems invoked that could create circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no free parameters, axioms, or invented entities are described or can be inferred.


