AIFS-DOP: End-to-End Medium-Range Weather Prediction from Observations Alone with Machine Learning

Anthony McNally; Eulalie Boucher; Ewan Pinnington; Gert Mertes; Matthew Chantry; Mihai Alexe; Patricia de Rosnay; Patrick Laloyaux; Peter Lean; Simon Lang

arxiv: 2606.19093 · v1 · pith:ZA3EPDJLnew · submitted 2026-06-17 · ⚛️ physics.ao-ph

AIFS-DOP: End-to-End Medium-Range Weather Prediction from Observations Alone with Machine Learning

Ewan Pinnington , Peter Lean , Mihai Alexe , Eulalie Boucher , Simon Lang , Patrick Laloyaux , Gert Mertes , Tomas Kral

show 3 more authors

Patricia de Rosnay Matthew Chantry Anthony McNally

This is my paper

Pith reviewed 2026-06-26 18:51 UTC · model grok-4.3

classification ⚛️ physics.ao-ph

keywords weather forecastingmachine learningdirect observation predictionmedium-range forecastsgridded observationsECMWF IFSdata-driven modelingharmonized observations

0 comments

The pith

A machine learning model trained only on gridded observations matches IFS performance at medium ranges when verified against real data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AIFS-DOP, a system that learns medium-range weather forecasts directly from a 40-year record of harmonized observations and produces no dependence on numerical weather prediction reanalysis or model output during training. It reports that the resulting forecasts are competitive with ECMWF's IFS on several upper-air and surface headline scores over the independent 2021/2022 verification year. A reader would care because the result shows that end-to-end data-driven prediction can reach operational levels without the conventional pipeline of physics-based modeling and data assimilation. The central demonstration is therefore that observation-only training is now sufficient for competitive medium-range skill when scores are computed against withheld observations rather than against reanalysis fields.

Core claim

AIFS-DOP is trained on a 40-year harmonized dataset of gridded observations without using NWP reanalysis or model data. The resulting model is competitive with ECMWF's Integrated Forecasting System when scored on a one-year period of forecasts across 2021/2022. This progress on Direct Observation Prediction represents the first time that a data-driven model, trained solely on observations, is competitive with the IFS at medium ranges for several key upper-air and surface headline scores, when verified against observation data.

What carries the argument

AIFS-DOP, an end-to-end machine learning model trained exclusively on harmonized gridded observations to produce direct observation predictions.

If this is right

Medium-range forecasts can be generated without any input from numerical weather prediction reanalysis or model fields during either training or inference.
Verification can be performed directly against withheld observations rather than against reanalysis products.
The same observation-only training procedure yields competitive scores on both upper-air and surface variables at medium ranges.
Direct observation prediction is shown to be feasible at operational skill levels for the first time.
pith_inferences=[

Load-bearing premise

The 40-year harmonized gridded observation dataset must be of high enough quality, spatial coverage, and temporal consistency that a model trained on it can generalize to an independent future year without any leakage from or dependence on numerical weather prediction fields.

What would settle it

A clear underperformance relative to IFS on multiple headline scores when the same verification protocol is applied to an additional independent year after 2022 would falsify the competitiveness claim.

Figures

Figures reproduced from arXiv: 2606.19093 by Anthony McNally, Eulalie Boucher, Ewan Pinnington, Gert Mertes, Matthew Chantry, Mihai Alexe, Patricia de Rosnay, Patrick Laloyaux, Peter Lean, Simon Lang, Tomas Kral.

**Figure 1.** Figure 1: High-level model schematic: A single encoder is used for all observation types. The processor is as described [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗

**Figure 2.** Figure 2: Observation input (left), target (mid-left), prediction (mid-right) and error (right) for ATMS channel 16 (top) [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: AIFS-DOP predictions at different forecast lead times compared to observations. Forecasts initialised on June [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Upper-air anomaly correlation against radiosonde observations (top) and surface root mean square error [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 5.** Figure 5: Same as Figure 4, but statistics are computed over the Tropics. [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

**Figure 6.** Figure 6: Same as Figure 4, but statistics are computed over the Southern Hemisphere. [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗

**Figure 7.** Figure 7: A case study of Storm Eunice for AIFS-DOP. Top row shows AIFS-DOP at lead times of 36, 48, 60 and 72 [PITH_FULL_IMAGE:figures/full_fig_p007_7.png] view at source ↗

**Figure 9.** Figure 9: Upper-air anomaly correlation and surface RMSE scores computed against radiosonde and SYNOP observa [PITH_FULL_IMAGE:figures/full_fig_p011_9.png] view at source ↗

**Figure 10.** Figure 10: Upper-air anomaly correlation and surface RMSE scores computed against radiosonde and SYNOP [PITH_FULL_IMAGE:figures/full_fig_p011_10.png] view at source ↗

read the original abstract

We introduce the Artificial Intelligence Forecasting System for Direct Observation Prediction (AIFS-DOP). AIFS-DOP is trained on a 40-year harmonized dataset of gridded observations, without using numerical weather prediction (NWP) reanalysis or model data. The resulting model is competitive with ECMWF's Integrated Forecasting System (IFS) when scored on a one year period of forecasts across 2021/2022. This progress on Direct Observation Prediction represents the first time that a data-driven model, trained solely on observations, is competitive with the IFS at medium ranges for several key upper-air and surface headline scores, when verified against observation data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper claims the first observation-only ML model competitive with IFS at medium range, but the claim rests entirely on unverified independence of their 40-year harmonized observation dataset.

read the letter

The paper introduces AIFS-DOP, trained solely on a 40-year harmonized gridded observation dataset with no reanalysis or NWP fields, and reports it reaches competitive skill with the IFS at medium ranges when both are scored on observations in 2021/2022. This is framed as the first such result under that strict constraint.

What is new is the direct observation prediction approach. Prior data-driven models cited in the abstract used reanalysis or model output for training, so removing that dependence is a tighter test. The abstract is clear that verification stays on observations, which avoids some circularity.

The work does well in spelling out the potential downstream effects on operational costs and assimilation pipelines if the result holds. It engages the literature by positioning against earlier observation-driven attempts.

The soft spots are concentrated on the dataset. The stress-test concern lands: any harmonization step that used model-based interpolation, physical constraints, or statistical filling derived from NWP would break the "observations alone" premise and could produce apparent skill through leakage rather than learned dynamics. The abstract gives no architecture details, no headline scores, and no description of how the 40-year grid was built or checked for independence, so the competitiveness claim cannot be evaluated. The verification year independence also needs explicit confirmation.

This is for researchers working on ML weather models and operational centers exploring lower-cost training routes. A reader interested in alternative data pipelines would get value from the full methods and scores.

Send it to peer review. The claim is important enough to check the data provenance and numbers even if heavy revision is likely.

Referee Report

2 major / 1 minor

Summary. The paper introduces AIFS-DOP, a machine-learning model for medium-range weather forecasting trained end-to-end solely on a 40-year harmonized gridded observation dataset with no use of NWP reanalysis or model fields. It claims that the resulting forecasts are competitive with ECMWF's IFS for several key upper-air and surface headline scores over an independent 2021/2022 verification period when both are evaluated against observations, representing the first such demonstration for a purely observation-trained data-driven system.

Significance. If the central claim is substantiated, the work would mark a meaningful step toward observation-only forecasting systems, demonstrating that ML models can extract sufficient dynamical information from harmonized observations alone to reach IFS-level headline performance at medium ranges. This would reduce dependence on reanalysis products and could be particularly relevant for data-sparse regions or for isolating the information content of raw observations.

major comments (2)

[Data section / Methods] The competitiveness claim rests entirely on the premise that the 40-year harmonized gridded observation dataset contains no implicit NWP influence or leakage. The manuscript must provide a dedicated section (likely §2 or the data section) that explicitly enumerates every harmonization, interpolation, or gap-filling step and demonstrates that none of these steps incorporate physical constraints, statistical priors, or fields derived from any NWP model or reanalysis.
[Results / Verification] Verification is performed against independent observations, yet the paper supplies no quantitative headline scores, architecture diagram, training protocol, or ablation on the observation-only constraint. Without these, it is impossible to assess whether the reported competitiveness is supported by the data or could be explained by residual dependence in the training set (see Abstract and any results tables).

minor comments (1)

[Abstract] Clarify the precise definition of 'harmonized gridded observations' versus reanalysis in the abstract and introduction to avoid reader confusion about the 'observations alone' boundary condition.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback and for highlighting the importance of rigorously documenting the observation-only training process. We address each major comment below and will incorporate the suggested changes in the revised manuscript.

read point-by-point responses

Referee: [Data section / Methods] The competitiveness claim rests entirely on the premise that the 40-year harmonized gridded observation dataset contains no implicit NWP influence or leakage. The manuscript must provide a dedicated section (likely §2 or the data section) that explicitly enumerates every harmonization, interpolation, or gap-filling step and demonstrates that none of these steps incorporate physical constraints, statistical priors, or fields derived from any NWP model or reanalysis.

Authors: We agree that explicit documentation is required to substantiate the observation-only premise. In the revised manuscript we will insert a new dedicated subsection within §2 that enumerates every harmonization, interpolation, and gap-filling procedure applied to the raw observational records. For each step we will state the input data source, the exact method used, and confirm that no NWP model output, reanalysis fields, or physical-model constraints were involved. This addition will directly address the leakage concern. revision: yes
Referee: [Results / Verification] Verification is performed against independent observations, yet the paper supplies no quantitative headline scores, architecture diagram, training protocol, or ablation on the observation-only constraint. Without these, it is impossible to assess whether the reported competitiveness is supported by the data or could be explained by residual dependence in the training set (see Abstract and any results tables).

Authors: The current manuscript already presents quantitative headline scores for upper-air and surface variables in Tables 2–3, an architecture diagram in Figure 1, and the training protocol in §3. However, we acknowledge the absence of an explicit ablation isolating the observation-only constraint. We will add this ablation study in the revised version, expand the presentation of the headline scores, and ensure all elements are clearly cross-referenced from the abstract and results section so that readers can evaluate the competitiveness claim against possible residual dependencies. revision: partial

Circularity Check

0 steps flagged

No circularity; empirical ML training and verification on independent observations

full rationale

The paper's central claim is that an ML model trained solely on a 40-year harmonized gridded observation dataset (explicitly without NWP reanalysis or model data) produces forecasts competitive with IFS when both are scored against independent observation data in 2021/2022. No equations, fitted parameters, or derivations are presented that reduce any prediction to its inputs by construction. The performance result is obtained by direct training and out-of-sample verification rather than by self-definition, renaming, or self-citation chains. The dataset independence is asserted as a precondition but is not shown to collapse into the target result via any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no model equations, loss functions, or architectural choices, so no free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.1-grok · 5672 in / 1048 out tokens · 27736 ms · 2026-06-26T18:51:22.297390+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

26 extracted references · 19 canonical work pages

[1]

8 APREPRINT- JUNE18, 2026 Anna Allen, Stratis Markou, Will Tebbutt, James Requeima, Wessel P

URLhttps://arxiv.org/abs/2412.15687. 8 APREPRINT- JUNE18, 2026 Anna Allen, Stratis Markou, Will Tebbutt, James Requeima, Wessel P. Bruinsma, Tom R. Andersson, Michael Herzog, Nicholas D. Lane, Matthew Chantry, J. Scott Hosking, and Richard E. Turner. End-to-end data-driven weather prediction.Nature, 641(8065):1172–1179,

arXiv 2026
[2]

URL https: //www.nature.com/articles/s41586-025-08897-0

doi:10.1038/s41586-025-08897-0. URL https://doi.org/10. 1038/s41586-025-08897-0. Marcin Andrychowicz, Lasse Espeholt, Di Li, Samier Merchant, Alexander Merose, Fred Zyda, Shreya Agrawal, and Nal Kalchbrenner. Deep learning for day forecasts from sparse observations (MetNet-3).arXiv preprint arXiv:2306.06079,

work page doi:10.1038/s41586-025-08897-0
[3]

URL https://arxiv.org/abs/2306.06079

doi:10.48550/arXiv.2306.06079. URL https://arxiv.org/abs/2306.06079. v3, July

work page doi:10.48550/arxiv.2306.06079
[4]

Accurate medium-range global weather forecasting with 3D neural networks , volume =

doi:10.1038/s41586-023-06185-3. Eulalie Boucher, Mihai Alexe, Peter Lean, Ewan Pinnington, Simon Lang, Patrick Laloyaux, Lorenzo Zampieri, Patricia de Rosnay, Niels Bormann, and Anthony McNally. Learning coupled earth system dynamics with GraphDOP.arXiv preprint,

work page doi:10.1038/s41586-023-06185-3
[5]

EUMETSAT

URLhttps://arxiv.org/abs/2510.20416. EUMETSAT. SSM/T-2 Microwave Humidity Sounder Climate Data Record Release 1 - DMSP,

arXiv
[6]

EUMETSAT

URL https: //doi.org/10.15770/EUM_SEC_CLM_0046. EUMETSAT. HIRS Level 1C Fundamental Data Record Release 2 - Multimission - Global,

work page doi:10.15770/eum_sec_clm_0046
[7]

URL https: //doi.org/10.15770/EUM_SEC_CLM_0036. Hans Hersbach, Bill Bell, Paul Berrisford, Shoji Hirahara, András Horányi, Joaquín Muñoz-Sabater, Julien Nicolas, Carole Peubey, Raluca Radu, Dinand Schepers, Adrian Simmons, Cornel Soci, Saleh Abdalla, Xavier Abellan, Gianpaolo Balsamo, Peter Bechtold, Gionata Biavati, Jean Bidlot, Massimo Bonavita, Giovann...

work page doi:10.15770/eum_sec_clm_0036 1999
[8]

Keller Jordan, Yuchen Jin, Vlado Boza, You Jiacheng, Franz Cesista, Laker Newhouse, and Jeremy Bernstein

doi:10.1002/qj.3803. URLhttps://doi.org/10.1002/qj.3803. Ryan Keisler. Forecasting global weather with Graph Neural Networks.arXiv preprint arXiv:2202.07575,

work page doi:10.1002/qj.3803
[9]

Forecasting Global Weather with Graph Neural Networks , publisher =

doi:10.48550/arXiv.2202.07575. URLhttps://arxiv.org/abs/2202.07575. Kenneth R. Knapp, S. Ansari, C. L. Bain, M. A. Bourassa, M. J. Dickinson, C. Funk, C. N. Helms, C. C. Hennon, C. D. Holmes, G. J. Huffman, J. P. Kossin, H.-T. Lee, A. Loew, and G. Magnusdottir. Globally gridded satellite (GridSat) observations for climate studies.Bulletin of the American ...

work page doi:10.48550/arxiv.2202.07575
[10]

URLhttps://doi.org/10.1175/2011BAMS3039.1

doi:10.1175/2011BAMS3039.1. URLhttps://doi.org/10.1175/2011BAMS3039.1. Patrick Laloyaux, Mihai Alexe, Eulalie Boucher, Peter Lean, Ewan Pinnington, Simon Lang, Tobias Necker, and Anthony McNally. Using data assimilation tools to dissect GraphDOP,

work page doi:10.1175/2011bams3039.1
[11]

URL https://arxiv.org/abs/ 2510.27388. Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, Timo Ewalds, Zach Eaton-Rosen, Weihua Hu, Alexander Merose, Stephan Hoyer, George Holland, Oriol Vinyals, Jacklynn Stott, Alexander Pritzel, Shakir Mohamed, and Peter Battaglia. Learning skillful medium-...

arXiv
[12]

Learning skillful medium-range global weather forecasting , volume =

doi:10.1126/science.adi2336. URL https://www.science.org/doi/10.1126/science.adi2336. Simon Lang, Mihai Alexe, Matthew Chantry, Jesper Dramsch, Florian Pinault, Baudouin Raoult, Mariana C. A. Clare, Christian Lessig, Michael Maier-Gerber, Linus Magnusson, Zied Ben Bouallègue, Ana Prieto Nemesio, Peter D. Dueben, Andrew Brown, Florian Pappenberger, and Flo...

work page doi:10.1126/science.adi2336
[13]

Simon Lang, Mihai Alexe, Mariana CA Clare, Christopher Roberts, Rilwan Adewoyin, Zied Ben Bouallègue, Matthew Chantry, Jesper Dramsch, Peter D Dueben, Sara Hahner, et al

URLhttps://arxiv.org/abs/2406.01465. Simon Lang, Mihai Alexe, Mariana CA Clare, Christopher Roberts, Rilwan Adewoyin, Zied Ben Bouallègue, Matthew Chantry, Jesper Dramsch, Peter D Dueben, Sara Hahner, et al. AIFS-CRPS: ensemble forecasting using a model trained with a loss function based on the continuous ranked probability score.npj Artificial Intelligen...

arXiv
[14]

Peter Lean, Mihai Alexe, Eulalie Boucher, Ewan Pinnington, Simon Lang, Patrick Laloyaux, Niels Bormann, and Anthony McNally

doi:https://doi.org/10.1038/s44387-026-00073-7. Peter Lean, Mihai Alexe, Eulalie Boucher, Ewan Pinnington, Simon Lang, Patrick Laloyaux, Niels Bormann, and Anthony McNally. Learning from nature: insights into GraphDOP’s representations of the Earth System.arXiv preprint,

work page doi:10.1038/s44387-026-00073-7
[15]

URLhttps://arxiv.org/abs/2508.18018

doi:10.48550/arXiv.2508.18018. URLhttps://arxiv.org/abs/2508.18018. Anthony McNally, Christian Lessig, Peter Lean, Eulalie Boucher, Mihai Alexe, Ewan Pinnington, Matthew Chantry, Simon Lang, Chris Burrows, Marcin Chrust, Florian Pinault, Ethel Villeneuve, Niels Bormann, and Sean Healy. Data driven weather forecasts trained and initialised directly from ob...

work page doi:10.48550/arxiv.2508.18018
[16]

URLhttps://arxiv.org/abs/2407.15586

doi:10.48550/arXiv.2407.15586. URLhttps://arxiv.org/abs/2407.15586. 9 APREPRINT- JUNE18, 2026 G. Moldovan, E. Pinnington, A. Prieto Nemesio, S. Lang, Z. Ben Bouallègue, J. Dramsch, M. Alexe, M. Santa Cruz, S. Hahner, H. Cook, H. Theissen, M. Clare, C. O’Brien, J. Polster, L. Magnusson, G. Mertes, F. Pinault, B. Raoult, P. de Rosnay, R. Forbes, and M. Chan...

work page doi:10.48550/arxiv.2407.15586 2026
[17]

URL https://egusphere.copernicus

doi:10.5194/egusphere-2025-4716. URL https://egusphere.copernicus. org/preprints/2025/egusphere-2025-4716/. D. J. Newman. Zarr storage specification version 2: Cloud-optimized persistence using Zarr. Esds-rfc-048, NASA Earth Science Data and Information System Standards Coordination Office,

work page doi:10.5194/egusphere-2025-4716 2025
[18]

Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, Tom R

URLhttps://arxiv.org/abs/2508.18486. Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, Tom R. Andersson, Andrew El-Kadi, Dominic Masters, Timo Ewalds, Jacklynn Stott, Shakir Mohamed, Peter Battaglia, Remi Lam, and Matthew Willson. Probabilistic weather forecasting with machine learning.Nature, 637(8044):84–90, January

Pith/arXiv arXiv
[19]

and El-Kadi, Andrew and Masters, Dominic and Ewalds, Timo and Stott, Jacklynn and Mohamed, Shakir and Battaglia, Peter and Lam, Remi and Willson, Matthew , year =

doi:10.1038/s41586-024-08252-9. URL https://doi.org/10.1038/s41586-024-08252-9. Florence Rabier, Heikki Järvinen, E. Klinker, J.-F. Mahfouf, and A. Simmons. The ECMWF operational implementation of four-dimensional variational assimilation. Part I: experimental results with simplified physics.Quarterly Journal of the Royal Meteorological Society, 126(564):...

work page doi:10.1038/s41586-024-08252-9
[20]

Ambrogio V olonté, Suzanne L

doi:10.1002/qj.49712656415. Ambrogio V olonté, Suzanne L. Gray, Peter A. Clark, Oscar Martínez-Alvarado, and Duncan Ackerley. Strong surface winds in storm eunice. part 1: storm overview and indications of sting jet activity from observations and model data. Weather, 79(2):40–45,

work page doi:10.1002/qj.49712656415
[21]

doi:https://doi.org/10.1002/wea.4402. Y . Wang, X. Zhang, W. Ning, M. A. Lazzara, M. Ding, C. H. Reijmer, P. C. J. P. Smeets, P. Grigioni, P. Heil, E. R. Thomas, D. Mikolajczyk, L. J. Welhouse, L. M. Keller, Z. Zhai, Y . Sun, and S. Hou. The AntAWS dataset: a compilation of Antarctic automatic weather station observations.Earth System Science Data, 15(1):411–429,

work page doi:10.1002/wea.4402
[22]

URLhttps://essd.copernicus.org/articles/15/411/2023/

doi:10.5194/essd-15-411-2023. URLhttps://essd.copernicus.org/articles/15/411/2023/. N. P. Wedi. Increasing the horizontal resolution in numerical weather prediction and climate simulations: illusion or panacea?Philosophical Transactions of the Royal Society A, 372,

work page doi:10.5194/essd-15-411-2023 2023
[23]

Janni Yuval, Ian Langmore, Dmitrii Kochkov, and Stephan Hoyer

doi:10.1098/rsta.2013.0289. Janni Yuval, Ian Langmore, Dmitrii Kochkov, and Stephan Hoyer. Neural general circulation models optimized to predict satellite-based precipitation observations,

work page doi:10.1098/rsta.2013.0289 2013
[24]

Cheng-Zhi Zou, Wenhui Wang, and NOAA CDR Program

URLhttps://arxiv.org/abs/2412.11973. Cheng-Zhi Zou, Wenhui Wang, and NOAA CDR Program. NOAA Fundamental Climate Data Record (FCDR) of MSU Level 1c Brightness Temperature, Version 1.0,

arXiv
[25]

Accessed: 2026-01-30

URL https://doi.org/10.7289/V51Z429F. Accessed: 2026-01-30. Appendix A Specification of training datasets Table 1 lists the datasets that were used to train the model described in Section

work page doi:10.7289/v51z429f 2026
[26]

B Instrument acronyms Table 2 lists the full names of the satellite instruments that were used in the present study. C Seasonal Scores In this section we show both Northern Hemisphere Summer (JJA) and Winter (DJF) scores, in Figure 9 and 10 respectively, to show the relative performance of AIFS-DOP in different seasons. 10 APREPRINT- JUNE18, 2026 1 2 3 4 ...

2026

[1] [1]

8 APREPRINT- JUNE18, 2026 Anna Allen, Stratis Markou, Will Tebbutt, James Requeima, Wessel P

URLhttps://arxiv.org/abs/2412.15687. 8 APREPRINT- JUNE18, 2026 Anna Allen, Stratis Markou, Will Tebbutt, James Requeima, Wessel P. Bruinsma, Tom R. Andersson, Michael Herzog, Nicholas D. Lane, Matthew Chantry, J. Scott Hosking, and Richard E. Turner. End-to-end data-driven weather prediction.Nature, 641(8065):1172–1179,

arXiv 2026

[2] [2]

URL https: //www.nature.com/articles/s41586-025-08897-0

doi:10.1038/s41586-025-08897-0. URL https://doi.org/10. 1038/s41586-025-08897-0. Marcin Andrychowicz, Lasse Espeholt, Di Li, Samier Merchant, Alexander Merose, Fred Zyda, Shreya Agrawal, and Nal Kalchbrenner. Deep learning for day forecasts from sparse observations (MetNet-3).arXiv preprint arXiv:2306.06079,

work page doi:10.1038/s41586-025-08897-0

[3] [3]

URL https://arxiv.org/abs/2306.06079

doi:10.48550/arXiv.2306.06079. URL https://arxiv.org/abs/2306.06079. v3, July

work page doi:10.48550/arxiv.2306.06079

[4] [4]

Accurate medium-range global weather forecasting with 3D neural networks , volume =

doi:10.1038/s41586-023-06185-3. Eulalie Boucher, Mihai Alexe, Peter Lean, Ewan Pinnington, Simon Lang, Patrick Laloyaux, Lorenzo Zampieri, Patricia de Rosnay, Niels Bormann, and Anthony McNally. Learning coupled earth system dynamics with GraphDOP.arXiv preprint,

work page doi:10.1038/s41586-023-06185-3

[5] [5]

EUMETSAT

URLhttps://arxiv.org/abs/2510.20416. EUMETSAT. SSM/T-2 Microwave Humidity Sounder Climate Data Record Release 1 - DMSP,

arXiv

[6] [6]

EUMETSAT

URL https: //doi.org/10.15770/EUM_SEC_CLM_0046. EUMETSAT. HIRS Level 1C Fundamental Data Record Release 2 - Multimission - Global,

work page doi:10.15770/eum_sec_clm_0046

[7] [7]

URL https: //doi.org/10.15770/EUM_SEC_CLM_0036. Hans Hersbach, Bill Bell, Paul Berrisford, Shoji Hirahara, András Horányi, Joaquín Muñoz-Sabater, Julien Nicolas, Carole Peubey, Raluca Radu, Dinand Schepers, Adrian Simmons, Cornel Soci, Saleh Abdalla, Xavier Abellan, Gianpaolo Balsamo, Peter Bechtold, Gionata Biavati, Jean Bidlot, Massimo Bonavita, Giovann...

work page doi:10.15770/eum_sec_clm_0036 1999

[8] [8]

Keller Jordan, Yuchen Jin, Vlado Boza, You Jiacheng, Franz Cesista, Laker Newhouse, and Jeremy Bernstein

doi:10.1002/qj.3803. URLhttps://doi.org/10.1002/qj.3803. Ryan Keisler. Forecasting global weather with Graph Neural Networks.arXiv preprint arXiv:2202.07575,

work page doi:10.1002/qj.3803

[9] [9]

Forecasting Global Weather with Graph Neural Networks , publisher =

doi:10.48550/arXiv.2202.07575. URLhttps://arxiv.org/abs/2202.07575. Kenneth R. Knapp, S. Ansari, C. L. Bain, M. A. Bourassa, M. J. Dickinson, C. Funk, C. N. Helms, C. C. Hennon, C. D. Holmes, G. J. Huffman, J. P. Kossin, H.-T. Lee, A. Loew, and G. Magnusdottir. Globally gridded satellite (GridSat) observations for climate studies.Bulletin of the American ...

work page doi:10.48550/arxiv.2202.07575

[10] [10]

URLhttps://doi.org/10.1175/2011BAMS3039.1

doi:10.1175/2011BAMS3039.1. URLhttps://doi.org/10.1175/2011BAMS3039.1. Patrick Laloyaux, Mihai Alexe, Eulalie Boucher, Peter Lean, Ewan Pinnington, Simon Lang, Tobias Necker, and Anthony McNally. Using data assimilation tools to dissect GraphDOP,

work page doi:10.1175/2011bams3039.1

[11] [11]

URL https://arxiv.org/abs/ 2510.27388. Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, Timo Ewalds, Zach Eaton-Rosen, Weihua Hu, Alexander Merose, Stephan Hoyer, George Holland, Oriol Vinyals, Jacklynn Stott, Alexander Pritzel, Shakir Mohamed, and Peter Battaglia. Learning skillful medium-...

arXiv

[12] [12]

Learning skillful medium-range global weather forecasting , volume =

doi:10.1126/science.adi2336. URL https://www.science.org/doi/10.1126/science.adi2336. Simon Lang, Mihai Alexe, Matthew Chantry, Jesper Dramsch, Florian Pinault, Baudouin Raoult, Mariana C. A. Clare, Christian Lessig, Michael Maier-Gerber, Linus Magnusson, Zied Ben Bouallègue, Ana Prieto Nemesio, Peter D. Dueben, Andrew Brown, Florian Pappenberger, and Flo...

work page doi:10.1126/science.adi2336

[13] [13]

Simon Lang, Mihai Alexe, Mariana CA Clare, Christopher Roberts, Rilwan Adewoyin, Zied Ben Bouallègue, Matthew Chantry, Jesper Dramsch, Peter D Dueben, Sara Hahner, et al

URLhttps://arxiv.org/abs/2406.01465. Simon Lang, Mihai Alexe, Mariana CA Clare, Christopher Roberts, Rilwan Adewoyin, Zied Ben Bouallègue, Matthew Chantry, Jesper Dramsch, Peter D Dueben, Sara Hahner, et al. AIFS-CRPS: ensemble forecasting using a model trained with a loss function based on the continuous ranked probability score.npj Artificial Intelligen...

arXiv

[14] [14]

Peter Lean, Mihai Alexe, Eulalie Boucher, Ewan Pinnington, Simon Lang, Patrick Laloyaux, Niels Bormann, and Anthony McNally

doi:https://doi.org/10.1038/s44387-026-00073-7. Peter Lean, Mihai Alexe, Eulalie Boucher, Ewan Pinnington, Simon Lang, Patrick Laloyaux, Niels Bormann, and Anthony McNally. Learning from nature: insights into GraphDOP’s representations of the Earth System.arXiv preprint,

work page doi:10.1038/s44387-026-00073-7

[15] [15]

URLhttps://arxiv.org/abs/2508.18018

doi:10.48550/arXiv.2508.18018. URLhttps://arxiv.org/abs/2508.18018. Anthony McNally, Christian Lessig, Peter Lean, Eulalie Boucher, Mihai Alexe, Ewan Pinnington, Matthew Chantry, Simon Lang, Chris Burrows, Marcin Chrust, Florian Pinault, Ethel Villeneuve, Niels Bormann, and Sean Healy. Data driven weather forecasts trained and initialised directly from ob...

work page doi:10.48550/arxiv.2508.18018

[16] [16]

URLhttps://arxiv.org/abs/2407.15586

doi:10.48550/arXiv.2407.15586. URLhttps://arxiv.org/abs/2407.15586. 9 APREPRINT- JUNE18, 2026 G. Moldovan, E. Pinnington, A. Prieto Nemesio, S. Lang, Z. Ben Bouallègue, J. Dramsch, M. Alexe, M. Santa Cruz, S. Hahner, H. Cook, H. Theissen, M. Clare, C. O’Brien, J. Polster, L. Magnusson, G. Mertes, F. Pinault, B. Raoult, P. de Rosnay, R. Forbes, and M. Chan...

work page doi:10.48550/arxiv.2407.15586 2026

[17] [17]

URL https://egusphere.copernicus

doi:10.5194/egusphere-2025-4716. URL https://egusphere.copernicus. org/preprints/2025/egusphere-2025-4716/. D. J. Newman. Zarr storage specification version 2: Cloud-optimized persistence using Zarr. Esds-rfc-048, NASA Earth Science Data and Information System Standards Coordination Office,

work page doi:10.5194/egusphere-2025-4716 2025

[18] [18]

Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, Tom R

URLhttps://arxiv.org/abs/2508.18486. Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, Tom R. Andersson, Andrew El-Kadi, Dominic Masters, Timo Ewalds, Jacklynn Stott, Shakir Mohamed, Peter Battaglia, Remi Lam, and Matthew Willson. Probabilistic weather forecasting with machine learning.Nature, 637(8044):84–90, January

Pith/arXiv arXiv

[19] [19]

and El-Kadi, Andrew and Masters, Dominic and Ewalds, Timo and Stott, Jacklynn and Mohamed, Shakir and Battaglia, Peter and Lam, Remi and Willson, Matthew , year =

doi:10.1038/s41586-024-08252-9. URL https://doi.org/10.1038/s41586-024-08252-9. Florence Rabier, Heikki Järvinen, E. Klinker, J.-F. Mahfouf, and A. Simmons. The ECMWF operational implementation of four-dimensional variational assimilation. Part I: experimental results with simplified physics.Quarterly Journal of the Royal Meteorological Society, 126(564):...

work page doi:10.1038/s41586-024-08252-9

[20] [20]

Ambrogio V olonté, Suzanne L

doi:10.1002/qj.49712656415. Ambrogio V olonté, Suzanne L. Gray, Peter A. Clark, Oscar Martínez-Alvarado, and Duncan Ackerley. Strong surface winds in storm eunice. part 1: storm overview and indications of sting jet activity from observations and model data. Weather, 79(2):40–45,

work page doi:10.1002/qj.49712656415

[21] [21]

doi:https://doi.org/10.1002/wea.4402. Y . Wang, X. Zhang, W. Ning, M. A. Lazzara, M. Ding, C. H. Reijmer, P. C. J. P. Smeets, P. Grigioni, P. Heil, E. R. Thomas, D. Mikolajczyk, L. J. Welhouse, L. M. Keller, Z. Zhai, Y . Sun, and S. Hou. The AntAWS dataset: a compilation of Antarctic automatic weather station observations.Earth System Science Data, 15(1):411–429,

work page doi:10.1002/wea.4402

[22] [22]

URLhttps://essd.copernicus.org/articles/15/411/2023/

doi:10.5194/essd-15-411-2023. URLhttps://essd.copernicus.org/articles/15/411/2023/. N. P. Wedi. Increasing the horizontal resolution in numerical weather prediction and climate simulations: illusion or panacea?Philosophical Transactions of the Royal Society A, 372,

work page doi:10.5194/essd-15-411-2023 2023

[23] [23]

Janni Yuval, Ian Langmore, Dmitrii Kochkov, and Stephan Hoyer

doi:10.1098/rsta.2013.0289. Janni Yuval, Ian Langmore, Dmitrii Kochkov, and Stephan Hoyer. Neural general circulation models optimized to predict satellite-based precipitation observations,

work page doi:10.1098/rsta.2013.0289 2013

[24] [24]

Cheng-Zhi Zou, Wenhui Wang, and NOAA CDR Program

URLhttps://arxiv.org/abs/2412.11973. Cheng-Zhi Zou, Wenhui Wang, and NOAA CDR Program. NOAA Fundamental Climate Data Record (FCDR) of MSU Level 1c Brightness Temperature, Version 1.0,

arXiv

[25] [25]

Accessed: 2026-01-30

URL https://doi.org/10.7289/V51Z429F. Accessed: 2026-01-30. Appendix A Specification of training datasets Table 1 lists the datasets that were used to train the model described in Section

work page doi:10.7289/v51z429f 2026

[26] [26]

B Instrument acronyms Table 2 lists the full names of the satellite instruments that were used in the present study. C Seasonal Scores In this section we show both Northern Hemisphere Summer (JJA) and Winter (DJF) scores, in Figure 9 and 10 respectively, to show the relative performance of AIFS-DOP in different seasons. 10 APREPRINT- JUNE18, 2026 1 2 3 4 ...

2026