Njord: A Probabilistic Graph Neural Network for Ensemble Ocean Forecasting

Daniel Holmberg; Erik Larsson; Fredrik Lindsten; Joel Oskarsson; Teemu Roos

arxiv: 2605.15470 · v1 · pith:JCS2PZMKnew · submitted 2026-05-14 · 💻 cs.LG · physics.ao-ph

Pith reviewed 2026-05-19 14:39 UTC · model grok-4.3

keywords ocean forecasting · graph neural networks · probabilistic models · ensemble prediction · machine learning · uncertainty estimation · ocean dynamics

The pith

A probabilistic graph neural network for ocean forecasting achieves the lowest errors on a global benchmark while providing uncertainty estimates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Njord, a model that combines deep latent variables with graph neural networks to generate probabilistic ensemble forecasts for ocean dynamics in both global and regional settings. Each forecast step is sampled in a single forward pass, so an ensemble is built by drawing repeated samples, unlike deterministic machine learning models that return one trajectory and ignore the chaotic nature of ocean systems. To handle large irregular grids, the model uses K-means cluster meshes that adapt to sea surface geometry at 0.25° global and 2 km regional (Baltic Sea) resolutions. On the OceanBench benchmark against real observations, Njord records the lowest average errors across upper-ocean variables, with the biggest gains in surface temperature prediction.

Core claim

Njord integrates a deep latent variable framework with a graph neural network architecture on K-means cluster meshes, enabling single-pass sampling of ensemble forecasts that outperform deterministic baselines on upper-ocean variables while supplying uncertainty estimates from the ensembles.
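To make the single-pass sampling idea concrete, here is a minimal sketch in NumPy. It is not the Njord architecture: the graph neural network is replaced by hypothetical linear maps (`W_mu`, `W_sigma`, `W_out`), and the function name `sample_ensemble_step` is invented for illustration. It only shows the general pattern of a conditional latent-variable model that predicts a residual: draw one latent sample per ensemble member, then run one deterministic forward pass per member.

```python
import numpy as np

def sample_ensemble_step(x, params, n_members, rng):
    """Draw n_members one-step forecasts from a toy latent-variable model.

    x: (n_nodes, n_features) current state.
    params: dict of hypothetical weight matrices standing in for the GNN.
    """
    # Conditional prior p(z | x): per-node Gaussian mean and log-std.
    mu = x @ params["W_mu"]
    log_sigma = x @ params["W_sigma"]
    members = []
    for _ in range(n_members):
        # One latent draw per member, reparameterised as mu + sigma * eps.
        eps = rng.standard_normal(mu.shape)
        z = mu + np.exp(log_sigma) * eps
        # Deterministic predictor maps (x, z) to a residual update.
        residual = np.tanh(np.concatenate([x, z], axis=1) @ params["W_out"])
        members.append(x + residual)
    return np.stack(members)  # (n_members, n_nodes, n_features)

rng = np.random.default_rng(0)
n_nodes, n_features, n_latent = 50, 4, 8
params = {
    "W_mu": 0.1 * rng.standard_normal((n_features, n_latent)),
    "W_sigma": 0.1 * rng.standard_normal((n_features, n_latent)),
    "W_out": 0.1 * rng.standard_normal((n_features + n_latent, n_features)),
}
x0 = rng.standard_normal((n_nodes, n_features))
ensemble = sample_ensemble_step(x0, params, n_members=5, rng=rng)
ensemble_mean = ensemble.mean(axis=0)   # point forecast
ensemble_std = ensemble.std(axis=0)     # per-node uncertainty estimate
```

The contrast with iterative samplers is that each member costs one forward pass per forecast step; the ensemble spread then comes entirely from the latent draws.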

What carries the argument

K-means cluster meshes adapted to irregular sea surface geometry, combined with a deep latent variable model that supports efficient probabilistic sampling within the graph neural network.

Load-bearing premise

K-means cluster meshes adapt sufficiently well to irregular sea-surface geometry to allow accurate and efficient scaling of the graph neural network to global 0.25-degree and regional 2 km grids.
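The paper describes placing mesh nodes by spherical K-means clustering of the 3D Cartesian coordinates of ocean grid points. The sketch below illustrates that general technique on a toy grid; it is an assumption-laden illustration rather than the authors' code. The helper names (`latlon_to_xyz`, `spherical_kmeans`), the rectangular "continent" mask, the 5° grid, and the choice of 12 clusters are all hypothetical.

```python
import numpy as np

def latlon_to_xyz(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to unit vectors on the sphere."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )

def spherical_kmeans(points, k, n_iter=20, seed=0):
    """Cluster unit vectors by cosine similarity; centroids stay on the sphere."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to the centroid with the largest dot product.
        labels = np.argmax(points @ centroids.T, axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                mean = members.mean(axis=0)
                norm = np.linalg.norm(mean)
                if norm > 0:
                    # Project the cluster mean back onto the unit sphere.
                    centroids[j] = mean / norm
    labels = np.argmax(points @ centroids.T, axis=1)
    return centroids, labels

# Toy 5-degree grid with a hypothetical rectangular land block.
lat, lon = np.meshgrid(np.arange(-60, 61, 5), np.arange(-180, 180, 5), indexing="ij")
sea = ~((np.abs(lat) < 30) & (np.abs(lon) < 40))

# Cluster sea points only, so node density follows the ocean grid points.
sea_xyz = latlon_to_xyz(lat[sea], lon[sea])
centroids, labels = spherical_kmeans(sea_xyz, k=12)
```

Because only sea points enter the clustering, no centroid is attracted toward land-only regions. Note that this alone does not guarantee every centroid lies over water, nor that edges between mesh nodes avoid land; that depends on how the graph is built afterwards.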

What would settle it

Demonstrating that a competing model produces lower average errors than Njord across upper-ocean variables on the OceanBench benchmark when validated against real-world observations would undermine the performance advantage.

Figures

Figures reproduced from arXiv: 2605.15470 by Daniel Holmberg, Erik Larsson, Fredrik Lindsten, Joel Oskarsson, Teemu Roos.

**Figure 1.** Njord. … at global short-range (1–10 days) timescales. These models are, however, deterministic: they produce a single trajectory and are typically trained with mean squared error, which encourages predictions toward the conditional mean of the future state rather than capturing the full predictive distribution. Consequently, they tend to smooth over fine-scale variance and offer limited insight into the pro…

**Figure 2.** One-step prediction in the Njord model. Residuals are predicted at time …

**Figure 3.** Example of graph node placement in the Red Sea. From Section 4.1, "A graph adapted to ocean geometry": Graph-based global weather forecasting models use icosahedral meshes [30, 9, 31] to construct the spatial graph that the model operates over. These meshes are constructed by iteratively subdividing an icosahedron, with each subdivision quadrupling the number of nodes and edges [30]. As the size of the graph heavily …

**Figure 4.** RMSE for Sea Surface Temperature (SST), Sea Surface Height (SSH), Sea Surface Salin…

**Figure 5.** SSR averaged over all global ocean variables. The Spread-Skill Ratio (SSR) in …

**Figure 6.** Global SST at a 10 d lead, initialized on 2024-01-30. Ground truth is GLO12 analysis.

**Figure 7.** Arctic SIT at 10 d lead time, initialized 2024-01-30. Ground truth is GLO12 analysis.

**Figure 8.** Global SST predictions evaluated on satellite measurements. To further evaluate SST forecasts outside of OceanBench, we compare the predicted potential temperature of the uppermost ocean layer against a global ocean bias-adjusted SST product [42], based on multi-sensor satellite observations …

**Figure 9.** SSR averaged over Baltic Sea variables. Across variables, Njord-Baltic achieves RMSE values comparable to SeaCast while providing probabilistic forecasts. In this regional setting, GLO12 exhibits a relatively flat error curve, similar to a climatological baseline. Both Njord-Baltic and SeaCast clearly outperform persistence. Njord-Baltic matches SeaCast in deterministic accuracy while additionally provi…

**Figure 10.** RMSE for Temperature (T), Salinity (S), Zonal Current (U) at 47 m depth, as well as Sea …

**Figure 11.** Baltic Sea SST at 10 d lead time, initialized 2024-03-05. Ground truth is NEMO analysis.

**Figure 12.** One-step prediction in the Njord-Baltic model. Residuals are predicted at time …

**Figure 13.** Global graphs used by Njord, with grid nodes in blue, encoding/decoding edges in black, …

**Figure 14.** Regional graphs used by Njord, with grid nodes in blue, M2G and G2M edges in black, …

**Figure 15.** Example of mesh node placement in the Gulf of California (latitude …

**Figure 16.** Example of mesh node placement in the northern Red Sea and Suez Canal (latitude …

**Figure 17.** Example of mesh node placement in the Bråviken bay and Östergötland Archipelago, on …

**Figure 18.** Example of mesh node placement in the Turku Archipelago in south-western Finland.

**Figure 19.** Ensemble mean CRPS scorecards. The heatmaps display the relative difference between …

**Figure 20.** The heatmaps display the relative difference in RMSE and CRPS between Njord trained …

**Figure 21.** Spatial evaluation of SIC at a 30-day lead time. The panels compare the ground truth …

**Figure 22.** Log-scaled scatter density heatmaps evaluating predicted versus observed SIC and SIT at …

**Figure 23.** Ensemble mean CRPS scorecards. The heatmaps display the relative difference between …

**Figure 24.** Global RMSE of SST by forecast lead time, where Njord has the lowest error compared to satellite measurements. The dataset merges multi-sensor satellite observations into a Level-3 global grid …

**Figure 25.** Spatial distribution of normalized RMSE difference for SST between Njord ensemble …

**Figure 26.** Surface variables: SSH, SIC, and SIT. Columns from left to right show RMSE, CRPS, …

**Figure 27.** Temperature at six different depths. Columns from left to right show RMSE, CRPS, and …

**Figure 28.** Salinity at six different depths. Columns from left to right show RMSE, CRPS, and SSR.

**Figure 29.** Zonal current at six different depths. Columns from left to right show RMSE, CRPS, and …

**Figure 30.** Normalized RMSE difference for various variables and depth levels, comparing ensemble …

**Figure 31.** Sea ice concentration at lead time 10 d, init 2024-12-24.

**Figure 32.** Sea ice thickness at lead time 10 d, init 2024-12-24.

**Figure 33.** Temperature at the surface, lead time 10 d, init 2024-12-24.

**Figure 34.** Salinity at the surface, lead time 10 d, init 2024-12-24.

**Figure 35.** Zonal current at the surface, lead time 10 d, init 2024-12-24.

**Figure 36.** Meridional current at the surface, lead time 10 d, init 2024-12-24.

**Figure 37.** Sea surface height at lead time 10 d, init 2024-12-24.

**Figure 38.** Surface variables: SLA, SIC and SIT. Reanalysis variants are shown dashed and analysis …

**Figure 39.** Temperature at 1, 9, 28, 47 and 91 m depth.

**Figure 40.** Salinity at 1, 9, 28, 47 and 91 m depth.

**Figure 41.** Meridional current at 1, 9, 28, 47 and 91 m depth.

**Figure 42.** Sea ice concentration at lead time 10 d, init 2024-02-20.

**Figure 43.** Sea ice thickness at lead time 10 d, init 2024-02-20.

**Figure 44.** Temperature at the surface, lead time 10 d, init 2024-02-20.

**Figure 45.** Salinity at the surface, lead time 10 d, init 2024-02-20.

**Figure 46.** Zonal current at the surface, lead time 10 d, init 2024-02-20.

**Figure 47.** Meridional current at the surface, lead time 10 d, init 2024-02-20.

**Figure 48.** Sea level anomaly at lead time 10 d, init 2024-02-20.

Original abstract

Ocean dynamics are inherently chaotic, yet existing machine learning ocean models produce only deterministic forecasts. We introduce Njord, a probabilistic data-driven model for ocean forecasting, applicable to both global and regional domains. Njord combines a deep latent variable framework with a graph neural network architecture, enabling sampling each forecast step in a single forward pass. We apply Njord globally at 0.25° resolution and regionally to the Baltic Sea at 2 km resolution. To scale to these large ocean grids we introduce K-means cluster meshes that adapt to irregular sea surface geometry. Experiments demonstrate strong performance on both domains compared to deterministic machine learning baselines, while also providing uncertainty estimates from the sampled ensemble forecasts. On the global OceanBench benchmark, Njord achieves the lowest errors on average across upper-ocean variables when evaluated against real-world observations, with the largest improvements in surface temperature prediction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript introduces Njord, a probabilistic graph neural network for ensemble ocean forecasting that combines a deep latent variable model with GNN message passing to generate sampled forecasts in a single forward pass. It scales the approach to a global 0.25° grid and a regional 2 km Baltic Sea grid by introducing K-means cluster meshes that adapt to irregular sea-surface geometry. The central empirical claim is that Njord attains the lowest average errors across upper-ocean variables on the OceanBench benchmark when evaluated against real-world observations, with the largest gains in surface temperature, while also supplying uncertainty estimates from the ensemble.

Significance. If the performance and scaling claims are substantiated, the work would be significant for demonstrating that probabilistic GNNs can deliver calibrated ensemble forecasts for chaotic ocean dynamics at both global and high-resolution regional scales. The provision of uncertainty estimates alongside competitive point forecasts against real observations addresses a practical gap in existing deterministic ML ocean models. The adaptive mesh construction, if shown to respect physical boundaries, could serve as a reusable technique for applying graph-based methods to masked geophysical domains.
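Whether an ensemble is calibrated is usually checked with scores such as the Spread-Skill Ratio (SSR) and the Continuous Ranked Probability Score (CRPS), both of which appear in the paper's figures. The sketch below uses commonly used definitions, which may differ in detail from the ones used in the paper: SSR as the ratio of root-mean ensemble variance to the RMSE of the ensemble mean, with a finite-ensemble correction factor, and the standard empirical CRPS estimator. The function names and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def spread_skill_ratio(ens, obs):
    """ens: (n_members, n_points), obs: (n_points,). Values near 1 suggest calibration."""
    m = ens.shape[0]
    spread = np.sqrt(ens.var(axis=0, ddof=1).mean())
    rmse = np.sqrt(((ens.mean(axis=0) - obs) ** 2).mean())
    # sqrt((m + 1) / m) corrects for the finite ensemble size.
    return np.sqrt((m + 1) / m) * spread / rmse

def crps_ensemble(ens, obs):
    """Empirical CRPS: E|X - y| - 0.5 * E|X - X'|, averaged over points."""
    term_err = np.abs(ens - obs).mean(axis=0)
    term_spread = np.abs(ens[:, None, :] - ens[None, :, :]).mean(axis=(0, 1))
    return (term_err - 0.5 * term_spread).mean()

# Synthetic check: truth and members are drawn from the same distribution.
rng = np.random.default_rng(0)
n_members, n_points = 10, 20000
signal = rng.standard_normal(n_points)
obs = signal + rng.standard_normal(n_points)
noise = rng.standard_normal((n_members, n_points))
ens_calibrated = signal + noise
ens_underdispersed = signal + 0.5 * noise

ssr_calibrated = spread_skill_ratio(ens_calibrated, obs)
ssr_under = spread_skill_ratio(ens_underdispersed, obs)
crps_calibrated = crps_ensemble(ens_calibrated, obs)
```

In this synthetic setup the calibrated ensemble gives an SSR close to 1, while shrinking the member perturbations pushes it below 1, the signature of an under-dispersive ensemble. For a single-member "ensemble" the CRPS reduces to the mean absolute error, which is why it is a natural score for comparing probabilistic and deterministic forecasts.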

major comments (1)

[Abstract] Abstract and mesh-construction section: the claim that K-means cluster meshes 'adapt to irregular sea surface geometry' is load-bearing for the scaling argument to 0.25° global and 2 km regional grids, yet no description is given of how land-sea masks are enforced, whether invalid cross-land edges are removed, or what mesh-quality metrics (e.g., connectivity, boundary fidelity) are satisfied. Standard K-means on latitude-longitude coordinates does not inherently respect masks; without explicit post-processing or boundary-aware clustering, message passing can produce unphysical connections, undermining the applicability claim.

minor comments (1)

[Abstract] Abstract: quantitative error values, baseline definitions, and training details are omitted even though the headline performance claim is stated; adding at least the key RMSE or MAE numbers and the names of the deterministic ML baselines would improve readability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and detailed review. The concern about insufficient description of the K-means mesh construction and mask handling is well-taken. We address this point below and will revise the manuscript to provide the requested technical details.

Point-by-point responses

Referee: [Abstract] Abstract and mesh-construction section: the claim that K-means cluster meshes 'adapt to irregular sea surface geometry' is load-bearing for the scaling argument to 0.25° global and 2 km regional grids, yet no description is given of how land-sea masks are enforced, whether invalid cross-land edges are removed, or what mesh-quality metrics (e.g., connectivity, boundary fidelity) are satisfied. Standard K-means on latitude-longitude coordinates does not inherently respect masks; without explicit post-processing or boundary-aware clustering, message passing can produce unphysical connections, undermining the applicability claim.

Authors: We agree that the manuscript currently provides insufficient detail on how the K-means meshes enforce land-sea boundaries. In the revised version we will expand the mesh-construction section with the following additions: (i) clustering is performed exclusively on sea-grid points identified by the land-sea mask; (ii) after clustering, any graph edges connecting nodes separated by land are explicitly removed by a post-processing step that checks line-of-sight connectivity within the masked domain; (iii) we will report quantitative mesh-quality metrics including average node degree, fraction of boundary nodes, and verification that no cross-land edges remain. These clarifications will substantiate the adaptation claim and rule out unphysical message passing. We believe the revised description will fully address the referee’s concern.

Revision: yes
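The line-of-sight check in point (ii) can be illustrated with a small sketch. This is one possible reading of the simulated rebuttal, not the authors' implementation: the function names (`edge_is_over_sea`, `prune_cross_land_edges`), the sampling-based test in grid-index space, and the toy mask are all assumptions.

```python
import numpy as np

def edge_is_over_sea(mask, a, b, n_samples=50):
    """Check that the straight segment a -> b stays on sea cells.

    mask[i, j] is True for sea; a and b are (row, col) positions.
    """
    t = np.linspace(0.0, 1.0, n_samples)[:, None]
    pts = (1 - t) * np.asarray(a, float) + t * np.asarray(b, float)
    idx = np.rint(pts).astype(int)  # nearest grid cell for each sample
    return bool(mask[idx[:, 0], idx[:, 1]].all())

def prune_cross_land_edges(mask, nodes, edges):
    """Keep only edges whose sampled segment never touches land."""
    return [(i, j) for i, j in edges if edge_is_over_sea(mask, nodes[i], nodes[j])]

# Toy 10x10 domain: a land peninsula in column 5, rows 0..7.
mask = np.ones((10, 10), dtype=bool)
mask[0:8, 5] = False

nodes = [(2, 2), (2, 8), (9, 2), (9, 8)]
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
kept = prune_cross_land_edges(mask, nodes, edges)

# One of the mesh-quality metrics named in the rebuttal: average node degree.
avg_degree = 2 * len(kept) / len(nodes)
```

Here the edge between nodes 0 and 1 crosses the peninsula and is dropped, while the edge between nodes 2 and 3 passes south of it and is kept. A sampling test like this can miss land features narrower than the sample spacing, so a production check would need a sample count tied to the grid resolution or an exact cell traversal.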

Circularity Check

0 steps flagged

No circularity; derivation and claims are self-contained with external validation

Full rationale

The paper presents Njord as a novel probabilistic latent-variable GNN for ensemble ocean forecasting, with K-means cluster meshes introduced to handle irregular sea-surface geometry at global 0.25° and regional 2 km scales. The central performance claim rests on evaluation against real-world observations on the public OceanBench benchmark, which is independent of the model's fitted parameters or internal definitions. No equations, predictions, or uniqueness arguments in the abstract or described content reduce by construction to inputs, self-citations, or ansatzes; the architecture and mesh adaptation are positioned as original contributions whose validity is tested externally rather than assumed via prior self-referential results.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The work rests on standard neural-network training assumptions and the domain premise that graph representations plus clustering suffice for ocean geometry; no new physical axioms or invented entities are introduced.

free parameters (1)

Neural network hyperparameters (depth, width, learning rate, latent dimension)
Chosen or tuned during training; typical for any deep learning model and not derived from first principles.

axioms (1)

Domain assumption: Ocean dynamics on irregular domains can be faithfully represented by graph neural networks on K-means-derived meshes.
Invoked to justify scaling to global and regional grids; stated in the abstract description of the architecture.

pith-pipeline@v0.9.0 · 5684 in / 1251 out tokens · 73474 ms · 2026-05-19T14:39:20.562050+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Paper passage: To construct a graph better adapted to the geometry of the global ocean we instead place the graph nodes based on the density of ocean grid points. We apply spherical K-means clustering of the ocean grid point 3D Cartesian coordinates...

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: Njord combines a deep latent variable framework with a graph neural network architecture, enabling sampling each forecast step in a single forward pass.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · 2 internal anchors

[1] Pierre Yves Le Traon, Antonio Reppucci, Enrique Alvarez Fanjul, Lotfi Aouf, Arno Behrens, Maria Belmonte, Abderrahim Bentamy, Laurent Bertino, Vittorio Ernesto Brando, Matilde Brandt Kreiner, et al. From observation to information and users: The Copernicus Marine Service perspective. Frontiers in Marine Science, 6:234, 2019.

[2] Jean-Michel Lellouche, Eric Greiner, Giovanni Ruggiero, Romain Bourdallé-Badie, Charles-Emmanuel Testut, Olivier Le Galloudec, Mounir Benkiran, and Gilles Garric. Evolution of the Copernicus Marine Service global ocean analysis and forecasting high-resolution system: Potential benefit for a wide range of users. In EuroGOOS International Conference, volume..., 2023.

[3] Tuomas Kärnä, Patrik Ljungemyr, Saeed Falahat, Ida Ringgaard, Lars Axell, Vasily Korabel, Jens Murawski, Ilja Maljutenko, Anja Lindenthal, Simon Jandt-Scheelke, et al. Nemo-Nordic 2.0: Operational marine forecast model for the Baltic Sea. Geoscientific Model Development, 14(9):5731–5749, 2021.

[4] Anass El Aouni, Quentin Gaudel, Charly Regnier, Simon Van Gennip, Olivier Le Galloudec, Marie Drevillon, Yann Drillet, and Jean-Michel Lellouche. GLONET: Mercator’s end-to-end neural global ocean forecasting system. Journal of Geophysical Research: Machine Learning and Computation, 2(3), 2025.

[5] Daniel Holmberg, Emanuela Clementi, Italo Epicoco, and Teemu Roos. Accurate Mediterranean Sea forecasting via graph-based deep learning. Scientific Reports, 15(45051), 2025.

[6] Yingzhe Cui, Ruohan Wu, Xiang Zhang, Ziqi Zhu, Bo Liu, Jun Shi, Junshi Chen, Hailong Liu, Shenghui Zhou, Liang Su, et al. Forecasting the eddying ocean with a deep neural network. Nature Communications, 16(1):2268, 2025.

[7] Xiang Wang, Renzhi Wang, Ningzi Hu, Pinqiang Wang, Peng Huo, Guihua Wang, Huizan Wang, Senzhang Wang, Junxing Zhu, Jianbo Xu, et al. XiHe: A data-driven model for global ocean eddy-resolving forecasting. arXiv preprint arXiv:2402.02995, 2024.

[8] Qiusheng Huang, Yuan Niu, Xiaohui Zhong, Anboyu Guo, Lei Chen, Dianjun Zhang, Xuefeng Zhang, and Hao Li. FuXi-Ocean: A global ocean forecasting system with sub-daily resolution. In Advances in Neural Information Processing Systems, volume 38, 2025.

[9] Joel Oskarsson, Tomas Landelius, Marc P Deisenroth, and Fredrik Lindsten. Probabilistic weather forecasting with hierarchical graph neural networks. In Advances in Neural Information Processing Systems, volume 37, 2024.

[10] Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, Tom R Andersson, Andrew El-Kadi, Dominic Masters, Timo Ewalds, Jacklynn Stott, Shakir Mohamed, Peter Battaglia, et al. Probabilistic weather forecasting with machine learning. Nature, 637(8044):84–90, 2025.
[11] Ashesh Chattopadhyay, Michael Gray, Tianning Wu, Anna B Lowe, and Ruoying He. OceanNet: A principled neural operator-based digital twin for regional oceans. Scientific Reports, 14(21181), 2024.

[12] Anass El Aouni, Quentin Gaudel, Juan Emmanuel Johnson, Regnier Charly, Julien Le Sommer, Ronan Fablet, Marie Drevillon, Yann Drillet, Pierre Yves Le Traon, et al. OceanBench: A benchmark for data-driven global ocean forecasting systems. In Neural Information Processing Systems, volume 39, 2025.

[13] Tom R Andersson, J Scott Hosking, María Pérez-Ortiz, Brooks Paige, Andrew Elliott, Chris Russell, Stephen Law, Daniel C Jones, Jeremy Wilkinson, Tony Phillips, et al. Seasonal Arctic sea ice forecasting with probabilistic deep learning. Nature Communications, 12(1):5124, 2021.

[14] Chenggong Wang, Michael S Pritchard, Noah Brenowitz, Yair Cohen, Boris Bonev, Thorsten Kurth, Dale Durran, and Jaideep Pathak. Coupled ocean-atmosphere dynamics in a machine learning Earth system model. arXiv preprint arXiv:2406.08632, 2024.

[15] Surya Dheeshjith, Adam Subel, Alistair Adcroft, Julius Busecke, Carlos Fernandez-Granda, Shubham Gupta, and Laure Zanna. Samudra: An AI global ocean emulator for climate. Geophysical Research Letters, 52(10), 2025.

[16] Qiusheng Huang, Xiaohui Zhong, Anboyu Guo, Ziyi Peng, Lei Chen, and Hao Li. Data-driven ensemble prediction of the global ocean. arXiv preprint arXiv:2603.19591, 2026.

[17] Jaideep Pathak, Yair Cohen, Piyush Garg, Peter Harrington, Noah Brenowitz, Dale Durran, Morteza Mardani, Arash Vahdat, Shaoming Xu, Karthik Kashinath, et al. Kilometer-scale convection-allowing model emulation using generative diffusion modeling. Science Advances, 12(5):eadv0423, 2026.

[18] Erik Larsson, Joel Oskarsson, Tomas Landelius, and Fredrik Lindsten. Diffusion-LAM: Probabilistic limited area weather forecasting with diffusion. In ICLR 2025 Workshop on Tackling Climate Change with Machine Learning, 2025.

[19] Simon Lang, Mihai Alexe, Mariana CA Clare, Christopher Roberts, Rilwan Adewoyin, Zied Ben Bouallègue, Matthew Chantry, Jesper Dramsch, Peter D Dueben, Sara Hahner, et al. AIFS-CRPS: Ensemble forecasting using a model trained with a loss function based on the continuous ranked probability score. npj Artificial Intelligence, 2(1):18, 2026.

[20] Lorenzo Pacchiardi, Rilwan A Adewoyin, Peter Dueben, and Ritabrata Dutta. Probabilistic forecasting with generative networks via scoring rule minimization. Journal of Machine Learning Research, 25(45):1–64, 2024.
[21] Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D Collins, Michael S Pritchard, and Alexander Keller. FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale. arXiv preprint arXiv:2507.12144, 2025. doi:10.48550/arXiv.2507.12144.

[22] Ferran Alet, Ilan Price, Andrew El-Kadi, Dominic Masters, Stratis Markou, Tom R Andersson, Jacklynn Stott, Remi Lam, Matthew Willson, Alvaro Sanchez-Gonzalez, et al. Skillful joint probabilistic weather forecasting from marginals. arXiv preprint arXiv:2506.10772, 2025. doi:10.48550/arXiv.2506.10772.

[23] Erik Larsson, Joel Oskarsson, Tomas Landelius, and Fredrik Lindsten. CRPS-LAM: Regional ensemble weather forecasting from matching marginals. In EurIPS 2025 Workshop on AI for Climate and Conservation, 2025.

[24] Even Marius Nordhagen, Håvard Homleid Haugen, Aram Farhad Shafiq Salihi, Magnus Sikora Ingstad, Thomas Nils Nipen, Ivar Ambjørn Seierstad, Inger-Lise Frogner, Mariana Clare, Simon Lang, Matthew Chantry, et al. High-resolution probabilistic data-driven weather modeling with a stretched-grid. arXiv preprint arXiv:2511.23043, 2025.

[25] Kihyuk Sohn, Honglak Lee, and Xinchen Yan. Learning structured output representation using deep conditional generative models. In C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 28, 2015.

[26] Väinö Hatanpää, Eugene Ku, Jason Stock, Murali Emani, Sam Foreman, Chunyong Jung, Sandeep Madireddy, Tung Nguyen, Varuni Sastry, Ray AO Sinurat, et al. AERIS: Argonne Earth systems model for reliable and skillful predictions. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 72–85, 2025.

[27] Tobias Sebastian Finn, Charlotte Durand, Alban Farchi, Marc Bocquet, and Julien Brajard. Towards diffusion models for large-scale sea-ice modelling. In ICML 2024 Workshop on Machine Learning for Earth System Modeling, 2024.

[28] Yuan Hu, Lei Chen, Zhibin Wang, and Hao Li. SwinVRNN: A data-driven ensemble forecasting model via learned distribution perturbation. Journal of Advances in Modeling Earth Systems, 15(2), 2023.

[29] Peter Battaglia, Razvan Pascanu, Matthew Lai, Danilo Jimenez Rezende, et al. Interaction networks for learning about objects, relations and physics. In Advances in Neural Information Processing Systems, volume 29, 2016.

[30] Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, Timo Ewalds, Zach Eaton-Rosen, Weihua Hu, et al. Learning skillful medium-range global weather forecasting. Science, 382(6677):1416–1421, 2023.
[31] Simon Lang, Mihai Alexe, Matthew Chantry, Jesper Dramsch, Florian Pinault, Baudouin Raoult, Mariana CA Clare, Christian Lessig, Michael Maier-Gerber, Linus Magnusson, et al. AIFS–ECMWF’s data-driven forecasting system. arXiv preprint arXiv:2406.01465, 2024.
[32] Jonathan Gordon, Wessel P Bruinsma, Andrew YK Foong, James Requeima, Yann Dubois, and Richard E Turner. Convolutional conditional neural processes. In International Conference on Learning Representations, 2020.
[33] Cristian Bodnar, Wessel P Bruinsma, Ana Lucic, Megan Stanley, Anna Allen, Johannes Brandstetter, Patrick Garvan, Maik Riechert, Jonathan A Weyn, Haiyu Dong, et al. A foundation model for the Earth system. Nature, 641(8065):1180–1187, 2025.
[34] Andreas Griewank and Andrea Walther. Algorithm 799: Revolve: An implementation of checkpointing for the reverse or adjoint mode of computational differentiation. ACM Transactions on Mathematical Software, 26(1):19–45, 2000.
[35] Tianqi Chen, Bing Xu, Chiyuan Zhang, and Carlos Guestrin. Training deep nets with sublinear memory cost. arXiv preprint arXiv:1604.06174, 2016.
[36] Daniel Holmberg, Emanuela Clementi, and Teemu Roos. Regional ocean forecasting with hierarchical graph neural networks. In NeurIPS 2024 Workshop on Tackling Climate Change with Machine Learning, 2024.
[37] Simon Adamov, Joel Oskarsson, Leif Denby, Tomas Landelius, Kasper Hintz, Simon Christiansen, Irene Schicker, Carlos Osuna, Fredrik Lindsten, Oliver Fuhrer, et al. Building machine learning limited area models: Kilometer-scale weather forecasting in realistic settings. arXiv preprint arXiv:2504.09340, 2025.
[38] Jean-Michel Lellouche, Eric Greiner, Romain Bourdallé-Badie, Gilles Garric, Angélique Melet, Marie Drévillon, Clément Bricaud, Mathieu Hamon, Olivier Le Galloudec, Charly Regnier, et al. The Copernicus global 1/12° oceanic and sea ice GLORYS12 reanalysis. Frontiers in Earth Science, 9:698876, 2021.
[39] Gurvan Madec and the NEMO team. NEMO ocean engine. Technical report, Institut Pierre-Simon Laplace, 2016.
[40] Hans Hersbach, Bill Bell, Paul Berrisford, Shoji Hirahara, András Horányi, Joaquín Muñoz-Sabater, Julien Nicolas, Carole Peubey, Raluca Radu, Dinand Schepers, et al. The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society, 146(730):1999–2049, 2020.
[41] ECMWF. Integrated forecasting system, 2024. https://www.ecmwf.int/en/forecasts/documentation-and-support/changes-ecmwf-model
[42] E.U. Copernicus Marine Service Information. ODYSSEA global ocean - sea surface temperature multi-sensor L3 observations, 2026. URL https://doi.org/10.48670/moi-00164
[43] Joel Oskarsson, Tomas Landelius, and Fredrik Lindsten. Graph-based neural weather prediction for limited area modeling. In NeurIPS 2023 Workshop on Tackling Climate Change with Machine Learning, 2023.
[44] C. A. T. Ferro. Fair scores for ensemble forecasts. Quarterly Journal of the Royal Meteorological Society, 140(683):1917–1923, 2014.
[45] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations, 2019.
[46] V. Fortin, M. Abaza, F. Anctil, and R. Turcotte. Why should ensemble spread match the RMSE of the ensemble mean? Journal of Hydrometeorology, 15(4):1708–1713, 2014.

A Model Details

A.1 Graph-EFM details

We adopt the probabilistic framework of Graph-EFM [9], a latent variable model in which stochasticity is introduced through latent variables Z defined o...
