pith. sign in

arxiv: 2510.04006 · v2 · pith:3GNNPUW3new · submitted 2025-10-05 · 💻 cs.LG · nlin.CD· physics.ao-ph

Learning more physically realistic dynamics in machine-learning based weather forecasting with latent-space constraints

Pith reviewed 2026-05-21 22:00 UTC · model grok-4.3

classification 💻 cs.LG nlin.CDphysics.ao-ph
keywords machine learning weather forecastinglatent space constraints4DVar data assimilationphysical realismerror covariancerollout trainingautoencoder atmospheric states
0
0 comments X

The pith

Training ML weather models with losses in an autoencoder latent space improves long-term forecast skill and physical realism by capturing cross-variable couplings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Most machine-learning weather forecast models are trained using simple weighted losses directly on the model outputs over rollout periods. This approach ignores the complex error covariances that arise from physical interactions between different atmospheric variables and across space, often resulting in forecasts that become overly smooth and unrealistic at longer lead times. The paper reformulates the training process as a four-dimensional variational data assimilation problem, treating reanalysis data as imperfect observations whose errors have multivariate structure. By moving the loss computation into a latent space learned by an autoencoder, the method approximates the high-dimensional error covariance as nearly diagonal, which incorporates those physical dependencies without requiring an explicit full covariance matrix. Experiments show that this latent-space constraint produces forecasts with better skill at extended ranges while retaining fine-scale structures and physical consistency more effectively than standard model-space training.

Core claim

The authors show that rollout training with latent-space constraints improves long-term forecast skill, while better preserving fine-scale structures and physical realism than the widely used model-space loss. They achieve this by reformulating model training as a four-dimensional variational data assimilation problem that treats reanalysis data as imperfect observations, allowing the loss to incorporate cross-variable error covariance structures. In practice, computing the loss in an autoencoder-learned latent space of global atmospheric states encodes the complex nonlinear couplings among variables, so that the high-dimensional error covariance matrix in model space can be approximated as

What carries the argument

Autoencoder-learned latent space that approximates the multivariate error covariance matrix as nearly diagonal, enabling simplified incorporation of physical couplings into the rollout training loss.

If this is right

  • Longer-range forecasts maintain higher accuracy because multivariate dependencies are respected during training.
  • Fine-scale atmospheric structures such as fronts and convective cells remain sharper instead of diffusing.
  • Forecasts exhibit greater physical realism with fewer unphysical artifacts like negative moisture or inconsistent pressure fields.
  • The same framework allows joint training on reanalysis fields and heterogeneous observational datasets within one consistent objective.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Operational centers could adopt this training change to reduce post-processing corrections currently needed for model bias.
  • The latent-space approach might extend naturally to other chaotic dynamical systems such as ocean or climate models where similar covariance issues arise.
  • If the autoencoder is trained on additional variables, the method could enforce consistency across an even broader set of physical constraints.

Load-bearing premise

The autoencoder-learned latent space encodes complex nonlinear couplings among atmospheric variables so that the high-dimensional error covariance matrix in model space can be approximated as nearly diagonal.

What would settle it

Run parallel rollout forecasts with the latent-space loss and the standard model-space loss on the same initial conditions, then compare both against independent high-resolution observations at lead times beyond 5 days for metrics of small-scale feature preservation such as front sharpness or precipitation localization.

Figures

Figures reproduced from arXiv: 2510.04006 by Ben Fei, Fenghua Ling, Hang Fan, Juan Nathaniel, Lei Bai, Pierre Gentine, Yi Xiao, Yongquan Qu.

Figure 1
Figure 1. Figure 1: Training deterministic forecast models with (a) model-space constraints and (b) latent-space constraints. The superscript a on model states x and latent states z indicates reanalysis data, while the subscript i denotes the i-th step during rollout training. E and D denote the encoder and decoder of the pretrained autoencoder for reanalysis. The bottom panel shows T500 forecasts initialized from ERA5 reanal… view at source ↗
Figure 2
Figure 2. Figure 2: Comparison of the globally averaged forecast error of DFM-LC, DFM-MC, and the forecast model trained without rollout. Forecasts are initialized twice daily from ERA5 reanalysis throughout 2020 and evaluated against ERA5 using latitude-weighted root mean square error (RMSE). 4.2 Accuracy and Spectral Analysis We first evaluate the forecast accuracy of different models initialized from the ERA5 reanalysis. F… view at source ↗
Figure 3
Figure 3. Figure 3: Zonal power spectra of forecast fields from DFM-LC and DFM-MC at different lead times. Zonal-mean power spectra of (a) Z500 and (b) T850 over the midlatitudes (30°–60° N/S), computed from forecasts at day 1 (solid lines) and day 15 (dashed lines), and compared with ERA5. 4.3 Physical Consistency Diagnostics To further explore the benefits of latent-space constraints in promoting multivariate consistency, w… view at source ↗
Figure 4
Figure 4. Figure 4: Spin-up forecasts of specific humidity at 500 hPa (Q500) from (a) DFM-LC and (b) DFM-MC, initialized from a smoothed initial state. The initial condition is generated by applying bicubic interpolation to a 25× coarsened ERA5 analysis at 00 UTC on 1 January 2020. Geostrophic Balance Geostrophic balance refers to the equilibrium between the Coriolis force and the horizontal pressure gradient force, yielding … view at source ↗
Figure 5
Figure 5. Figure 5: Geostrophic balance diagnostics in forecasts from DFM-LC and DFM-MC. (a) Zonal and meridional components of actual wind (u, v) and geostrophic wind (ug, vg) at 500 hPa over the Northern Hemisphere midlatitudes (30°–60°N), derived from ERA5 reanalysis at 00 UTC on 1 January 2020. (b) Time evolution of the geostrophic imbalance ratio Rimb over 30-day forecasts from DFM-LC and DFM-MC, averaged over forecasts … view at source ↗
Figure 6
Figure 6. Figure 6: Forecast evolution of global-averaged kinetic energy (KE) from DFM-LC and DFM-MC. Kinetic energy is computed as the mean horizontal wind energy across all grid points and pressure levels from 850 hPa to 100 hPa. Results are averaged over forecasts initialized daily at 00 UTC throughout 2020. The ERA5 reanalysis value is shown as a reference (dashed line) [PITH_FULL_IMAGE:figures/full_fig_p011_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Forecasts of velocity potential at 200 hPa from DFM-LC and DFM-MC, with ERA5 shown for reference. Shown is a representative case initialized at 00 UTC on 1 September 2020. The velocity potential is interpreted as a diagnostic proxy for large-scale vertical motion, where positive values correspond to ascent and negative values to subsidence. Here, we calculate the 200 hPa velocity potential (χ200) to reveal… view at source ↗
read the original abstract

Data-driven machine learning (ML) models are reshaping weather forecasting and have shown the potential to accelerate and surpass traditional physics-based approaches, leading to a second revolution in the field after data assimilation. However, most ML forecast models are trained with weighted variable-wise losses on rollout forecasts that neglect cross-variable and spatial error covariance induced by physical coupling, often yielding overly smooth and physically unrealistic long-range forecasts. To address this, we reformulate model training as a four-dimensional variational data assimilation (4DVar) problem that treats reanalysis data as imperfect observations. This enables the loss function to incorporate cross-variable error covariance structures that capture multivariate dependencies and their associated errors. In practice, we approximate this objective by computing the loss in an autoencoder-learned latent space of global atmospheric states. By encoding complex nonlinear couplings among atmospheric variables, this representation allows the high-dimensional, complex error covariance matrix in model space to be approximated as nearly diagonal in latent space, substantially simplifying implementation. We show that rollout training with latent-space constraints improves long-term forecast skill, while better preserving fine-scale structures and physical realism than the widely used model-space loss. Finally, we extend this framework to accommodate heterogeneous data sources, enabling the forecast model to be trained jointly on reanalysis and multi-source observations within a unified theoretical formulation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper reformulates training of ML-based weather forecast models as a 4DVar data assimilation problem that treats reanalysis as imperfect observations. It approximates the resulting multivariate loss by computing it in the latent space of a trained autoencoder, under the assumption that this encoding of nonlinear cross-variable couplings renders the high-dimensional model-space error covariance nearly diagonal. The central empirical claim is that rollout training with this latent-space loss yields improved long-term forecast skill and better preservation of fine-scale structures and physical realism relative to standard model-space weighted losses; the framework is also extended to heterogeneous data sources.

Significance. If the latent-space diagonal approximation is valid and the reported gains are reproducible, the work would supply a theoretically grounded alternative to ad-hoc variable-wise losses, potentially improving physical consistency in data-driven weather models without requiring explicit covariance estimation in the original high-dimensional space.

major comments (2)
  1. [Abstract and §3.2] Abstract (latent-space approximation paragraph) and §3.2: the claim that the autoencoder latent space 'allows the high-dimensional, complex error covariance matrix in model space to be approximated as nearly diagonal' is load-bearing for attributing any skill or realism gains to covariance-aware training rather than to a generic regularizer. No quantitative diagnostic (e.g., average off-diagonal magnitude of the sample covariance in latent space, or comparison of full vs. diagonal 4DVar objectives) is presented to verify that off-diagonal terms are negligible; without this, the method reduces to an unverified heuristic.
  2. [§4] §4 (experimental results): the reported improvements in long-term skill and physical realism are presented without ablation of the diagonal assumption itself (e.g., comparison against a non-diagonal latent loss or against a model-space loss with explicit covariance). This makes it impossible to isolate whether gains stem from the 4DVar reformulation or from other implementation choices.
minor comments (2)
  1. [Eq. (7)] Notation for the latent-space loss (Eq. 7) should explicitly state the assumed form of the latent covariance (identity or learned diagonal) to avoid ambiguity with standard autoencoder reconstruction losses.
  2. [Figure 3] Figure 3 (forecast examples) would benefit from quantitative insets showing power spectra or gradient magnitudes to support the 'fine-scale structures' claim.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and insightful comments on our manuscript. We address each major comment point by point below, with a focus on strengthening the attribution of our results to the proposed latent-space approximation.

read point-by-point responses
  1. Referee: [Abstract and §3.2] Abstract (latent-space approximation paragraph) and §3.2: the claim that the autoencoder latent space 'allows the high-dimensional, complex error covariance matrix in model space to be approximated as nearly diagonal' is load-bearing for attributing any skill or realism gains to covariance-aware training rather than to a generic regularizer. No quantitative diagnostic (e.g., average off-diagonal magnitude of the sample covariance in latent space, or comparison of full vs. diagonal 4DVar objectives) is presented to verify that off-diagonal terms are negligible; without this, the method reduces to an unverified heuristic.

    Authors: We agree that a quantitative verification of the near-diagonality assumption would strengthen the central claim. The autoencoder is trained to capture nonlinear cross-variable couplings in its latent representation, which we expect to reduce the magnitude of off-diagonal error covariances relative to model space. In the revised manuscript we will add a diagnostic: we will compute and report the average absolute off-diagonal element (normalized by the diagonal) of the sample covariance matrix estimated from a large set of latent encodings drawn from reanalysis data. We will also include a brief comparison of forecast skill when using the diagonal latent loss versus a version that retains a small number of leading off-diagonal terms (via a low-rank update) to quantify the approximation error. revision: yes

  2. Referee: [§4] §4 (experimental results): the reported improvements in long-term skill and physical realism are presented without ablation of the diagonal assumption itself (e.g., comparison against a non-diagonal latent loss or against a model-space loss with explicit covariance). This makes it impossible to isolate whether gains stem from the 4DVar reformulation or from other implementation choices.

    Authors: We concur that additional ablations would help isolate the contribution of the latent-space diagonal approximation. We will expand §4 to include (i) a direct comparison against the standard model-space weighted loss (equivalent to a diagonal covariance in model space) and (ii) a discussion of how the latent-space formulation implicitly accounts for cross-variable structure that a simple model-space diagonal loss cannot. A full non-diagonal latent-space loss or an explicit covariance in the original model space remains computationally intractable; the former would require storing and inverting a dense latent covariance at each training step, while the latter is infeasible given the millions of variables in model space. We will therefore clarify these practical constraints in the text and add the feasible ablations described above. revision: partial

standing simulated objections not resolved
  • A direct experimental comparison against a 4DVar objective that uses an explicit full covariance matrix in the original high-dimensional model space, which is computationally prohibitive.

Circularity Check

0 steps flagged

No significant circularity; derivation rests on explicit approximation assumption

full rationale

The paper's core step reformulates rollout training as a 4DVar objective treating reanalysis as imperfect observations, then approximates the full error covariance by computing loss in an autoencoder latent space under the stated assumption that this space encodes nonlinear couplings sufficiently to render the covariance nearly diagonal. This assumption is presented explicitly as a modeling choice that simplifies implementation rather than being derived by construction from the forecast model equations or reduced to a fitted parameter. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided derivation chain; the claimed improvements in long-term skill and physical realism are positioned as empirical outcomes of the latent-space loss, not tautological consequences of the inputs. The overlap between autoencoder training data and forecast distribution is noted but does not collapse the objective into its own inputs per the paper's equations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on the domain assumption that an autoencoder can compress atmospheric states so that physical couplings make the error covariance nearly diagonal; no free parameters are explicitly fitted in the abstract description, and no new physical entities are postulated.

axioms (1)
  • domain assumption The autoencoder latent space encodes complex nonlinear couplings among atmospheric variables allowing the error covariance to be treated as nearly diagonal
    Invoked to justify the simplification of the 4DVar objective in latent space

pith-pipeline@v0.9.0 · 5784 in / 1285 out tokens · 51179 ms · 2026-05-21T22:00:57.910692+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages

  1. [1]

    The quiet revolution of numerical weather prediction.Nature, 525(7567):47–55, September 2015

    Peter Bauer, Alan Thorpe, and Gilbert Brunet. The quiet revolution of numerical weather prediction.Nature, 525(7567):47–55, September 2015

  2. [2]

    Accurate medium-range global weather forecasting with 3D neural networks.Nature, 619(7970):533–538, July 2023

    Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, and Qi Tian. Accurate medium-range global weather forecasting with 3D neural networks.Nature, 619(7970):533–538, July 2023

  3. [3]

    Learning skillful medium-range global weather forecasting.Science, 382(6677):1416–1421, December 2023

    Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, Timo Ewalds, Zach Eaton-Rosen, Weihua Hu, Alexander Merose, Stephan Hoyer, George Holland, Oriol Vinyals, Jacklynn Stott, Alexander Pritzel, Shakir Mohamed, and Peter Battaglia. Learning skillful medium-range global weather forecasting.Scienc...

  4. [4]

    FuXi: A cascade machine learning forecasting system for 15-day global weather forecast.npj Climate and Atmospheric Science, 6(1):190, November 2023

    Lei Chen, Xiaohui Zhong, Feng Zhang, Yuan Cheng, Yinghui Xu, Yuan Qi, and Hao Li. FuXi: A cascade machine learning forecasting system for 15-day global weather forecast.npj Climate and Atmospheric Science, 6(1):190, November 2023

  5. [5]

    Simon Lang, Mihai Alexe, Matthew Chantry, Jesper Dramsch, Florian Pinault, Baudouin Raoult, Mariana C. A. Clare, Christian Lessig, Michael Maier-Gerber, Linus Magnusson, Zied Ben Bouallègue, Ana Prieto Nemesio, Peter D. Dueben, Andrew Brown, Florian Pappenberger, and Florence Rabier. AIFS – ECMWF’s data-driven forecasting system, August 2024

  6. [6]

    The operational medium-range deterministic weather forecasting can be extended beyond a 10-day lead time

    Kang Chen, Tao Han, Fenghua Ling, Junchao Gong, Lei Bai, Xinyu Wang, Jing-Jia Luo, Ben Fei, Wenlong Zhang, Xi Chen, Leiming Ma, Tianning Zhang, Rui Su, Yuanzheng Ci, Bin Li, Xiaokang Yang, and Wanli Ouyang. The operational medium-range deterministic weather forecasting can be extended beyond a 10-day lead time. Communications Earth & Environment, 6(1), July 2025

  7. [7]

    Brenner, and Stephan Hoyer

    Dmitrii Kochkov, Janni Yuval, Ian Langmore, Peter Norgaard, Jamie Smith, Griffin Mooers, Milan Klöwer, James Lottes, Stephan Rasp, Peter Düben, Sam Hatfield, Peter Battaglia, Alvaro Sanchez-Gonzalez, Matthew Willson, Michael P. Brenner, and Stephan Hoyer. Neural general circulation models for weather and climate.Nature, 632(8027):1060–1066, August 2024. 1...

  8. [8]

    FuXi-ENS: A machine learning model for medium-range ensemble weather forecasting, August 2024

    Xiaohui Zhong, Lei Chen, Hao Li, Jun Liu, Xu Fan, Jie Feng, Kan Dai, Jing-Jia Luo, Jie Wu, and Bo Lu. FuXi-ENS: A machine learning model for medium-range ensemble weather forecasting, August 2024

  9. [9]

    Generative emulation of weather forecast ensembles with diffusion models.Science Advances, 10(13):eadk4489, March 2024

    Lizao Li, Robert Carver, Ignacio Lopez-Gomez, Fei Sha, and John Anderson. Generative emulation of weather forecast ensembles with diffusion models.Science Advances, 10(13):eadk4489, March 2024

  10. [10]

    FengWu-W2S: A deep learning model for seamless weather-to- subseasonal forecast of global atmosphere

    Fenghua Ling, Kang Chen, Jiye Wu, Tao Han, Jing-Jia Luo, and Lei Bai. FengWu-W2S: A deep learning model for seamless weather-to- subseasonal forecast of global atmosphere

  11. [11]

    Simon Lang, Mihai Alexe, Mariana C. A. Clare, Christopher Roberts, Rilwan Adewoyin, Zied Ben Bouallègue, Matthew Chantry, Jesper Dramsch, Peter D. Dueben, Sara Hahner, Pedro Maciel, Ana Prieto-Nemesio, Cathal O’Brien, Florian Pinault, Jan Polster, Baudouin Raoult, Steffen Tietsche, and Martin Leutbecher. AIFS-CRPS: Ensemble forecasting using a model train...

  12. [12]

    Andersson, Jacklynn Stott, Remi Lam, Matthew Willson, Alvaro Sanchez-Gonzalez, and Peter Battaglia

    Ferran Alet, Ilan Price, Andrew El-Kadi, Dominic Masters, Stratis Markou, Tom R. Andersson, Jacklynn Stott, Remi Lam, Matthew Willson, Alvaro Sanchez-Gonzalez, and Peter Battaglia. Skillful joint probabilistic weather forecasting from marginals, June 2025

  13. [13]

    Atmospheric Modeling, Data Assimilation and Predictability

    Eugenia Kalnay. Atmospheric Modeling, Data Assimilation and Predictability. November 2002

  14. [14]

    Imposing the Fundamental Dynamical Constraint of Hydrostatic Balance to Improve Global ML Weather Prediction, June 2025

    Akshay Subramaniam, Dale Durran, David Pruitt, Nathaniel Cresswell-Clay, and William Yik. Imposing the Fundamental Dynamical Constraint of Hydrostatic Balance to Improve Global ML Weather Prediction, June 2025

  15. [15]

    Hans Hersbach, Bill Bell, Paul Berrisford, Shoji Hirahara, András Horányi, Joaquín Muñoz-Sabater, Julien Nicolas, Carole Peubey, Raluca Radu, Dinand Schepers, Adrian Simmons, Cornel Soci, Saleh Abdalla, Xavier Abellan, Gianpaolo Balsamo, Peter Bechtold, Gionata Biavati, Jean Bidlot, Massimo Bonavita, Giovanna De Chiara, Per Dahlgren, Dick Dee, Michail Dia...

  16. [16]

    Vandal, Kate Duffy, Daniel McDuff, Yoni Nachmany, and Chris Hartshorn

    Thomas J. Vandal, Kate Duffy, Daniel McDuff, Yoni Nachmany, and Chris Hartshorn. Global atmospheric data assimilation with multi-modal masked autoencoders, July 2024

  17. [17]

    GraphDOP: Towards skilful data-driven medium-range weather forecasts learnt and initialised directly from observations, December 2024

    Mihai Alexe, Eulalie Boucher, Peter Lean, Ewan Pinnington, Patrick Laloyaux, Anthony McNally, Simon Lang, Matthew Chantry, Chris Burrows, Marcin Chrust, Florian Pinault, Ethel Villeneuve, Niels Bormann, and Sean Healy. GraphDOP: Towards skilful data-driven medium-range weather forecasts learnt and initialised directly from observations, December 2024

  18. [18]

    Neural models of multiscale systems: Conceptual limitations, stochastic parametrizations, and a climate application, July 2025

    Fabrizio Falasca. Neural models of multiscale systems: Conceptual limitations, stochastic parametrizations, and a climate application, July 2025

  19. [19]

    SOME BASIC FORMALISMS IN NUMERICAL V ARIATIONAL ANALYSIS.Monthly Weather Review, 98(12):875–883, December 1970

    YOSHIKAZU SASAKI. SOME BASIC FORMALISMS IN NUMERICAL V ARIATIONAL ANALYSIS.Monthly Weather Review, 98(12):875–883, December 1970

  20. [20]

    Regional Four-Dimensional Variational Data Assimilation in a Quasi-Operational Forecasting Environment.Monthly Weather Review, 121(8):2396–2408, August 1993

    Milija Zupanski. Regional Four-Dimensional Variational Data Assimilation in a Quasi-Operational Forecasting Environment.Monthly Weather Review, 121(8):2396–2408, August 1993

  21. [21]

    A General Weak Constraint Applicable to Operational 4DV AR Data Assimilation Systems

    Dusanka Zupanski. A General Weak Constraint Applicable to Operational 4DV AR Data Assimilation Systems. Monthly Weather Review, 125(9):2274–2292, September 1997

  22. [22]

    G. E. Hinton and R. R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks.Science, 313(5786):504–507, July 2006

  23. [23]

    Physically Consistent Global Atmospheric Data Assimilation with Machine Learning in Latent Space, July 2025

    Hang Fan, Lei Bai, Ben Fei, Yi Xiao, Kun Chen, Yubao Liu, Yongquan Qu, Fenghua Ling, and Pierre Gentine. Physically Consistent Global Atmospheric Data Assimilation with Machine Learning in Latent Space, July 2025

  24. [24]

    Online Model Error Correction With Neural Networks in the Incremental 4D-Var Framework.Journal of Advances in Modeling Earth Systems, 15(9):e2022MS003474, September 2023

    Alban Farchi, Marcin Chrust, Marc Bocquet, Patrick Laloyaux, and Massimo Bonavita. Online Model Error Correction With Neural Networks in the Incremental 4D-Var Framework.Journal of Advances in Modeling Earth Systems, 15(9):e2022MS003474, September 2023

  25. [25]

    Yongquan Qu and Xiaoming Shi. Can a machine learning–enabled numerical model help extend effective forecast range through consistently trained subgrid-scale models?Artificial Intelligence for the Earth Systems, 2(1):e220050, 2023

  26. [26]

    Joint parameter and parameterization inference with un- certainty quantification through differentiable programming

    Yongquan Qu, Mohamed Aziz Bhouri, and Pierre Gentine. Joint parameter and parameterization inference with un- certainty quantification through differentiable programming. InICLR 2024 Workshop on AI4DifferentialEquations In Science. 14 arXivTemplateA PREPRINT

  27. [27]

    Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics

    Roberto Cipolla, Yarin Gal, Alex Kendall, Roberto Cipolla, Yarin Gal, and Alex Kendall. Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. In2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7482–7491, Salt Lake City, UT, USA, June 2018. IEEE

  28. [28]

    R. N. Bannister. A review of forecast error covariance statistics in atmospheric variational data assimilation. I: Characteristics and measurements of forecast error covariances.Quarterly Journal of the Royal Meteorological Society, 134(637):1951–1970, October 2008

  29. [29]

    Edward N. Lorenz. The predictability of a flow which possesses many scales of motion.Tellus A: Dynamic Meteorology and Oceanography, 21(3):289, January 1969

  30. [30]

    Qiang Sun and Fuqing Zhang

    Y . Qiang Sun and Fuqing Zhang. A New Theoretical Framework for Understanding Multiscale Atmospheric Predictability.Journal of the Atmospheric Sciences, 77(7):2297–2309, July 2020

  31. [31]

    Prediction error growth in a more realistic atmospheric toy model with three spatiotemporal scales.Geoscientific Model Development, 15(10):4147–4161, May 2022

    Hynek Bednáˇr and Holger Kantz. Prediction error growth in a more realistic atmospheric toy model with three spatiotemporal scales.Geoscientific Model Development, 15(10):4147–4161, May 2022

  32. [32]

    Generative emulation of chaotic dynamics with coherent prior.Computer Methods in Applied Mechanics and Engineering, 448:118410, January 2026

    Juan Nathaniel and Pierre Gentine. Generative emulation of chaotic dynamics with coherent prior.Computer Methods in Applied Mechanics and Engineering, 448:118410, January 2026

  33. [33]

    3D-Var data assimilation using a variational autoencoder.Quarterly Journal of the Royal Meteorological Society, 150(761):2273–2295, April 2024

    Boštjan Melinc and Žiga Zaplotnik. 3D-Var data assimilation using a variational autoencoder.Quarterly Journal of the Royal Meteorological Society, 150(761):2273–2295, April 2024

  34. [34]

    Generating Unseen Nonlinear Evolution in Sea Surface Temperature Using a Deep Learning-Based Latent Space Data Assimilation Framework

    Qingyu Zheng, Guijun Han, Wei Li, Lige Cao, Gongfu Zhou, Haowen Wu, Qi Shao, Ru Wang, Xiaobo Wu, Xudong Cui, Hong Li, and Xuan Wang. Generating Unseen Nonlinear Evolution in Sea Surface Temperature Using a Deep Learning-Based Latent Space Data Assimilation Framework

  35. [35]

    A Novel Latent Space Data Assimilation Framework with Autoencoder-Observation to Latent Space (AE-O2L) Network

    Hang Fan, Yubao Liu, Yuewei Liu, Zhaoyang Huo, Baojun Chen, and Yu Qin. A Novel Latent Space Data Assimilation Framework with Autoencoder-Observation to Latent Space (AE-O2L) Network. Part II: Observation and Background Assimilation with Interpretability.Monthly Weather Review, 153(8):1349–1363, August 2025

  36. [36]

    A unified neural background-error covariance model for midlatitude and tropical atmospheric data assimilation, June 2025

    Boštjan Melinc, Uroš Perkan, and Žiga Zaplotnik. A unified neural background-error covariance model for midlatitude and tropical atmospheric data assimilation, June 2025

  37. [37]

    Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, August 2021

    Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, August 2021

  38. [38]

    CRA5: Extreme Compression of ERA5 for Portable Global Climate and Weather Research via an Efficient Variational Transformer, May 2024

    Tao Han, Zhenghao Chen, Song Guo, Wanghan Xu, and Lei Bai. CRA5: Extreme Compression of ERA5 for Portable Global Climate and Weather Research via an Efficient Variational Transformer, May 2024

  39. [39]

    WeatherBench 2: A benchmark for the next generation of data-driven global weather models, January 2024

    Stephan Rasp, Stephan Hoyer, Alexander Merose, Ian Langmore, Peter Battaglia, Tyler Russel, Alvaro Sanchez- Gonzalez, Vivian Yang, Rob Carver, Shreya Agrawal, Matthew Chantry, Zied Ben Bouallegue, Peter Dueben, Carla Bromberg, Jared Sisk, Luke Barrington, Aaron Bell, and Fei Sha. WeatherBench 2: A benchmark for the next generation of data-driven global we...

  40. [40]

    Zhanshan Ma, Chuanfeng Zhao, Jiandong Gong, Jin Zhang, Zhe Li, Jian Sun, Yongzhu Liu, Jiong Chen, and Qingu Jiang. Spin-up characteristics with three types of initial fields and the restart effects on forecast accuracy in the GRAPES global forecast system.Geoscientific Model Development, 14(1):205–221, January 2021. [41]An Introduction to Dynamic Meteorol...

  41. [41]

    Laloyaux, M

    P. Laloyaux, M. Bonavita, M. Chrust, and S. Gürol. Exploring the potential and limitations of weak-constraint 4D-Var.Quarterly Journal of the Royal Meteorological Society, 146(733):4067–4082, October 2020

  42. [42]

    Koster, Y

    Randal D. Koster, Y . C. Sud, Zhichang Guo, Paul A. Dirmeyer, Gordon Bonan, Keith W. Oleson, Edmond Chan, Diana Verseghy, Peter Cox, Harvey Davies, Eva Kowalczyk, C. T. Gordon, Shinjiro Kanae, David Lawrence, Ping Liu, David Mocko, Cheng-Hsuan Lu, Ken Mitchell, Sergey Malyshev, Bryant McAvaney, Taikan Oki, Tomohito Yamada, Andrew Pitman, Christopher M. Ta...

  43. [43]

    Seneviratne, Daniel Lüthi, Michael Litschi, and Christoph Schär

    Sonia I. Seneviratne, Daniel Lüthi, Michael Litschi, and Christoph Schär. Land–atmosphere coupling and climate change in Europe.Nature, 443(7108):205–209, September 2006

  44. [44]

    Coupled Ocean-Atmosphere Interaction at Oceanic Mesoscales.Oceanog- raphy, 23(4):52–69, December 2010

    Dudley Chelton and Shang-Ping Xie. Coupled Ocean-Atmosphere Interaction at Oceanic Mesoscales.Oceanog- raphy, 23(4):52–69, December 2010

  45. [45]

    Penny and Thomas M

    Stephen G. Penny and Thomas M. Hamill. Coupled Data Assimilation for Integrated Earth System Analysis and Prediction.Bulletin of the American Meteorological Society, 98(7):ES169–ES172, July 2017

  46. [46]

    Coupled data assimilation and parameter estimation in coupled ocean–atmosphere models: A review.Climate Dynamics, 54(11-12):5127–5144, June 2020

    Shaoqing Zhang, Zhengyu Liu, Xuefeng Zhang, Xinrong Wu, Guijun Han, Yuxin Zhao, Xiaolin Yu, Chang Liu, Yun Liu, Shu Wu, Feiyu Lu, Mingkui Li, and Xiong Deng. Coupled data assimilation and parameter estimation in coupled ocean–atmosphere models: A review.Climate Dynamics, 54(11-12):5127–5144, June 2020. 15