Canonical reference
Accurate medium-range global weather forecasting with 3D neural networks
Canonical reference. 78% of citing Pith papers cite this work as background.
citing papers explorer
- AIFS-DOP: End-to-End Medium-Range Weather Prediction from Observations Alone with Machine Learning
  An ML model trained only on harmonized gridded observations achieves competitive medium-range weather forecast skill with the IFS for several upper-air and surface headline scores when verified against observations.
- MotifGen: Spatiotemporal interpolation of misaligned satellite images via multi-source generative modeling, in an application to tropical cyclones
  MotifGen is the first multi-source generative model for spatiotemporal interpolation of misaligned microwave cyclone images from heterogeneous instruments at irregular intervals, achieving lower CRPS via self-supervised training and closer power spectra than deterministic baselines when combining in
- Steering Tropical Cyclones Using Small Perturbations in an AI Weather Model
  Targeted perturbations in the Aurora AI model can steer Hurricane Sandy's trajectory by more than 500 km after seven days via amplification in sensitive regions identified by FTLE and wave activity diagnostics.
- The physics of AI weather models
  AI weather models may simulate the atmosphere via particle positions in latent space whose updates follow gradient flow on a learned free energy functional rather than conventional physical equations.
- Cast3: Translating numerical weather prediction principles into data-driven forecasting
  Cast3 translates NWP principles into a data-driven model using cubed-sphere grids, super-ensembles, and generative nudging to achieve state-of-the-art ensemble predictions that outperform baselines.
- Examining Fast Radiatively Driven Responses Using Machine-Learning Weather Emulators
  Historically trained ML weather emulators quantify fast precipitation changes from CO2 perturbations and produce results that agree with Earth System Models.
- Smoothing and spatial verification of global fields
  Two new global-domain smoothing methods enable spatial verification scores like FSS on high-resolution global precipitation forecasts while handling grid area variability and missing data.
- Rigorous uncertainty quantification of probabilistic AI weather forecasts with conformal prediction
  Online conformal prediction post-processing guarantees calibrated uncertainty coverage for GenCast, NeuralGCM, and AIFS-ENS forecasts of temperature and precipitation including extremes.
- Tyan-WP: A Wind Power Foundation Model for Ultra-Short-Term Probabilistic Forecasting
  Tyan-WP is a pretrained wind power foundation model that outperforms site-specific TSMs and generic LTSMs in zero-shot ultra-short-term probabilistic forecasting on U.S. and U.K. sites via static embeddings and PAMF module.
- Scalable Uncertainty Quantification for Extreme Weather Forecasting via Empirical Neural Tangent Kernels
  NTK-UQ produces 31-37% sharper 90% prediction intervals than split conformal prediction for extreme weather forecasts, with adaptive scaling via architecture-dependent eigenvalue truncation and ICA decomposition of last-layer features.
- Mechanism Learning: Prototype-Anchored Mechanism Inference for Scientific Forecasting
  Mechanism learning infers active local evolution rules via prototype-anchored descriptors to achieve more robust forecasting than direct state prediction on benchmarks like Burgers, WeatherBench2, and Lorenz96.
- Wavelet Flow Matching for Multi-Scale Physics Emulation
  Wavelet Flow Matching emulates multi-scale PDE-governed systems by transporting velocities directly in a hierarchical wavelet representation via U-Net, yielding improved long-horizon stability and spectral accuracy on fluid benchmarks.
- Multi-Quantile Regression for Extreme Precipitation Downscaling
  Q-SRDRN multi-quantile network with pinball loss and per-quantile heads detects extreme precipitation events up to 18 times more effectively than deterministic baselines while preserving augmentation benefits for the median.
- ShardTensor: Domain Parallelism for Scientific Machine Learning
  ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.
- AxiomOcean: Forecasting the Three-Dimensional Structure of the Upper Ocean
  AxiomOcean deploys a 3D encoder-backbone-decoder architecture that jointly predicts upper-ocean variables and outperforms prior AI models by 20-35% in day-1 RMSE while preserving eddy kinetic energy and vertical consistency.
- Extreme Weather Bench: A framework and benchmark for evaluation of high-impact weather
  Extreme Weather Bench supplies standardized case studies, observational data, impact metrics, and code to evaluate weather models on high-impact hazards.
- Earth System Foundation Model (ESFM): A unified framework for heterogeneous data integration and forecasting
  ESFM is a single open foundation model that unifies heterogeneous Earth data sources and forecasts missing regions while preserving inter-variable physical relationships.
- Physics-constrained generative machine learning-based high-resolution downscaling of Greenland's surface mass balance and surface temperature
  A physics-constrained consistency model downscales Greenland SMB and surface temperature by a factor of 32 while preserving coarse-scale sums and outperforming interpolation on test metrics.
- Towards Fair Comparisons of AI- and Physics-Based Weather Models for Extreme Events via the Weighted Potential CRPS
  Extends Potential CRPS with weights and IDR post-processing to enable fair comparisons of AIWP and NWP models on extreme weather, finding AI models more informative across most variables and thresholds.
- Instrumented data for causal scientific machine learning
  Instrumented data augments observations with mechanistic models, uncertainty, and counterfactuals to enable causal interventions via Pearl's do-operator in scientific machine learning.
- A Simulation Methodology Testbed for Typhoon Sensitivity Analysis: Framework Development and Perturbation-Response Experiments with the Pangu Weather Model
  A MATLAB/ONNX testbed integrates the Pangu AI model with PID closed-loop control to perform single-input single-output perturbation-response experiments on typhoon track and intensity.
- Mechanistic Interpretability Tool for AI Weather Models
  An open-source tool is developed for mechanistic interpretability of AI weather models, demonstrated on GraphCast by identifying latent directions corresponding to interpretable weather features.
- CycloneMAE: A Scalable Multi-Task Learning Model for Global Tropical Cyclone Probabilistic Forecasting
  CycloneMAE uses a TC structure-aware masked autoencoder with discrete probabilistic gridding and pre-train/fine-tune to deliver both deterministic and probabilistic forecasts, outperforming NWP systems in pressure and wind up to 120 hours and track up to 24 hours across five basins.
- Regimes of Scale in AI Meteorology
  AI/ML weather tools face integration challenges from mismatched 'regimes of scale' in how data and models are organized compared to traditional meteorology practices.
- Sampling Parallelism for Fast and Efficient Bayesian Learning
  Sampling parallelism distributes Bayesian sample evaluations across GPUs for near-perfect scaling, lower memory use, and faster convergence via per-GPU data augmentations, outperforming pure data parallelism in diversity.
- Performance Evaluation of GraphCast for Medium-Range Weather Forecasting over Brazil
  GraphCast shows regime-dependent skill versus ECMWF HRES in Brazil, underperforming on winter baroclinic systems in medium range but gaining in extended range and summer moisture transport.
- Accelerating Redshift-Conditioned Galaxy Image Synthesis with One-step Generative Modeling
  One-step pixel-MeanFlow models recover key galaxy morphology statistics at orders-of-magnitude lower computational cost than standard DDPM sampling while remaining weaker on fine-grained structure.
- Prediction of Drought and Flash Drought in Africa at the Seasonal-to-Subseasonal Scale using the Community Research Earth Digital Intelligence Twin Framework
  DroughtFormer predicts soil moisture, vegetation health, and related variables in Africa with skill out to 90 days that matches or exceeds climatology for most targets, but shows lower accuracy for precipitation and flash drought indices.
- Modelling convective cell occurrence in proximity to cold fronts using extreme gradient boosting
  An XGBoost model reproduces convective cell frequency near cold fronts with high skill but underestimates counts at the surface front, depending most on CAPE and time of day.
- Machine learning is revolutionizing weather forecasting -- the next step is a change in how we work
  Machine learning success in weather prediction will drive changes in development practices, data handling, verification, and service creation at weather centers.
- U-Cast: A Surprisingly Simple and Efficient Frontier Probabilistic AI Weather Forecaster
- A PMP-inspired Evaluation Framework for Assessing Deep-Learning Earth System Models