Volador 1.0: A Data-Driven Air-Sea Full-Coupling Regional Forecast Model with Submesoscale-Permitting Based on MOE-Swin-Transformer Framework
Pith reviewed 2026-06-30 16:32 UTC · model grok-4.3
The pith
Volador 1.0 produces 72-hour ocean forecasts in the South China Sea with errors at or below those of reanalysis products and ROMS while resolving submesoscale features.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Volador 1.0 demonstrates that an MoE-Swin-Transformer framework with air-sea full-coupling, Cross-Grid Bidirectional Cross-Attention, and fast-slow dual-branch architecture can deliver 0-72h forecasts of temperature and salinity in the 0-500m layer and sea surface height whose RMSE or MAE is smaller than or at least comparable to those from REDOS V2.0, GLORYS12, and ROMS, while its energy spectrum captures the sub- to mesoscale cascade predicted by classical turbulence theory.
What carries the argument
Mixture-of-Experts Swin-Transformer with latent-space interaction via Cross-Grid Bidirectional Cross-Attention and fast-slow dual-branch architecture that incorporates air-sea full-coupling of momentum and heat fluxes.
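To make the Mixture-of-Experts idea concrete, here is a minimal sketch of top-k softmax routing in plain Python. It is a generic illustration, not Volador's implementation: this page gives no gating details, so the function names (`softmax`, `moe_forward`), the choice of k, the hand-supplied gate scores, and the toy experts are all assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_scores, experts, k=2):
    """Route input x to the k experts with the highest gate scores.

    gate_scores: one raw score per expert. In a trained model these would
    come from a learned gating layer; here they are supplied by hand.
    experts: list of callables, each mapping a float to a float.
    Returns the gate-weighted sum of the selected experts' outputs.
    """
    # Indices of the k largest gate scores.
    top = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:k]
    # Renormalise the gate weights over the selected experts only.
    weights = softmax([gate_scores[i] for i in top])
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy experts (hypothetical): each stands in for a small sub-network.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]
y = moe_forward(3.0, gate_scores=[0.0, 0.0, -5.0], experts=experts, k=2)
```

In this toy call the third expert has the lowest score and is never evaluated, which is the property that lets a Mixture-of-Experts model add capacity without running every expert on every input.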
If this is right
- Air-sea full-coupling measurably improves forecast skill relative to the non-coupled version of the same architecture.
- The model reproduces submesoscale processes including internal waves in its forecasts.
- Forecasts can be produced faster than with traditional numerical models at comparable accuracy.
- The approach supports operational real-time use for disaster prevention in the tested region.
Where Pith is reading between the lines
- If the architecture transfers, comparable models could be trained for other coastal basins using local reanalysis data.
- Matching the expected turbulence energy spectrum suggests the model has learned scale-interaction physics rather than purely statistical correlations.
- Wider adoption would lower the computational barrier to high-resolution, frequently updated marine forecasts.
Load-bearing premise
The three-month hindcast and 15-day real-time test results in the South China Sea will generalize beyond those specific conditions and periods.
What would settle it
Independent observations or model runs in a new time period or different coastal region where Volador 1.0 errors for temperature, salinity or height exceed those of reanalysis or ROMS by a clear margin.
Original abstract
A data-driven air-sea full-coupling regional forecast model with submesoscale-permitting, named "Volador 1.0", is developed for the South China Sea (SCS). The model features a Swin-Transformer framework integrated with a Mixture-of-Experts (MoE) system, a latent space interaction architecture based on Cross-Grid Bidirectional Cross-Attention, and a fast-slow dual-branch architecture. Both the three-month hindcast test and the 15-day operational real-time forecasting demonstrate that Volador 1.0 has a very encouraging and promising performance in 0-72h forecasting of temperature and salinity in the 0-500m upper ocean as well as the sea surface height with root-mean-square-error (RMSE) or mean absolute error (MAE) smaller than or at least comparable to those from the reanalysis datasets REDOS V2.0 and GLORYS12 and the state-of-the-art regional numerical model Regional Ocean Modeling System (ROMS). In particular, Volador 1.0 demonstrates its capability of capturing/forecasting submesoscale processes including internal waves, with an energy spectrum well representing sub- to mesoscale energy cascade as expected by the classical turbulence theory. Further analysis based on ablation experiments shows that the air-sea full-coupling framework, which takes into account the dynamic exchanges of momentum and heat fluxes between the atmosphere and the ocean, indeed helps improve the model's performance compared to the non-full-coupling one. Volador 1.0, though still subject to refinement in the coming future with a large space for improvement, blazes a path for an accurate, fine and fast marine environment forecasting, and thus could help promote our capability of disaster prevention and mitigation in the SCS as well as in other coastal regions where these innovative techniques can be applied.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Volador 1.0, a data-driven regional ocean forecast model for the South China Sea based on a Mixture-of-Experts Swin-Transformer architecture with cross-grid bidirectional cross-attention and a fast-slow dual-branch design. It claims that 0-72 h forecasts of temperature and salinity (0-500 m) and sea-surface height achieve RMSE or MAE values smaller than or at least comparable to the reanalysis products REDOS V2.0 and GLORYS12 as well as the ROMS numerical model. The model is further asserted to capture submesoscale processes, including internal waves, with an energy spectrum that reproduces the expected sub- to mesoscale energy cascade from classical turbulence theory. Ablation experiments are said to demonstrate that the full air-sea coupling improves performance relative to a non-coupled variant.
Significance. If the performance and submesoscale claims are supported by rigorous, reproducible validation, the work would constitute a notable step toward operational data-driven regional forecasting at submesoscale-permitting resolution. Demonstrating that an MoE-Transformer with explicit air-sea flux coupling can match or exceed established reanalysis and numerical models in a dynamically complex basin would have direct implications for computational efficiency in marine forecasting and disaster mitigation applications.
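The RMSE and MAE comparisons at the centre of this report use two standard error metrics. The sketch below shows how each is computed; the temperature values are made up for illustration and are not taken from the paper, and the function names are hypothetical.

```python
import math

def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))

def mae(pred, truth):
    """Mean absolute error between two equal-length sequences."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)

# Made-up temperature values in degrees Celsius, for illustration only.
observed = [28.0, 27.5, 26.0, 24.0]
forecast = [28.5, 27.5, 25.0, 24.5]

forecast_rmse = rmse(forecast, observed)
forecast_mae = mae(forecast, observed)
```

RMSE is never smaller than MAE on the same data, and it weights large errors more heavily, so a claim phrased as "RMSE or MAE" is easier to evaluate when both values are reported for every variable and lead time.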
major comments (3)
- [Abstract] Abstract: The central performance claims (RMSE/MAE smaller than or comparable to REDOS V2.0, GLORYS12, and ROMS) are stated without any numerical values, error bars, tables, or description of data splits, training procedures, or cross-validation strategy. This information is load-bearing for the claim that the model generalizes rather than overfits the 3-month hindcast + 15-day real-time SCS window.
- [Results] Results / ablation section: The statement that the air-sea full-coupling framework improves performance is presented without quantitative deltas, control experiments that isolate coupling from other architectural choices, or statistical tests. Without these, the attribution of gains specifically to momentum and heat flux exchanges cannot be evaluated.
- [Results] Submesoscale analysis: The claim that the energy spectrum 'well represents' the sub- to mesoscale cascade 'as expected by classical turbulence theory' lacks any reference to expected spectral slopes (e.g., k^{-5/3} or k^{-3}), wavenumber ranges, or direct comparison figures/tables against theory or observations. This is load-bearing for the submesoscale-permitting assertion.
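One way to supply the statistical test requested in the ablation comment is an exact paired sign-flip permutation test on per-case error differences. The sketch below assumes each forecast case yields one RMSE for the coupled model and one for the non-coupled control; the numbers are invented for illustration and the function name is hypothetical.

```python
from itertools import product

def paired_sign_flip_pvalue(errors_a, errors_b):
    """Exact one-sided paired permutation test.

    Null hypothesis: the sign of each paired difference (a - b) is arbitrary.
    Returns the fraction of all 2**n sign assignments whose mean difference
    is at least as negative as the observed one, so a small value favours
    'a has lower error than b'.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    observed = sum(diffs) / n
    count = 0
    for signs in product((1, -1), repeat=n):
        permuted = sum(s * d for s, d in zip(signs, diffs)) / n
        # Small tolerance guards against floating-point ties.
        if permuted <= observed + 1e-12:
            count += 1
    return count / 2 ** n

# Invented per-case RMSE values, for illustration only.
coupled = [0.41, 0.38, 0.45, 0.40, 0.37, 0.43]
uncoupled = [0.46, 0.40, 0.49, 0.44, 0.39, 0.47]
p_value = paired_sign_flip_pvalue(coupled, uncoupled)
```

Full enumeration costs 2**n evaluations, so it is only practical for a small number of cases; with more cases the same idea is usually applied by sampling sign assignments at random.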
minor comments (2)
- [Title/Abstract] The title uses 'MOE-Swin-Transformer' while the abstract expands 'Mixture-of-Experts (MoE)'; consistent acronym usage and expansion on first use would improve clarity.
- [Abstract] The abstract refers to both 'hindcast test' and 'operational real-time forecasting' without clarifying whether the 15-day period is truly out-of-sample or drawn from the same reanalysis used for training.
Simulated Author's Rebuttal
We thank the referee for the constructive comments that highlight opportunities to strengthen the presentation of quantitative evidence. We address each major point below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [Abstract] Abstract: The central performance claims (RMSE/MAE smaller than or comparable to REDOS V2.0, GLORYS12, and ROMS) are stated without any numerical values, error bars, tables, or description of data splits, training procedures, or cross-validation strategy. This information is load-bearing for the claim that the model generalizes rather than overfits the 3-month hindcast + 15-day real-time SCS window.
Authors: We agree that the abstract would be strengthened by including specific numerical values. In the revised manuscript we will insert representative RMSE/MAE figures for temperature, salinity, and SSH over the 0-72 h horizon, together with a brief statement on the hindcast/real-time split and validation approach. Full details of data partitioning, training, and cross-validation remain in the Methods section, but the abstract will now supply the quantitative context needed to evaluate generalization. revision: yes
- Referee: [Results] Results / ablation section: The statement that the air-sea full-coupling framework improves performance is presented without quantitative deltas, control experiments that isolate coupling from other architectural choices, or statistical tests. Without these, the attribution of gains specifically to momentum and heat flux exchanges cannot be evaluated.
Authors: We accept that quantitative deltas and clearer isolation of the coupling contribution are required. The revised ablation section will report explicit performance differences (RMSE/MAE reductions) between the full air-sea coupling configuration and the non-coupled control, describe how the control experiments hold other architectural elements fixed, and include statistical significance tests. These additions will allow direct evaluation of the role of momentum and heat flux exchanges. revision: yes
- Referee: [Results] Submesoscale analysis: The claim that the energy spectrum 'well represents' the sub- to mesoscale cascade 'as expected by classical turbulence theory' lacks any reference to expected spectral slopes (e.g., k^{-5/3} or k^{-3}), wavenumber ranges, or direct comparison figures/tables against theory or observations. This is load-bearing for the submesoscale-permitting assertion.
Authors: We agree that explicit theoretical benchmarks and direct comparisons are necessary. The revised submesoscale section will cite the expected spectral slopes (k^{-5/3} in the inertial subrange and k^{-3} at larger scales), specify the wavenumber ranges corresponding to submesoscale and mesoscale regimes, and add figures or tables that overlay the model spectrum against both theoretical lines and available observational references. This will provide the rigorous support required for the submesoscale-permitting claim. revision: yes
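The slope comparison promised in this response can be illustrated with a least-squares fit in log-log space. The sketch assumes a one-dimensional spectrum E(k) has already been computed and restricted to a single wavenumber band; the synthetic spectrum and the function name are hypothetical and stand in for model output.

```python
import math

def fit_spectral_slope(wavenumbers, energy):
    """Least-squares slope of log(E) against log(k)."""
    xs = [math.log(k) for k in wavenumbers]
    ys = [math.log(e) for e in energy]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var = sum((x - x_mean) ** 2 for x in xs)
    return cov / var

# Synthetic spectrum following E(k) = 2 * k**-3 exactly (illustration only).
k_values = [0.01 * i for i in range(1, 51)]
e_values = [2.0 * k ** -3 for k in k_values]
slope = fit_spectral_slope(k_values, e_values)
```

A fitted slope close to -3 or -5/3 within a stated band is consistent with the corresponding cascade but does not by itself establish it, which is why the referee also asks for wavenumber ranges and comparison against observations.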
Circularity Check
No significant circularity; performance claims rest on empirical test-set evaluation against external benchmarks
full rationale
The manuscript presents an ML architecture (MOE-Swin-Transformer with cross-grid attention and dual-branch design) and reports RMSE/MAE on a 3-month hindcast plus 15-day real-time test for SCS variables, plus energy-spectrum fidelity and ablation results for the coupling component. No equations, derivations, or first-principles results appear; the paper contains no self-definitional relations, fitted parameters renamed as predictions, or load-bearing self-citations. All quantitative claims are framed as direct comparisons to independent reanalysis products (REDOS V2.0, GLORYS12) and the ROMS numerical model, with no indication that test metrics were used in training or that any result reduces to its own inputs by construction. Generalization limits are a validity concern, not a circularity issue.