arxiv: 2605.24032 · v1 · pith:DFWQI46Unew · submitted 2026-05-21 · ⚛️ physics.ao-ph

Volador 1.0: A Data-Driven Air-Sea Full-Coupling Regional Forecast Model with Submesoscale-Permitting Based on MOE-Swin-Transformer Framework

Pith reviewed 2026-06-30 16:32 UTC · model grok-4.3

classification ⚛️ physics.ao-ph
keywords data-driven ocean forecasting · air-sea coupling · Swin-Transformer · submesoscale processes · Mixture-of-Experts · South China Sea · regional forecast model

The pith

Volador 1.0 produces 72-hour ocean forecasts in the South China Sea with errors at or below those of reanalysis products and ROMS while resolving submesoscale features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Volador 1.0, a data-driven regional model for air-sea coupled forecasting in the South China Sea. It combines a Swin-Transformer with Mixture-of-Experts, cross-grid bidirectional attention in latent space, and a fast-slow dual-branch design to handle full momentum and heat flux exchanges between atmosphere and ocean. Three-month hindcast and 15-day real-time tests show that its predictions of temperature and salinity in the upper 500 meters plus sea surface height match or improve on the accuracy of REDOS V2.0, GLORYS12, and the ROMS numerical model. The same runs reproduce the sub-to-mesoscale energy cascade expected from classical turbulence theory. These results indicate that transformer-based architectures can deliver accurate, fine-scale marine forecasts at speeds suitable for operational use.
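The cross-grid bidirectional attention mentioned above can be pictured with a minimal sketch. It assumes single-head attention with no learned projections, and the token values and the names `cross_attend` and `bidirectional_cross_attention` are hypothetical; the paper's actual layer is not described on this page and may differ substantially.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, context):
    """Each query token takes a softmax-weighted average of the context tokens.

    Assumption: the tokens themselves serve as queries, keys, and values
    (no learned projection matrices), purely to keep the sketch short.
    """
    dim = len(queries[0])
    updates = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, c)) / math.sqrt(dim) for c in context]
        weights = softmax(scores)
        updates.append([sum(w * c[j] for w, c in zip(weights, context)) for j in range(dim)])
    return updates

def bidirectional_cross_attention(ocean_tokens, atmos_tokens):
    """Update both token sets from each other, each with a residual connection."""
    ocean_update = cross_attend(ocean_tokens, atmos_tokens)
    atmos_update = cross_attend(atmos_tokens, ocean_tokens)
    ocean_new = [[x + u for x, u in zip(tok, upd)]
                 for tok, upd in zip(ocean_tokens, ocean_update)]
    atmos_new = [[x + u for x, u in zip(tok, upd)]
                 for tok, upd in zip(atmos_tokens, atmos_update)]
    return ocean_new, atmos_new

# Hypothetical latent tokens: three ocean-grid tokens, one atmosphere-grid token.
ocean = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
atmos = [[0.5, -0.5]]
ocean_new, atmos_new = bidirectional_cross_attention(ocean, atmos)
```

The two grids can hold different numbers of tokens, which is the point of attending across grids: each side keeps its own resolution and only exchanges information in latent space.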

Core claim

Volador 1.0 demonstrates that an MoE-Swin-Transformer framework with air-sea full-coupling, Cross-Grid Bidirectional Cross-Attention, and fast-slow dual-branch architecture can deliver 0-72h forecasts of temperature and salinity in the 0-500m layer and sea surface height whose RMSE or MAE is smaller than or at least comparable to those from REDOS V2.0, GLORYS12, and ROMS, while its energy spectrum captures the sub- to mesoscale cascade predicted by classical turbulence theory.
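For readers unfamiliar with the two error metrics named in the claim, the sketch below computes RMSE and MAE on a handful of made-up temperature values. The numbers are hypothetical and are not taken from the paper.

```python
import math

def rmse(forecast, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(forecast) != len(reference) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(f - r) ** 2 for f, r in zip(forecast, reference)]
    return math.sqrt(sum(squared) / len(squared))

def mae(forecast, reference):
    """Mean absolute error between two equal-length sequences."""
    if len(forecast) != len(reference) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - r) for f, r in zip(forecast, reference)) / len(forecast)

# Hypothetical temperatures in degrees Celsius (illustrative only).
observed_temp = [28.1, 27.9, 27.5, 26.8]
forecast_temp = [28.4, 27.7, 27.5, 27.2]

temp_rmse = rmse(forecast_temp, observed_temp)
temp_mae = mae(forecast_temp, observed_temp)
```

RMSE weights large misses more heavily than MAE, so RMSE is never smaller than MAE on the same data; a large gap between the two points to a few large errors rather than uniform drift.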

What carries the argument

Mixture-of-Experts Swin-Transformer with latent-space interaction via Cross-Grid Bidirectional Cross-Attention and fast-slow dual-branch architecture that incorporates air-sea full-coupling of momentum and heat fluxes.
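As a rough illustration of the Mixture-of-Experts idea, the sketch below routes one token to its top-scoring experts and blends their outputs. It assumes softmax gating over the top-k experts, a common choice; this page does not say which routing scheme Volador 1.0 uses, and `moe_forward`, the gate weights, and the toy experts are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, gate_weights, experts, top_k=2):
    """Route a token to its top_k experts and mix their outputs.

    gate_weights holds one row per expert; the gate score of an expert
    is the dot product of its row with the token.
    """
    scores = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    mix = softmax([scores[i] for i in chosen])  # renormalise over the chosen experts only
    output = [0.0] * len(token)
    for weight, idx in zip(mix, chosen):
        expert_out = experts[idx](token)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen

# Toy experts standing in for learned feed-forward blocks (assumption).
experts = [
    lambda x: [2.0 * v for v in x],  # expert 0: scale by 2
    lambda x: [-v for v in x],       # expert 1: negate
    lambda x: list(x),               # expert 2: identity
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
token = [1.0, 0.0]
output, chosen = moe_forward(token, gate_weights, experts, top_k=2)
```

Only the selected experts run for a given token, which is how such a layer can add capacity without a proportional increase in per-token compute.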

If this is right

  • Air-sea full-coupling measurably improves forecast skill relative to the non-coupled version of the same architecture.
  • The model reproduces submesoscale processes including internal waves in its forecasts.
  • Forecasts can be produced faster than with traditional numerical models at comparable accuracy.
  • The approach supports operational real-time use for disaster prevention in the tested region.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the architecture transfers, comparable models could be trained for other coastal basins using local reanalysis data.
  • Matching the expected turbulence energy spectrum suggests the model has learned scale-interaction physics rather than purely statistical correlations.
  • Wider adoption would lower the computational barrier to high-resolution, frequently updated marine forecasts.

Load-bearing premise

The three-month hindcast and 15-day real-time test results in the South China Sea will generalize beyond those specific conditions and periods.

What would settle it

Independent observations or model runs in a new time period or different coastal region where Volador 1.0 errors for temperature, salinity or height exceed those of reanalysis or ROMS by a clear margin.

Original abstract

A data-driven air-sea full-coupling regional forecast model with submesoscale-permitting, named "Volador 1.0", is developed for the South China Sea (SCS). The model features a Swin-Transformer framework integrated with a Mixture-of-Experts (MoE) system, a latent space interaction architecture based on Cross-Grid Bidirectional Cross-Attention, and a fast-slow dual-branch architecture. Both the three-month hindcast test and the 15-day operational real-time forecasting demonstrate that Volador 1.0 has a very encouraging and promising performance in 0-72h forecasting of temperature and salinity in the 0-500m upper ocean as well as the sea surface height with root-mean-square-error (RMSE) or mean absolute error (MAE) smaller than or at least comparable to those from the reanalysis datasets REDOS V2.0 and GLORYS12 and the state-of-the-art regional numerical model Regional Ocean Modeling System (ROMS). In particular, Volador 1.0 demonstrates its capability of capturing/forecasting submesoscale processes including internal waves, with an energy spectrum well representing sub- to mesoscale energy cascade as expected by the classical turbulence theory. Further analysis based on ablation experiments shows that the air-sea full-coupling framework, which takes into account the dynamic exchanges of momentum and heat fluxes between the atmosphere and the ocean, indeed helps improve the model's performance compared to the non-full-coupling one. Volador 1.0, though still subject to refinement in the coming future with a large space for improvement, blazes a path for an accurate, fine and fast marine environment forecasting, and thus could help promote our capability of disaster prevention and mitigation in the SCS as well as in other coastal regions where these innovative techniques can be applied.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents Volador 1.0, a data-driven regional ocean forecast model for the South China Sea based on a Mixture-of-Experts Swin-Transformer architecture with cross-grid bidirectional cross-attention and a fast-slow dual-branch design. It claims that 0-72 h forecasts of temperature and salinity (0-500 m) and sea-surface height achieve RMSE or MAE values smaller than or at least comparable to the reanalysis products REDOS V2.0 and GLORYS12 as well as the ROMS numerical model. The model is further asserted to capture submesoscale processes, including internal waves, with an energy spectrum that reproduces the expected sub- to mesoscale energy cascade from classical turbulence theory. Ablation experiments are said to demonstrate that the full air-sea coupling improves performance relative to a non-coupled variant.

Significance. If the performance and submesoscale claims are supported by rigorous, reproducible validation, the work would constitute a notable step toward operational data-driven regional forecasting at submesoscale-permitting resolution. Demonstrating that an MoE-Transformer with explicit air-sea flux coupling can match or exceed established reanalysis and numerical models in a dynamically complex basin would have direct implications for computational efficiency in marine forecasting and disaster mitigation applications.

major comments (3)
  1. [Abstract] Abstract: The central performance claims (RMSE/MAE smaller than or comparable to REDOS V2.0, GLORYS12, and ROMS) are stated without any numerical values, error bars, tables, or description of data splits, training procedures, or cross-validation strategy. This information is load-bearing for the claim that the model generalizes rather than overfits the 3-month hindcast + 15-day real-time SCS window.
  2. [Results] Results / ablation section: The statement that the air-sea full-coupling framework improves performance is presented without quantitative deltas, control experiments that isolate coupling from other architectural choices, or statistical tests. Without these, the attribution of gains specifically to momentum and heat flux exchanges cannot be evaluated.
  3. [Results] Submesoscale analysis: The claim that the energy spectrum 'well represents' the sub- to mesoscale cascade 'as expected by classical turbulence theory' lacks any reference to expected spectral slopes (e.g., k^{-5/3} or k^{-3}), wavenumber ranges, or direct comparison figures/tables against theory or observations. This is load-bearing for the submesoscale-permitting assertion.
minor comments (2)
  1. [Title/Abstract] The title uses 'MOE-Swin-Transformer' while the abstract expands 'Mixture-of-Experts (MoE)'; consistent acronym usage and expansion on first use would improve clarity.
  2. [Abstract] The abstract refers to both 'hindcast test' and 'operational real-time forecasting' without clarifying whether the 15-day period is truly out-of-sample or drawn from the same reanalysis used for training.
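One way to supply the statistical test requested in major comment 2 is a paired bootstrap on the RMSE difference between the coupled and non-coupled runs. The sketch below is an assumption about how such a test could be set up, not the authors' procedure; the error samples are synthetic and `paired_bootstrap_rmse_delta` is a hypothetical helper.

```python
import math
import random

def rmse_of_errors(errors):
    """RMSE computed directly from a list of forecast-minus-truth errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def paired_bootstrap_rmse_delta(err_coupled, err_uncoupled, n_resamples=1000, seed=0):
    """Return (point estimate, 2.5th percentile, 97.5th percentile) of
    RMSE(uncoupled) - RMSE(coupled), resampling verification cases in pairs."""
    rng = random.Random(seed)
    n = len(err_coupled)
    deltas = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # same cases for both models
        deltas.append(rmse_of_errors([err_uncoupled[i] for i in idx])
                      - rmse_of_errors([err_coupled[i] for i in idx]))
    deltas.sort()
    lower = deltas[int(0.025 * n_resamples)]
    upper = deltas[int(0.975 * n_resamples) - 1]
    point = rmse_of_errors(err_uncoupled) - rmse_of_errors(err_coupled)
    return point, lower, upper

# Synthetic errors (illustrative only): the uncoupled run shares the coupled
# run's error and adds an extra independent component.
data_rng = random.Random(42)
err_coupled = [data_rng.gauss(0.0, 0.3) for _ in range(200)]
err_uncoupled = [e + data_rng.gauss(0.0, 0.4) for e in err_coupled]

point, lower, upper = paired_bootstrap_rmse_delta(err_coupled, err_uncoupled)
```

If the interval from `lower` to `upper` excludes zero, the improvement is unlikely to be a sampling artifact. Resampling whole cases in pairs keeps the two models evaluated on identical conditions, which matters because forecast errors from the same day and location are strongly correlated.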

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments that highlight opportunities to strengthen the presentation of quantitative evidence. We address each major point below and will revise the manuscript accordingly.

point-by-point responses
  1. Referee: [Abstract] Abstract: The central performance claims (RMSE/MAE smaller than or comparable to REDOS V2.0, GLORYS12, and ROMS) are stated without any numerical values, error bars, tables, or description of data splits, training procedures, or cross-validation strategy. This information is load-bearing for the claim that the model generalizes rather than overfits the 3-month hindcast + 15-day real-time SCS window.

    Authors: We agree that the abstract would be strengthened by including specific numerical values. In the revised manuscript we will insert representative RMSE/MAE figures for temperature, salinity, and SSH over the 0-72 h horizon, together with a brief statement on the hindcast/real-time split and validation approach. Full details of data partitioning, training, and cross-validation remain in the Methods section, but the abstract will now supply the quantitative context needed to evaluate generalization. revision: yes

  2. Referee: [Results] Results / ablation section: The statement that the air-sea full-coupling framework improves performance is presented without quantitative deltas, control experiments that isolate coupling from other architectural choices, or statistical tests. Without these, the attribution of gains specifically to momentum and heat flux exchanges cannot be evaluated.

    Authors: We accept that quantitative deltas and clearer isolation of the coupling contribution are required. The revised ablation section will report explicit performance differences (RMSE/MAE reductions) between the full air-sea coupling configuration and the non-coupled control, describe how the control experiments hold other architectural elements fixed, and include statistical significance tests. These additions will allow direct evaluation of the role of momentum and heat flux exchanges. revision: yes

  3. Referee: [Results] Submesoscale analysis: The claim that the energy spectrum 'well represents' the sub- to mesoscale cascade 'as expected by classical turbulence theory' lacks any reference to expected spectral slopes (e.g., k^{-5/3} or k^{-3}), wavenumber ranges, or direct comparison figures/tables against theory or observations. This is load-bearing for the submesoscale-permitting assertion.

    Authors: We agree that explicit theoretical benchmarks and direct comparisons are necessary. The revised submesoscale section will cite the expected spectral slopes (k^{-5/3} in the inertial subrange and k^{-3} at larger scales), specify the wavenumber ranges corresponding to submesoscale and mesoscale regimes, and add figures or tables that overlay the model spectrum against both theoretical lines and available observational references. This will provide the rigorous support required for the submesoscale-permitting claim. revision: yes
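The spectral-slope comparison discussed in point 3 can be sketched as follows: build a periodic signal with a known power-law spectrum, compute its power spectrum, and fit the log-log slope. Everything here is synthetic and hypothetical (the helper names, the 256-point grid, the 64-wavenumber band); a real check would use model output along transects and an appropriate wavenumber range.

```python
import cmath
import math
import random

def synthetic_field(n, max_k, slope, seed=0):
    """Periodic 1-D signal whose power at wavenumber k scales as k**slope."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(max_k)]
    return [
        sum(k ** (slope / 2.0) * math.cos(2.0 * math.pi * k * i / n + phases[k - 1])
            for k in range(1, max_k + 1))
        for i in range(n)
    ]

def power_spectrum(signal, max_k):
    """Power at integer wavenumbers 1..max_k via a direct discrete Fourier sum."""
    n = len(signal)
    spectrum = []
    for k in range(1, max_k + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2 / n ** 2)
    return spectrum

def loglog_slope(wavenumbers, power):
    """Least-squares slope of log(power) against log(wavenumber)."""
    xs = [math.log(k) for k in wavenumbers]
    ys = [math.log(p) for p in power]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

ks = list(range(1, 65))
steep = loglog_slope(ks, power_spectrum(synthetic_field(256, 64, -3.0), 64))
shallow = loglog_slope(ks, power_spectrum(synthetic_field(256, 64, -5.0 / 3.0), 64))
```

On this idealized signal the fitted slopes recover the prescribed exponents. Model output will be noisier, so the fitted value depends on the chosen wavenumber band, and that band should be reported alongside the slope.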

Circularity Check

0 steps flagged

No significant circularity; performance claims rest on empirical test-set evaluation against external benchmarks

full rationale

The manuscript presents an ML architecture (MOE-Swin-Transformer with cross-grid attention and dual-branch design) and reports RMSE/MAE on a 3-month hindcast plus 15-day real-time test for SCS variables, plus energy-spectrum fidelity and ablation results for the coupling component. No equations, derivations, or first-principles results appear; the paper contains no self-definitional relations, fitted parameters renamed as predictions, or load-bearing self-citations. All quantitative claims are framed as direct comparisons to independent reanalysis products (REDOS V2.0, GLORYS12) and the ROMS numerical model, with no indication that test metrics were used in training or that any result reduces to its own inputs by construction. Generalization limits are a validity concern, not a circularity issue.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on free parameters, axioms, or invented entities; review limited to abstract only.

pith-pipeline@v0.9.1-grok · 5912 in / 1247 out tokens · 50237 ms · 2026-06-30T16:32:43.045452+00:00 · methodology

