arxiv: 2605.00860 · v1 · submitted 2026-04-21 · ⚛️ physics.ao-ph · cs.LG

An Adaptive Spatiotemporal Clustering Framework for 3D Ocean Subsurface Temperature Reconstruction

Hailiang Cheng, Jihong Guan, Ming Shan Loo, Wengen Li, Xudong Jiang, Yichao Zhang, Zhifei Zhang

Pith reviewed 2026-05-10 01:23 UTC · model grok-4.3

keywords: ocean subsurface temperature reconstruction · spatiotemporal clustering · deep learning · satellite remote sensing · 3D ocean fields · climate variability · sea surface data

The pith

An adaptive spatiotemporal clustering framework enables more accurate reconstruction of global ocean subsurface temperatures using only surface satellite data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an adaptive framework that applies spatiotemporal clustering to group ocean locations sharing similar vertical temperature structures and temporal patterns. This grouping is then used to train specialized deep learning models, including convolutional networks and vision transformers, for reconstructing three-dimensional subsurface temperature fields from surface measurements alone. The approach addresses the challenges of data scarcity and heterogeneity in ocean processes. Results from experiments show these enhanced models reduce root mean square errors by between 12.4 and 27.2 percent compared to standard versions. A reader would care because accurate subsurface temperature data supports better models of ocean circulation and climate change.

Core claim

The authors claim that incorporating an adaptive spatiotemporal clustering step into deep learning pipelines allows for the accurate global reconstruction of ocean subsurface temperature fields at depth using only sea surface temperature, salinity, height, and wind observations, with the clustering capturing the necessary vertical dependencies and temporal variations to achieve RMSE reductions of 12.4% to 27.2% across tested models such as DP-CNN, Attention U-Net, and ViT.

What carries the argument

The adaptive spatiotemporal clustering framework that partitions the global ocean into regions based on similar vertical structural dependencies and temporal variation patterns of subsurface temperature, enabling the application of region-specific deep learning models.

If this is right

The framework improves the generalization of deep learning models across global ocean scales.
Reconstructed temperature fields provide better inputs for meteorological modeling and climate change assessment.
Only surface observations are needed for full 3D reconstruction, reducing reliance on sparse subsurface measurements.
Multiple deep learning architectures benefit from the clustering approach.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the clustering proves robust, similar adaptive grouping could be applied to reconstruct other ocean variables like currents or nutrients from surface data.
The method may support higher-resolution reconstructions in data-sparse regions by leveraging pattern similarities.
Integration with real-time satellite data streams could enable dynamic updates to ocean temperature maps.

Load-bearing premise

The clustering step must reliably identify groups that share physical dependencies in temperature profiles and variations, rather than producing partitions that fail to improve model accuracy or generalization.

What would settle it

Running the deep learning models with and without the adaptive clustering on a new global dataset or in a different time period, and observing whether the reported RMSE improvements disappear or reverse, would test the central claim.
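
As a rough illustration of the comparison described above, the sketch below computes RMSE for a baseline model and a clustering-enhanced model against the same held-out targets, then reports the relative reduction. The function names and the sample temperatures are hypothetical and are not taken from the paper, whose evaluation protocol is not described in the abstract.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_reduction_percent(baseline_rmse, clustered_rmse):
    """Relative RMSE reduction in percent; a negative value means clustering hurt."""
    return 100.0 * (baseline_rmse - clustered_rmse) / baseline_rmse


# Hypothetical temperatures (deg C) at four held-out points; not from the paper.
observed = [20.0, 18.0, 15.0, 10.0]
baseline_pred = [21.0, 17.0, 16.0, 9.0]    # every absolute error is 1.0
clustered_pred = [20.5, 17.5, 15.5, 9.5]   # every absolute error is 0.5

base = rmse(baseline_pred, observed)
clus = rmse(clustered_pred, observed)
print(rmse_reduction_percent(base, clus))  # 50.0
```

If the reduction stays positive and of similar size on a new dataset or time period, the central claim survives the test; if it shrinks toward zero or turns negative, it does not.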

Figures

Figures reproduced from arXiv: 2605.00860 by Hailiang Cheng, Jihong Guan, Ming Shan Loo, Wengen Li, Xudong Jiang, Yichao Zhang, Zhifei Zhang.

Figure 2: Two study areas in the Indian Ocean and South China Sea, respectively.

Figure 3: Non-uniform vertical stratification of the subsurface temperature datasets. The …

Figure 4: Overview of the proposed framework which consists of two stages, i.e., spa…

Figure 5: Comparison of reconstruction errors before and after applying the proposed …

Figure 6: Vertical clustering results in the South China Sea. (a–b) Temperature section …

Figure 7: Temporal clustering results dividing the typical annual cycle into sub-periods …

Figure 8: Layer-wise RMSE under different clustering strategies in the South China Sea.

Figure 9: Vertical profiles of reconstructed temperature fields within the upper 1500 me…

Figure 10: Detailed comparison of reconstructed temperature fields within the upper 350 …

Figure 11: Absolute error distributions at depths of 50 m, 100 m, 150 m, and 200 m.

Original abstract

The reconstruction of ocean subsurface temperature (OST) using satellite remote sensing data holds significant scientific value for advancing the understanding of ocean dynamics and climate variability. However, the scarcity of subsurface observations, combined with the high degree of nonlinearity and spatiotemporal heterogeneity in subsurface processes, poses substantial challenges to the accuracy and generalization capability of traditional reconstruction methods. To address these limitations, this study proposes an adaptive framework that could capture both vertical structural dependencies and temporal variation patterns of OST via spatio-temporal clustering. By incorporating this framework with various deep learning models, e.g., dual-path convolutional neural networks (DP-CNN), Attention U-Net, and Vision Transformer (ViT), the OST field can be accurately reconstructed at a global scale only using surface observations, i.e., sea surface temperature (SST), sea surface salinity (SSS), sea surface height (SSH), and sea surface wind (SSW). Experimental results demonstrate that multiple deep learning methods using the proposed framework largely outperform their original counterparts, yielding improvements in RMSE ranging from 12.4% to 27.2%. This study provides a reliable solution for subsurface temperature reconstruction, offering important implications for meteorological modeling and climate change assessment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

According to the abstract, adaptive clustering helps deep learning models reconstruct ocean subsurface temperatures more accurately, but without details on the clustering process or validation, the gains are difficult to evaluate.

The punchline is that this work combines adaptive spatiotemporal clustering with several deep learning architectures to reconstruct global ocean subsurface temperature fields from surface measurements, claiming RMSE improvements of 12.4 to 27.2 percent over the baseline models. The paper takes on the challenge of nonlinear and heterogeneous subsurface processes by proposing clustering to group similar vertical and temporal patterns, then applies this to DP-CNN, Attention U-Net, and ViT. Showing consistent gains across these different models is a reasonable way to demonstrate that the framework adds value. The focus on using only SST, SSS, SSH, and SSW for global reconstruction is practical for areas where in-situ data are limited. The paper does well in identifying a clear application area and testing the idea on multiple networks rather than just one.

The soft spots center on the experimental validation. The abstract states the improvements but provides no information about the underlying data, how the adaptive clustering is implemented in practice, how the data are split for training and testing, or any measures of uncertainty. This makes it impossible to assess whether the results are robust. The stress-test note points to a potential issue with data leakage if the clustering step uses information from the full dataset, including periods or locations that should be held out. In spatiotemporal settings like this, that is a common pitfall, and if the paper does not explicitly avoid it by computing clusters only on training data, the generalization claims would be weakened. Since no equations or formal derivations are highlighted, the contribution is empirical, so the evidence needs to be solid.

This paper would appeal to researchers working on ocean remote sensing, climate data assimilation, and machine learning applications in geosciences. A reader interested in improving reconstruction accuracy for subsurface variables could find the framework adaptable to their own problems. It is worth sending to peer review. The core idea is straightforward enough that referees can provide targeted feedback on the methods and results sections, which appear to be the main areas needing more detail. Even with the current limitations in the abstract, the work has enough substance to benefit from external review rather than being rejected outright.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes an adaptive spatiotemporal clustering framework to capture vertical structural dependencies and temporal variation patterns of ocean subsurface temperature (OST). The framework is combined with deep learning models (DP-CNN, Attention U-Net, ViT) to reconstruct global 3D OST fields from surface observations (SST, SSS, SSH, SSW) only, with experimental results claiming RMSE improvements of 12.4% to 27.2% over the baseline models.

Significance. If the reported gains prove robust and free of leakage, the framework would offer a practical way to improve accuracy and generalization in subsurface ocean temperature reconstruction, directly supporting better ocean dynamics understanding and climate variability studies. The integration of clustering with multiple DL architectures is a strength that could be adopted more broadly if the validation is rigorous.

major comments (2)

[Results / Experimental validation] The central empirical claim (RMSE gains of 12.4–27.2%) is presented without any description of data sources, temporal/spatial train-test splits, cross-validation procedure, error bars, or statistical significance testing. This absence leaves the headline result unverifiable and directly undermines the generalization claims across global scales.
[Methods / Clustering framework] The adaptive clustering step (described in the methods) risks data leakage if cluster centroids or assignments are derived from the full observation record rather than training data alone. In spatiotemporal reconstruction, using test-period surface fields or subsurface truth to form clusters would provide indirect supervision unavailable at inference, invalidating the reported improvements in generalization.

minor comments (1)

[Abstract] The abstract uses tentative language ('could capture') that should be aligned with the strength of the reported results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments highlight important aspects of experimental rigor and methodological transparency that we will address in the revision. Below we respond point by point to the major comments.

Referee: [Results / Experimental validation] The central empirical claim (RMSE gains of 12.4–27.2%) is presented without any description of data sources, temporal/spatial train-test splits, cross-validation procedure, error bars, or statistical significance testing. This absence leaves the headline result unverifiable and directly undermines the generalization claims across global scales.

Authors: We agree that the initial manuscript did not provide sufficient detail on the experimental protocol, which is necessary for reproducibility and verification of the reported gains. In the revised version we will add a dedicated subsection in Methods describing: (i) exact data sources (EN4 subsurface temperature profiles, AVHRR SST, SMOS SSS, AVISO SSH, and ERA5 SSW, all interpolated to a common 1° grid); (ii) the temporal split (training 1993–2015, validation 2016–2018, test 2019–2020) together with spatial hold-out regions; (iii) the 5-fold temporal cross-validation procedure used to respect autocorrelation; (iv) error bars as standard deviation across the five folds; and (v) paired t-test p-values confirming statistical significance of the 12.4–27.2% RMSE reductions relative to the baseline models. These additions will make the headline results fully verifiable.
revision: yes
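
The year-based split named in item (ii) of the response can be sketched as follows, assuming each sample carries a calendar year. The function name `split_by_year` and the placeholder payloads are hypothetical. The spatial hold-out regions and the 5-fold procedure are not shown, since the response gives no detail on them.

```python
def split_by_year(samples, start=1993, train_end=2015, val_end=2018, test_end=2020):
    """Partition (year, payload) samples into contiguous train/val/test periods.

    Default boundaries follow the split stated in the response:
    training 1993-2015, validation 2016-2018, test 2019-2020.
    """
    parts = {"train": [], "val": [], "test": []}
    for year, payload in samples:
        if start <= year <= train_end:
            parts["train"].append((year, payload))
        elif train_end < year <= val_end:
            parts["val"].append((year, payload))
        elif val_end < year <= test_end:
            parts["test"].append((year, payload))
        # Samples outside start..test_end are dropped.
    return parts


# Hypothetical placeholder data: one labelled field per year.
samples = [(year, f"field_{year}") for year in range(1993, 2021)]
parts = split_by_year(samples)
print({name: len(items) for name, items in parts.items()})
# {'train': 23, 'val': 3, 'test': 2}
```

Keeping the three periods contiguous and non-overlapping is what prevents later years from informing a model that is evaluated on earlier ones.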
Referee: [Methods / Clustering framework] The adaptive clustering step (described in the methods) risks data leakage if cluster centroids or assignments are derived from the full observation record rather than training data alone. In spatiotemporal reconstruction, using test-period surface fields or subsurface truth to form clusters would provide indirect supervision unavailable at inference, invalidating the reported improvements in generalization.

Authors: We share the referee’s concern about data leakage in spatiotemporal settings. Our adaptive clustering is performed strictly on the training partition: cluster centroids are computed from training-set surface and (where available) subsurface fields only, and test samples are assigned to the nearest pre-computed training centroid using a distance metric that does not incorporate any test-period information. No subsurface truth from the test period is ever used. We will revise the Methods section to state this separation explicitly, add pseudocode that isolates the clustering step to the training phase, and include a short paragraph confirming that inference-time clustering uses only the fixed training centroids. This clarification will demonstrate that the reported generalization improvements remain valid.
revision: yes
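
The separation described in this response can be illustrated with a minimal sketch: centroids are fitted on training samples only, and test samples are then assigned to the nearest fixed centroid. Plain Lloyd's k-means stands in for the paper's adaptive clustering, which is not specified here. The feature vectors and all function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda k: squared_distance(point, centroids[k]))


def fit_centroids(train_points, k, iterations=20):
    """Lloyd's k-means on training points only; the first k points seed the centroids."""
    centroids = [list(p) for p in train_points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in train_points:
            groups[nearest_centroid(p, centroids)].append(p)
        for idx, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[idx] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Hypothetical surface features [SST in deg C, SSH in m]; not from the paper.
# Features are left unscaled for brevity, so SST dominates the distance.
train_features = [[28.0, 0.75], [12.0, 0.25], [27.0, 0.5], [13.0, 0.5]]
test_features = [[26.0, 0.5], [11.0, 0.25]]

# Centroids see training data only; test samples never update them.
centroids = fit_centroids(train_features, k=2)
test_labels = [nearest_centroid(p, centroids) for p in test_features]
print(centroids, test_labels)
```

The point of the design is that `fit_centroids` never receives test-period data, so cluster membership at inference depends only on quantities fixed during training.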

Circularity Check

0 steps flagged

No circularity: empirical results from clustering + DL pipeline

The paper describes an adaptive spatiotemporal clustering step followed by per-cluster training of DP-CNN, Attention U-Net, and ViT models, with performance reported as empirical RMSE gains on held-out data. No equations, first-principles derivations, or 'predictions' are presented that reduce by construction to fitted inputs or self-citations. The central claim rests on experimental comparisons rather than any self-definitional or fitted-input-called-prediction structure. Clustering details and data partitioning are methodological choices whose validity is external to any internal reduction; the reported improvements are framed as measured outcomes, not tautological outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach rests on the domain assumption that surface variables contain sufficient information for subsurface reconstruction and that clustering can isolate meaningful spatiotemporal regimes; no free parameters or invented entities are explicitly named in the abstract.

axioms (2)

domain assumption: Surface observations (SST, SSS, SSH, SSW) contain sufficient information to reconstruct subsurface temperature fields via learned mappings.
This is the core premise enabling the use of only satellite surface data for 3D reconstruction.
domain assumption: Spatio-temporal heterogeneity in OST can be effectively partitioned by adaptive clustering to improve model performance.
Invoked to justify the clustering framework as a solution to nonlinearity and heterogeneity.

pith-pipeline@v0.9.0 · 5528 in / 1393 out tokens · 29547 ms · 2026-05-10T01:23:28.617462+00:00 · methodology
