Reconstructing GRACE Terrestrial Water Storage with Spatio-Temporal Graph Neural Networks: An Application to South America
Pith reviewed 2026-06-26 09:02 UTC · model grok-4.3
The pith
A graph neural network reconstructs GRACE terrestrial water storage over South America back to 1940 from ERA5 data, reaching a basin-mean correlation of 0.94.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The MTGNN architecture, equipped with a static hybrid adjacency matrix that combines geodesic proximity and lagged correlations of climatic time series, reconstructs monthly GRACE-like TWSA from ERA5 precipitation, evapotranspiration and runoff. It attains a grid-cell Pearson correlation of 0.69, a basin-mean correlation of 0.94 and near-zero bias, and remains statistically competitive with GTWS-MLrec, RM-REC and GRAiCE at basin scale despite using roughly half to a tenth of their predictors.
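The gap between the grid-cell and basin-mean correlations is easier to read with a small example. The following is a minimal sketch on synthetic data, not the paper's evaluation code: the helper names, the equal cell weights, and the toy series are all assumptions made for illustration. It shows how errors that differ between cells can cancel in a basin average, so a basin-mean correlation can sit well above the mean grid-cell correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def basin_mean(field, weights):
    """Weighted mean over all cells for each month (weights are hypothetical)."""
    total = sum(weights[c] for c in field)
    n_months = len(next(iter(field.values())))
    return [sum(weights[c] * field[c][t] for c in field) / total
            for t in range(n_months)]

# Synthetic stand-ins: a shared seasonal signal plus cell-specific errors
# of opposite sign in the two reconstructed cells.
months = range(24)
signal = [math.sin(2 * math.pi * t / 12) for t in months]
noise = [0.8 * (-1) ** t for t in months]
obs = {"cell_a": signal, "cell_b": signal}
rec = {"cell_a": [s + e for s, e in zip(signal, noise)],
       "cell_b": [s - e for s, e in zip(signal, noise)]}
weights = {"cell_a": 1.0, "cell_b": 1.0}

# Mean of the per-cell correlations (grid-cell metric).
grid_cell_r = sum(pearson(obs[c], rec[c]) for c in obs) / len(obs)
# Correlation of the basin-averaged series (basin-mean metric).
basin_r = pearson(basin_mean(obs, weights), basin_mean(rec, weights))
# Mean difference between reconstruction and observation (bias).
bias = sum(sum(r - o for o, r in zip(obs[c], rec[c])) / len(obs[c])
           for c in obs) / len(obs)

print(grid_cell_r, basin_r, bias)
```

In this toy case the cell errors cancel exactly, which is an extreme situation chosen to make the effect visible; real reconstruction errors would cancel only partially.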
What carries the argument
Multi-variate time series graph neural network (MTGNN) equipped with a static hybrid adjacency matrix that merges geodesic proximity and lagged climatic correlations to capture local hydrological coupling and large-scale teleconnections.
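One way to picture such a hybrid adjacency matrix is sketched below. This is an illustrative construction under stated assumptions, not the authors' implementation: the Gaussian distance kernel, the blending weight `alpha`, the length scale `sigma_km`, the lag window `max_lag`, the sparsity `threshold`, and all function names are hypothetical choices. The geodesic term links nearby cells, while the lagged-correlation term can link distant cells whose climatic series co-vary with a delay.

```python
import math

def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance between (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def max_lagged_corr(x, y, max_lag):
    """Largest |r| between x[t] and y[t + lag] for lag in -max_lag..max_lag."""
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        best = max(best, abs(pearson(a, b)))
    return best

def hybrid_adjacency(coords, series, sigma_km=500.0, max_lag=3,
                     alpha=0.5, threshold=0.1):
    """Blend a distance kernel with lagged correlation (all parameters assumed)."""
    n = len(coords)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            geo = math.exp(-(haversine_km(coords[i], coords[j]) / sigma_km) ** 2)
            corr = max_lagged_corr(series[i], series[j], max_lag)
            w = alpha * geo + (1 - alpha) * corr
            adj[i][j] = w if w >= threshold else 0.0  # drop weak edges
    return adj

# Toy nodes: two neighbours and one distant node whose series lags node 0 by 2 months.
coords = [(-3.1, -60.0), (-3.5, -60.5), (-30.0, -55.0)]
t_axis = range(36)
series = [
    [math.sin(2 * math.pi * t / 12) for t in t_axis],
    [math.cos(2 * math.pi * t / 7) for t in t_axis],
    [math.sin(2 * math.pi * (t - 2) / 12) for t in t_axis],
]
A = hybrid_adjacency(coords, series)
```

Here the edge between nodes 0 and 1 comes mostly from proximity, and the edge between nodes 0 and 2 comes almost entirely from the lagged correlation. Because the matrix is computed once from coordinates and climatology, it stays fixed during training, which is what "static" refers to in the text.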
If this is right
- The reconstruction supplies an 80-year TWS record suitable for climate-scale studies of variability and human impacts.
- All compared reconstruction methods share characteristic performance weaknesses in arid regions.
- The model reproduces the spatial fingerprints of major ENSO events such as the 2015/16 El Niño and 2020/21 La Niña.
- High basin-scale accuracy is maintained even when the number of meteorological predictors is reduced by a factor of two to ten.
Where Pith is reading between the lines
- The same graph architecture could be retrained on global rather than South-American data to produce consistent long-term TWS fields worldwide.
- The reduced predictor count may lower computational cost enough to support ensemble reconstructions under multiple climate scenarios.
- Longer TWS series could be combined with other observational records to separate natural from anthropogenic contributions to storage trends.
- The hybrid adjacency construction offers a template for incorporating additional geophysical constraints such as topography or soil properties.
Load-bearing premise
The statistical mapping learned between ERA5 forcing and GRACE TWS in the 2002-present overlap period continues to hold for the 1940-2001 period without major non-stationarities in the hydrological system.
What would settle it
A substantial drop in correlation or emergence of large bias when the trained model is tested against independent pre-2002 TWS estimates or against GRACE observations withheld from a later validation window.
Original abstract
Terrestrial water storage (TWS) integrates snow, soil moisture, surface water, and groundwater and is a key indicator of how climate variability and human activity reshape the global water cycle. The GRACE and GRACE-FO satellite missions provide the only direct, globally consistent observations of TWS change, but their record begins only in 2002, which is too short for many climate-scale analyses. We present a deep learning application that reconstructs monthly GRACE-like TWS anomalies (TWSA) back to 1940 by learning the relationship between daily ERA5 meteorological forcing (precipitation, evapotranspiration, runoff) and monthly GRACE observations. In contrast to prior reconstruction approaches based on grid-cell-wise regression, CNNs, or LSTMs, we adapt a multi-variate time series graph neural network (MTGNN) architecture, which was originally developed for mobility and traffic forecasting on urban sensor networks, to this satellite-geodesy task. Spatial dependencies are encoded in a static, interpretable hybrid adjacency matrix that combines geodesic proximity with lagged correlations of climatic time series, capturing both local hydrological coupling and large-scale teleconnections. The reconstruction achieves a grid-cell Pearson correlation of 0.69, a basin-mean correlation of 0.94, and a near-zero bias, and it reproduces the spatial fingerprints of the 2015/16 El Niño and 2020/21 La Niña events. A systematic comparison with established reconstruction approaches (GTWS-MLrec, RM-REC, GRAiCE) shows that the graph-based model is statistically competitive at basin scale, reaching a correlation within 0.025 of the best baseline while using only roughly half to a tenth of the predictors the other models require, and reveals characteristic weaknesses in arid regions in all models. The complete implementation is publicly available at github.com/hcu-cml/MTGNN-TWS-Reconstruction-GRACE.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript adapts a multi-variate time series graph neural network (MTGNN) to reconstruct monthly GRACE-like terrestrial water storage anomalies (TWSA) over South America from daily ERA5 meteorological forcings (precipitation, evapotranspiration, runoff), using a static hybrid adjacency matrix that combines geodesic proximity and lagged climatic correlations. The model is trained on the 2002-present GRACE overlap and applied to produce a reconstruction back to 1940. Reported performance includes a grid-cell Pearson correlation of 0.69, basin-mean correlation of 0.94, near-zero bias, reproduction of 2015/16 El Niño and 2020/21 La Niña spatial fingerprints, and basin-scale competitiveness with GTWS-MLrec, RM-REC, and GRAiCE while using roughly half to one-tenth the predictors. The implementation is released publicly.
Significance. If the learned ERA5-to-TWSA mapping holds, the approach supplies a longer TWS record with substantially reduced predictor count and explicit public code, which is a clear strength for reproducibility. The basin-scale correlation and event reproduction are competitive, and the identification of shared weaknesses in arid regions across models is useful. Significance is limited by the absence of quantified uncertainty on the performance metrics and by the untested extrapolation assumption.
major comments (2)
- [Abstract] Abstract and comparison results: the statement that the model reaches a correlation 'within 0.025 of the best baseline' is presented without error bars, bootstrap intervals, or cross-validation statistics on the basin-scale metric, which is load-bearing for the competitiveness claim.
- [Reconstruction to 1940] Reconstruction to 1940 section: the central application of the trained MTGNN weights and hybrid adjacency to the 1940-2001 period rests on the assumption that P(TWSA | ERA5 forcings, graph) is stationary across the full interval, yet no regime-shift diagnostic, land-use proxy comparison, or pre-2002 validation is described.
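The first major comment above asks for bootstrap intervals on the basin-scale metric. A minimal sketch of a paired percentile bootstrap over basins follows. The per-basin correlations are made-up placeholder numbers, not values from the paper, and the function name, the number of resamples, and the choice to resample basins only (rather than basins and years) are assumptions.

```python
import random

def bootstrap_mean_diff_ci(model_r, baseline_r, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the mean paired difference across basins."""
    rng = random.Random(seed)
    diffs = [m - b for m, b in zip(model_r, baseline_r)]
    n = len(diffs)
    stats = []
    for _ in range(n_boot):
        # Resample basins with replacement, keeping model/baseline pairs together.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = max(lo_idx, int((1 + level) / 2 * n_boot) - 1)
    return sum(diffs) / n, stats[lo_idx], stats[hi_idx]

# Placeholder per-basin correlations (illustrative only, NOT from the paper).
model_r = [0.95, 0.93, 0.90, 0.96, 0.88, 0.94]
baseline_r = [0.96, 0.95, 0.91, 0.97, 0.92, 0.95]

mean_diff, ci_low, ci_high = bootstrap_mean_diff_ci(model_r, baseline_r)
print(mean_diff, ci_low, ci_high)
```

If the resulting interval contains zero, the difference between the two models would not be distinguishable at that level. With only a handful of basins the interval is coarse, so a percentile bootstrap of this kind is a rough guide rather than a precise test.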
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below, proposing revisions where they strengthen the work without misrepresenting the results.
- Referee: [Abstract] Abstract and comparison results: the statement that the model reaches a correlation 'within 0.025 of the best baseline' is presented without error bars, bootstrap intervals, or cross-validation statistics on the basin-scale metric, which is load-bearing for the competitiveness claim.
  Authors: We agree that uncertainty quantification would better support the basin-scale competitiveness claim. In the revised manuscript we will add bootstrap confidence intervals (resampling over basins and years) to the reported basin-mean correlations, allowing readers to assess whether the 0.025 difference is statistically distinguishable from zero.
  revision: yes
- Referee: [Reconstruction to 1940] Reconstruction to 1940 section: the central application of the trained MTGNN weights and hybrid adjacency to the 1940-2001 period rests on the assumption that P(TWSA | ERA5 forcings, graph) is stationary across the full interval, yet no regime-shift diagnostic, land-use proxy comparison, or pre-2002 validation is described.
  Authors: The referee correctly notes the stationarity assumption required for the 1940–2001 extrapolation. Direct pre-2002 validation against GRACE is impossible. We will add (i) a regime-shift analysis on the ERA5 forcing distributions (e.g., Kolmogorov–Smirnov tests on precipitation and evapotranspiration across 1940–2001 vs. 2002–present) and (ii) an expanded discussion of the assumption’s limitations, including the lack of land-use change proxies. These additions will be included in the revised text.
  revision: partial
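The Kolmogorov–Smirnov comparison proposed in the rebuttal can be sketched as follows. This is a hedged illustration in plain Python, not the authors' planned analysis: the function names are hypothetical, the two samples are synthetic integers standing in for forcing values from the two periods, and the critical value uses the standard large-sample approximation, which assumes independent samples. Monthly climate series are autocorrelated, so a test of this form would tend to overstate significance unless the effective sample size is accounted for.

```python
import math

def ks_statistic(a, b):
    """Two-sample KS statistic: max gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past all values <= x so ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)
    return c_alpha * math.sqrt((n + m) / (n * m))

# Synthetic stand-ins for a forcing variable in the two periods (not ERA5 data).
early_period = list(range(100))               # e.g. 1940-2001 sample
late_same = list(range(100))                  # identical distribution
late_shifted = [v + 50 for v in range(100)]   # shifted distribution

d_same = ks_statistic(early_period, late_same)
d_shifted = ks_statistic(early_period, late_shifted)
crit = ks_critical(len(early_period), len(late_shifted))

print(d_same, d_shifted, crit, d_shifted > crit)
```

A statistic above the critical value would flag a distributional shift in the forcing between the two periods. Note that such a test addresses only the inputs: it cannot show that the forcing-to-TWSA relationship itself stayed stable, which is the assumption the referee raises.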
Circularity Check
No significant circularity; derivation is self-contained
The paper trains an MTGNN on the 2002-present overlap between external ERA5 meteorological forcings and GRACE TWSA observations, then applies the learned mapping to pre-2002 ERA5 data. Reported metrics (grid-cell correlation 0.69, basin-mean 0.94) are measured against held-out GRACE data inside the overlap period and do not reduce to fitted parameters by construction. The hybrid adjacency matrix and MTGNN architecture are adapted from external traffic-forecasting literature with no load-bearing self-citations. The stationarity assumption for extrapolation is a correctness risk but is not a circularity issue per the evaluation rules.
Axiom & Free-Parameter Ledger
free parameters (1)
- MTGNN hyperparameters and weights
axioms (1)
- domain assumption: The relationship between daily ERA5 meteorological variables and monthly TWS anomalies is stationary across 1940-present.