
Global high-resolution total water storage anomalies from self-supervised data assimilation using deep learning algorithms
J. Gou and B. Soja
Junyang Gou and Benedikt Soja present a self-supervised data assimilation model that estimates global total water storage anomalies (TWSAs) at high spatial resolution from satellite gravimetry and hydrological model data. The approach supports local natural hazard monitoring and helps reveal how human activities influence the dynamics of the water cycle.
Introduction
Monitoring variations in the global water cycle is crucial for understanding Earth’s climate system across time scales, from long-term trends like ice-sheet melting and freshwater availability to short-term extremes such as floods and droughts. Total water storage (TWS) is an essential climate variable that has traditionally been provided by global hydrological and land surface models, which offer high spatial detail and short-term variability but struggle with reliable long-term trends. Since 2002, GRACE and GRACE-FO have uniquely observed global TWS anomalies (TWSAs) with high accuracy and coverage but at coarse effective spatial resolution (~3°), limiting applications in small catchments. The coarse resolution stems from orbital/instrument limits and required postprocessing that attenuates high-frequency signals. Incorporating higher-resolution information from hydrological models and other measurements is necessary to downscale GRACE TWSAs. Existing data assimilation and downscaling approaches have shown regional success; globally generalizable methods at 0.5° exist but face trade-offs such as insufficient intrabasin variability or interbasin mass conservation. Recent deep learning approaches have been applied locally, typically requiring synthetic training pairs or domain assumptions due to the lack of high-resolution ground truth. This study introduces a globally generalizable, self-supervised deep learning data assimilation framework that combines GRACE(-FO) and WaterGAP Hydrology Model (WGHM) TWSAs with hydrological forcings to produce 0.5° global TWSAs. A novel loss function aligns patch-averaged predictions with GRACE while enforcing structural similarity to WGHM, removing the need for synthetic labels or cross-domain assumptions. The approach aims to preserve high-resolution structures and large-scale mass accuracy, improving water balance closure in basins smaller than the GRACE-effective resolution and enabling local hazard monitoring.
Literature Review
Prior hydrological modelling provides fine spatial detail and short-term variability but underestimates or misrepresents long-term trends in TWS due to limited representation of climate and anthropogenic impacts. GRACE/GRACE-FO mission products offer accurate global TWSA observations but at coarse spatial resolution, constrained by orbital/instrument design and filtering-induced signal attenuation. Downscaling and data assimilation methods have incorporated higher-resolution hydrological information (e.g., WGHM, precipitation) with techniques such as partial least squares regression and ensemble Kalman filters; global products at 0.5° exist but show deficiencies (e.g., intrabasin variability preservation, mass conservation). Deep learning and classical ML methods have been explored for GRACE downscaling, often relying on synthetic training pairs by blurring high-resolution simulations or using downsampled pairs, assuming cross-resolution or simulation–measurement relationships hold. These methods have largely been applied regionally due to generalization challenges. There is a need for globally generalizable approaches that avoid strong assumptions and can self-supervise using available observations and simulations.
Methodology
Data sources and study domain: The product covers global land (except Greenland and Antarctica) from April 2002 to December 2019 at 0.5°. GRACE/GRACE-FO JPL mascon solutions (RL06, without land grid scaling gain factors) are used, with processing including C20 replacement, GIA correction, and removal of ocean/atmosphere/land ice effects; the temporal mean over the 2004.0–2009.999 baseline is removed to form TWSA in equivalent water height (EWH). WGHM v2.2d provides monthly TWS at 0.5°, converted to TWSA by removing the mean over the same baseline; WGHM does not assimilate GRACE and is therefore independent, but it is noisier and less accurate in magnitude. Additional features come from GLDAS Noah L4 v2.1 (0.25° monthly, downsampled to 0.5° by averaging): precipitation (P), evapotranspiration (ET), and runoff split into storm surface runoff, baseflow-groundwater runoff, and snowmelt runoff. Basin boundaries from HydroBasins (levels 1, 3, 4) support evaluation.
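The two preprocessing steps named above (averaging 0.25° fields to 0.5° and removing a baseline mean) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `downsample_2x2` and `to_anomaly` are hypothetical, grids are plain nested lists, and missing values are assumed to have been handled beforehand.

```python
def downsample_2x2(grid):
    """Average non-overlapping 2x2 blocks, e.g. a 0.25-degree grid to 0.5 degrees."""
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even")
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def to_anomaly(series, times, start=2004.0, end=2010.0):
    """Subtract the mean over the baseline window [start, end) from a time series.

    `times` are decimal years; the half-open window stands in for 2004.0-2009.999.
    """
    baseline = [v for v, t in zip(series, times) if start <= t < end]
    if not baseline:
        raise ValueError("no samples fall inside the baseline window")
    mean = sum(baseline) / len(baseline)
    return [v - mean for v in series]


if __name__ == "__main__":
    print(downsample_2x2([[1, 2, 3, 4], [3, 4, 5, 6]]))
    print(to_anomaly([10, 20, 30, 40], [2003.5, 2004.5, 2009.5, 2010.5]))
```

Applying the same baseline window to both GRACE and WGHM keeps the two anomaly series referenced to a common zero level.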
Feature selection and preprocessing: Nine input channels per 0.5° pixel: GRACE TWSA, WGHM TWSA, P, ET, three runoff components (storm surface, baseflow-groundwater, snowmelt), latitude, longitude. Features are normalized using the 0.01th and 99.99th percentiles to reduce outlier influence. The land domain is split into overlapping 16° × 16° patches (32 × 32 pixels at 0.5°). Ocean pixels within patches are filled with the patch-average of valid land pixels. Patches are processed as 9-channel images.
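A minimal sketch of the percentile-based scaling and the ocean fill described above. The summary does not state the target range of the normalization or the percentile interpolation rule, so mapping the percentile range onto [0, 1] without clipping and using linear interpolation are assumptions here; all function names are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated percentile for q in [0, 100]."""
    data = sorted(values)
    if not data:
        raise ValueError("empty input")
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def robust_scale(values, q_low=0.01, q_high=99.99):
    """Map the [q_low, q_high] percentile range onto [0, 1] (assumed convention)."""
    p_lo, p_hi = percentile(values, q_low), percentile(values, q_high)
    if p_hi == p_lo:
        raise ValueError("degenerate percentile range")
    return [(v - p_lo) / (p_hi - p_lo) for v in values]


def fill_ocean(patch, land_mask):
    """Replace non-land pixels with the mean of the patch's valid land pixels."""
    land = [v for row, mrow in zip(patch, land_mask) for v, m in zip(row, mrow) if m]
    if not land:
        raise ValueError("patch contains no land pixels")
    mean = sum(land) / len(land)
    return [
        [v if m else mean for v, m in zip(row, mrow)]
        for row, mrow in zip(patch, land_mask)
    ]


if __name__ == "__main__":
    print(robust_scale([0.0, 5.0, 10.0], 0, 100))
    print(fill_ocean([[1.0, 0.0], [3.0, 0.0]], [[True, False], [True, False]]))
```

Scaling by extreme percentiles rather than by the minimum and maximum means a handful of outlying pixels cannot compress the range used by the rest of the data.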
Model architecture: A convolutional encoder–decoder with residual learning and batch normalization is used. Encoder: three Conv2D layers (kernel size 3, stride 2; channel depths 16, 32, 64), each followed by ReLU and residual blocks, progressively reducing spatial size and increasing the receptive field. Decoder: bilinear Upsampling2D followed by Conv2D (kernel size 3, stride 1; channel depths 64, 32, 16), ReLU, and residual blocks, reconstructing full resolution. A final 1×1 Conv2D maps features to TWSA outputs (32×32). Residual connections facilitate learning and robustness; batch normalization aids optimization stability.
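To make the spatial bookkeeping of this encoder–decoder concrete, the sketch below traces the feature-map size and channel depth through each stage. It is not a model definition: it assumes "same" padding (so a stride-2 convolution halves the size), an upsampling factor of 2, and a single output channel, none of which the summary states explicitly. Residual blocks are omitted because they leave shapes unchanged.

```python
import math


def trace_shapes(size=32, in_channels=9, depths=(16, 32, 64)):
    """Return (stage, spatial_size, channels) for each stage of the network."""
    shapes = [("input", size, in_channels)]
    h = size
    # Encoder: each stride-2 convolution halves the spatial size (assumed 'same' padding).
    for d in depths:
        h = math.ceil(h / 2)
        shapes.append((f"enc_conv{d}", h, d))
    # Decoder: each 2x upsampling doubles the size; the stride-1 convolution keeps it.
    for d in reversed(depths):
        h *= 2
        shapes.append((f"dec_up_conv{d}", h, d))
    # Final 1x1 convolution maps to one TWSA value per pixel (assumed single channel).
    shapes.append(("output_1x1", h, 1))
    return shapes


if __name__ == "__main__":
    for stage, hw, channels in trace_shapes():
        print(f"{stage:>14}: {hw} x {hw} x {channels}")
```

Under these assumptions the 32 × 32 input shrinks to a 4 × 4 bottleneck with 64 channels before being restored to 32 × 32.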
Self-supervised loss function: The objective balances two goals: (1) match GRACE values at scales larger than the GRACE-effective resolution by comparing patch-averaged outputs to patch-averaged GRACE TWSAs using an absolute error term AE_c; (2) preserve high-resolution structure by maximizing the Pearson correlation R between outputs and WGHM patches and minimizing their mean absolute error MAE_w. The batch loss is L = mean over patches of [AE_c + (1 − R) × MAE_w]. Because MAE_w is weighted by (1 − R), its contribution shrinks as the output becomes more strongly correlated with WGHM. L1 metrics (AE, MAE) are used for robustness to outliers. This design removes the need for synthetic labels and explicit cross-domain assumptions, enabling self-supervised optimization using the inputs themselves (GRACE and WGHM).
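The loss formula can be written out directly. The sketch below evaluates it on flattened patches in plain Python for readability. It is an illustration of the stated formula rather than the authors' TensorFlow implementation, and returning R = 0 for a constant patch is an assumption made only to avoid division by zero.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant patch is treated as uncorrelated
    return cov / math.sqrt(vx * vy)


def patch_loss(pred, grace, wghm):
    """AE_c + (1 - R) * MAE_w for one flattened patch."""
    n = len(pred)
    ae_c = abs(sum(pred) / n - sum(grace) / n)           # patch-average vs GRACE
    mae_w = sum(abs(p - w) for p, w in zip(pred, wghm)) / n  # pixel-wise vs WGHM
    return ae_c + (1.0 - pearson(pred, wghm)) * mae_w


def batch_loss(patches):
    """Mean of the patch losses; `patches` holds (pred, grace, wghm) triples."""
    return sum(patch_loss(*p) for p in patches) / len(patches)


if __name__ == "__main__":
    wghm = [1.0, 2.0, 3.0, 4.0]
    print(patch_loss(wghm, [2.5] * 4, wghm))                  # perfect match
    print(patch_loss([1.0, 2.0, 3.0, 4.0], [3.5] * 4, wghm[::-1]))
```

In the second call the prediction is perfectly anti-correlated with WGHM, so the MAE term is doubled, which shows how the (1 − R) factor penalizes structural disagreement.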
Training: Implemented in TensorFlow 2.6, optimized with Adam (default parameters), batch size 512. Initial experiments on four basins with random month splits indicated convergence in 120–150 epochs and no overfitting; the final global model is trained using all 180 months (Apr 2002–Dec 2019) for uniform quality. Training time is ~3 days on a consumer GPU (NVIDIA RTX 3080 Ti).
Uncertainty estimation (deep ensembles + Monte Carlo): Five independently initialized models are trained (deep ensemble). For each model, 20 Monte Carlo samples of GRACE inputs are drawn based on GRACE uncertainties to propagate input uncertainty. Ensemble mean and total predictive uncertainty are computed from model means and variances aggregated over the ensemble and MC runs. Uncertainties of WGHM and GLDAS inputs are not available and thus not included, leading to potential underestimation.
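One common way to combine a deep ensemble with Monte Carlo input sampling is the law of total variance: average the within-model variances and add the variance of the model means. The summary does not give the exact aggregation formula, so the sketch below is an assumed variant for a single pixel and month, with a hypothetical function name.

```python
import math


def ensemble_stats(runs):
    """Combine predictions for one pixel.

    runs[m][k] is the prediction of ensemble member m for Monte Carlo sample k.
    Returns (ensemble mean, total standard deviation).
    """
    means = [sum(r) / len(r) for r in runs]
    variances = [
        sum((v - mu) ** 2 for v in r) / len(r) for r, mu in zip(runs, means)
    ]
    ens_mean = sum(means) / len(means)
    within = sum(variances) / len(variances)                        # input (MC) spread
    between = sum((mu - ens_mean) ** 2 for mu in means) / len(means)  # model spread
    return ens_mean, math.sqrt(within + between)


if __name__ == "__main__":
    # Two ensemble members with two Monte Carlo samples each (toy numbers).
    print(ensemble_stats([[1.0, 3.0], [5.0, 7.0]]))
```

In the setup described above, `runs` would hold 5 members with 20 samples each. Because only GRACE inputs are perturbed, the within-model term reflects GRACE uncertainty alone.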
Evaluation protocols: High-resolution structural fidelity is assessed via pixel-wise Pearson correlation between downscaled and WGHM TWSAs globally. Basin-scale magnitude accuracy is assessed by basin-averaged RMSE between downscaled and GRACE TWSAs over 288 basins. Temporal components (trend, annual, semi-annual) are estimated for 160 basins larger than 200,000 km²; correlations with GRACE-derived signals assess fidelity. Water balance closure beyond GRACE resolution is evaluated by comparing TWS change (TWSC) from downscaled TWSAs to ERA5-Land water budget components (P, ET, R) via Nash–Sutcliffe efficiency (NSE) in HydroBasins level-4 basins. GRACE and downscaled TWSAs are interpolated to mid-month using PCHIP; TWSC is computed by centered finite differences, and budget components are smoothed with a 3-point filter. Environmental indices (flooding potential index, FPI; drought severity index, DSI) are derived from TWSA and ERA5-Land precipitation to demonstrate downstream applicability.
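The water balance check can be sketched with three small helpers: a centered finite difference for TWSC, a 3-point smoother, and the Nash–Sutcliffe efficiency. Several details are assumptions because the summary does not specify them: the equal smoothing weights, treating the budget-derived TWSC (P − ET − R) as the reference series, and the toy numbers. PCHIP interpolation is omitted.

```python
def centered_difference(values, dt=1.0):
    """dS/dt at interior points: (S[i+1] - S[i-1]) / (2 * dt)."""
    return [(values[i + 1] - values[i - 1]) / (2.0 * dt) for i in range(1, len(values) - 1)]


def smooth3(values, weights=(1 / 3, 1 / 3, 1 / 3)):
    """3-point weighted filter on interior points (weights are an assumption)."""
    a, b, c = weights
    return [
        a * values[i - 1] + b * values[i] + c * values[i + 1]
        for i in range(1, len(values) - 1)
    ]


def nse(simulated, reference):
    """Nash-Sutcliffe efficiency: 1 - sum((s - r)^2) / sum((r - mean(r))^2)."""
    mean_ref = sum(reference) / len(reference)
    denom = sum((r - mean_ref) ** 2 for r in reference)
    if denom == 0:
        raise ValueError("reference series has zero variance")
    num = sum((s - r) ** 2 for s, r in zip(simulated, reference))
    return 1.0 - num / denom


if __name__ == "__main__":
    twsa = [0.0, 1.0, 4.0, 9.0, 16.0]                # toy monthly TWSA
    twsc = centered_difference(twsa)                  # TWS change at interior months
    p, et, r = [5.0, 7.0, 9.0], [2.0, 2.0, 2.0], [1.0, 1.0, 1.0]
    budget = [pi - ei - ri for pi, ei, ri in zip(p, et, r)]  # P - ET - R
    print(twsc, budget, nse(twsc, budget))
```

An NSE of 1 means the two series agree exactly, 0 means the simulated series performs no better than the reference mean, and negative values mean it performs worse.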
Key Findings
- Produced a global 0.5° TWSA product (Apr 2002–Dec 2019) covering all land except Greenland and Antarctica, with global median uncertainty of 7.3 mm EWH. Outputs are visually smoother than WGHM, reducing outliers while retaining river system details.
- Pixel-wise structural fidelity: Median Pearson correlation between downscaled and WGHM TWSAs is 0.80 globally (a 51% improvement over GRACE’s 0.53), with lower correlations mainly in arid regions due to weak hydrological signals.
- Basin-scale magnitude accuracy: Basin-averaged RMSE vs GRACE is 21.9 mm (area-weighted) across 288 basins, comparable to typical GRACE uncertainties (20–30 mm) and a ~56% improvement over WGHM (weighted RMSE 49.2 mm). Higher RMSEs occur in glaciated regions (e.g., Alaska) due to glacier modelling limitations and leakage.
- Temporal components (160 basins >200,000 km²): Trend correlation with GRACE improves from 0.47 (WGHM) to 0.94 (downscaled). Annual and semi-annual amplitude correlations improve from 0.83 (WGHM) to 0.97 and 0.95, respectively. The annual phase of the downscaled product agrees closely with GRACE, correcting WGHM’s 2–3 month phase shifts in many basins (except weak-signal regions in North Africa).
- Water balance closure: Positive NSE in 83% of studied level-4 basins for downscaled TWSC vs budget-derived TWSC, compared to 77% for GRACE and 75% for WGHM. NSE improvement vs GRACE increases with decreasing basin size: +0.13 for >200,000 km², +0.21 for 63,000–200,000 km², and +1.21 for <63,000 km². Improvements vs WGHM are size-insensitive, reflecting better magnitude calibration.
- Applications: Downscaled trends highlight localized groundwater depletion hotspots (e.g., High Plains aquifer, Mississippi embayment, California’s Central Valley) not evident in GRACE due to spatial averaging. Derived FPI and DSI show improved realism vs WGHM by suppressing outliers, aligning better with GRACE while providing higher spatial detail for local hazard monitoring.
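The percentage improvements quoted in the findings above are relative changes, which can be reproduced from the rounded values given in the text. The helper name is hypothetical, and small differences from the quoted figures are expected because the inputs here are already rounded.

```python
def relative_change_pct(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / abs(old)


if __name__ == "__main__":
    # Median pixel-wise correlation with WGHM: 0.53 (GRACE) -> 0.80 (downscaled).
    corr_gain = relative_change_pct(0.80, 0.53)
    # Area-weighted basin RMSE vs GRACE: 49.2 mm (WGHM) -> 21.9 mm (downscaled).
    rmse_reduction = -relative_change_pct(21.9, 49.2)
    print(f"correlation gain: {corr_gain:.1f}%")
    print(f"RMSE reduction:   {rmse_reduction:.1f}%")
```

With these rounded inputs the correlation gain is about 51% and the RMSE reduction about 55.5%, consistent with the quoted ~56% up to rounding.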
Discussion
The study addresses the fundamental challenge of GRACE’s coarse spatial resolution by assimilating high-resolution hydrological model structure (WGHM) with GRACE’s large-scale magnitude accuracy via a self-supervised CNN. The loss function enforces agreement with GRACE at patch-averaged scales while preserving WGHM’s spatial patterns, eliminating the need for synthetic training targets and improving global generalization. The resulting product substantially enhances structural detail and basin-wise accuracy, improves trend and seasonal signal fidelity, and enables better water balance closure in basins below the GRACE-effective resolution. This directly supports hydrologic and climate analyses at local to regional scales, including monitoring anthropogenic impacts and natural hazards. Comparisons with GRACE and WGHM demonstrate that the approach effectively calibrates magnitudes and denoises high-resolution features, while retaining temporal characteristics present in observations. Derived indices (FPI, DSI) benefit from reduced outliers and realistic spatial patterns, enhancing operational utility for flood and drought risk assessment. Some regional limitations persist, notably in glaciated and arid areas where model deficiencies or weak signals constrain performance; these reflect trade-offs between global generalizability and local specialization and suggest targeted model or data enhancements.
Conclusion
This work delivers a global 0.5° TWSA product (2002–2019) using a novel self-supervised deep learning data assimilation framework that combines GRACE(-FO) observations with WGHM simulations and hydrological forcings. The method preserves high-resolution spatial structure while maintaining basin-scale mass accuracy, yielding strong improvements in trend/seasonal fidelity and water balance closure beyond GRACE’s effective resolution. The product enables refined analyses of climate variability, groundwater depletion, and local hazard monitoring through derived indices (FPI, DSI). The approach is computationally efficient (global training in ~3 days on a consumer GPU) and amenable to timely updates (online learning within ~1 hour), supporting operational delivery. Future work should enhance glacier representation, incorporate detailed human intervention data (e.g., population, irrigated area, water use), and integrate additional physical constraints (e.g., interactions among hydrosphere components) to further improve accuracy and robustness. The framework is broadly applicable to other TWSA sources and can benefit the geoscience community and society.
Limitations
- Regional limitations in glaciated areas (e.g., Alaska) due to insufficient glacier/ice-sheet modelling in hydrological simulations and leakage effects; higher basin-wise RMSEs observed.
- Lower performance in arid regions with weak hydrological signals, affecting correlations and phase estimates.
- Dependence on hydrological model quality: strong simulated trends can influence downscaled trends; caution is advised in interpretation.
- Uncertainty estimates exclude uncertainties from WGHM and GLDAS inputs (only GRACE uncertainties propagated), likely underestimating total uncertainty.
- Trade-off between global generalizability and local performance; a single global model may underperform in specific regions with unique dynamics or strong anthropogenic effects.
- Spatial coverage excludes Greenland and Antarctica due to hydrological model deficiencies.
- Temporal limitation: downscaled product inherits GRACE’s monthly cadence and record length; it increases spatial resolution but does not extend observational periods.