- ¹ European Space Agency, Frascati (RM), Italy
- ² GIS, University of Stuttgart, Stuttgart, Germany
Quantifying the long-term evolution of the water cycle at the basin scale requires estimating and integrating time series of several hydrological variables, e.g. precipitation, runoff, groundwater, and soil moisture. The availability of Earth observation data, along with advances in computational modelling and the expansion of in situ data networks, has led to a diverse array of products designed to estimate these variables. As a result, selecting the most appropriate products has become a significant challenge. This challenge is compounded by the fact that estimates of a given variable can differ considerably between products, owing to the inherent complexity of the variable or to the uncertainties of the measurement process.
This study aims to tap into this wealth of products to provide single estimates of the key basin-scale hydrological variables involved in the water mass balance equation dS/dt = P − E − Q, namely precipitation rate (P), evaporation rate (E), discharge (Q) and terrestrial water storage (S), for the period 1990–2023. The approach is two-fold:
- First, various products for P, E, and S are selected and pre-processed. The goal of this pre-processing is to fill data gaps and extend certain products back to 1990. This is particularly relevant for the water storage time series, as they depend on the GRACE and GRACE-FO missions, which provide observations only from April 2002 onwards and suffer from numerous gaps. To tackle this issue, we jointly process the selected time series using low-rank matrix completion and approximation techniques. The key idea is to exploit the low-rank structure of the time series data matrix to recover the underlying noise- and gap-free matrix. In addition, we analyse the potential benefits of applying this pre-processing to the multi-channel Hankel data matrix in order to take into account the autocorrelation of the signals.
- The second step combines the pre-processed products by solving a constrained least-squares problem to generate a single estimate for each variable. This approach minimizes the water mass balance misclosure while enforcing the non-negativity of discharge (Q ≥ 0) and ensuring that the final estimate of each variable lies within the convex hull of the products available for that variable.
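The second step can be illustrated with a minimal sketch, which is not the implementation used in this study. It rests on several assumptions that the abstract does not specify: the merging weights are constant in time, an observed discharge series `Q_obs` is available, the storage products are already expressed as rates dS/dt, and SciPy's SLSQP solver is used. The function name `merge_products` and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def merge_products(P, E, dS, Q_obs):
    """Convex-combination weights that minimise the mass balance misclosure.

    P, E, dS : arrays of shape (n_products, n_times); dS holds dS/dt products.
    Q_obs    : observed discharge, shape (n_times,) (assumed available).
    """
    n_p, n_e, n_s = P.shape[0], E.shape[0], dS.shape[0]
    # Implied discharge P - E - dS/dt is linear in the stacked weights w.
    A = np.vstack([P, -E, -dS]).T
    groups = [slice(0, n_p), slice(n_p, n_p + n_e),
              slice(n_p + n_e, n_p + n_e + n_s)]

    def misclosure(w):
        return np.mean((A @ w - Q_obs) ** 2)

    def gradient(w):
        return 2.0 * A.T @ (A @ w - Q_obs) / len(Q_obs)

    # Weights of each variable sum to one; with bounds [0, 1] this keeps
    # each merged series inside the convex hull of its products.
    constraints = [{"type": "eq", "fun": lambda w, g=g: np.sum(w[g]) - 1.0}
                   for g in groups]
    # Implied discharge must stay non-negative at every time step.
    constraints.append({"type": "ineq", "fun": lambda w: A @ w})

    w0 = np.concatenate([np.full(n, 1.0 / n) for n in (n_p, n_e, n_s)])
    res = minimize(misclosure, w0, jac=gradient, method="SLSQP",
                   bounds=[(0.0, 1.0)] * A.shape[1], constraints=constraints)
    w = res.x
    return w[groups[0]], w[groups[1]], w[groups[2]]


# Synthetic monthly data (purely illustrative, not taken from the study).
rng = np.random.default_rng(0)
t = np.arange(120)
season = np.sin(2 * np.pi * t / 12)
P_true = 100 + 30 * season
E_true = 50 + 10 * season
dS_true = 5 * np.cos(2 * np.pi * t / 12)
Q_obs = P_true - E_true - dS_true

P = P_true + np.array([-10.0, 5.0, 20.0])[:, None] + rng.normal(0, 1, (3, 120))
E = E_true + np.array([-5.0, 10.0])[:, None] + rng.normal(0, 1, (2, 120))
dS = dS_true + np.array([-2.0, 0.0, 2.0])[:, None] + rng.normal(0, 1, (3, 120))

w_p, w_e, w_s = merge_products(P, E, dS, Q_obs)


def rms(wp, we, ws):
    return np.sqrt(np.mean((wp @ P - we @ E - ws @ dS - Q_obs) ** 2))


rms_equal = rms(np.full(3, 1 / 3), np.full(2, 1 / 2), np.full(3, 1 / 3))
rms_opt = rms(w_p, w_e, w_s)
print("misclosure RMS, equal weights:", rms_equal)
print("misclosure RMS, optimised weights:", rms_opt)
```

Time-constant weights are the simplest way to satisfy the convex hull constraint; weights that vary in time would satisfy it as well, at the cost of a much larger problem.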
We conduct an extensive numerical analysis of the proposed method across 46 basins worldwide, using five products for precipitation, four for evaporation and four for terrestrial water storage. Our results show that a rank-3 or rank-4 approximation strikes a good balance between data fitting and extrapolation, and often reduces the average mass balance misclosure. The Hankel structure generally yields more robust and accurate results, although the optimal Hankel parameter and rank are not straightforward to determine and require further investigation. Finally, we validate the merged products by comparing them to independent estimates and by assessing the reduction in misclosure.
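The gap-filling and back-extension idea, with and without the Hankel embedding, can be sketched as follows. This is an illustration under stated assumptions rather than the algorithm used in the study: it swaps in a simple iterated truncated SVD (a "hard-impute" style scheme), and the helper names `lowrank_complete`, `hankel_embed` and `hankel_average`, the window length and the synthetic data are all hypothetical.

```python
import numpy as np


def lowrank_complete(M, rank, n_iter=200):
    """Fill NaNs in M by iterating a rank-truncated SVD (hard-impute style).

    Assumes every column of M contains at least one observed value.
    Returns the rank-limited approximation of the completed matrix.
    """
    mask = ~np.isnan(M)
    X = np.where(mask, M, np.nanmean(M, axis=0))  # start from column means
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X = np.where(mask, M, L)  # keep observations, update only the gaps
    return L


def hankel_embed(X, window):
    """Stack lagged copies of a (T, C) series into a (T-window+1, window*C) matrix."""
    K = X.shape[0] - window + 1
    return np.hstack([X[lag:lag + K] for lag in range(window)])


def hankel_average(H, T, C, window):
    """Map a Hankel-type matrix back to (T, C) by averaging repeated entries."""
    K = T - window + 1
    total = np.zeros((T, C))
    count = np.zeros((T, 1))
    for lag in range(window):
        total[lag:lag + K] += H[:, lag * C:(lag + 1) * C]
        count[lag:lag + K] += 1
    return total / count


# Synthetic example: four channels sharing a seasonal cycle and an offset.
rng = np.random.default_rng(0)
T, C, window, rank = 240, 4, 24, 3
t = np.arange(T)
basis = np.column_stack([np.sin(2 * np.pi * t / 12),
                         np.cos(2 * np.pi * t / 12),
                         np.ones(T)])
truth = basis @ rng.normal(size=(3, C))
data = truth + 0.05 * rng.normal(size=(T, C))
data[:60, 2:] = np.nan                     # two channels start five years late
data[rng.random((T, C)) < 0.05] = np.nan   # scattered gaps

H = hankel_embed(data, window)
filled = hankel_average(lowrank_complete(H, rank), T, C, window)
gaps = np.isnan(data)
print("RMSE on gaps (Hankel):", np.sqrt(np.mean((filled - truth)[gaps] ** 2)))
```

In this toy setup, each row of the plain data matrix in the first 60 months holds at most two observed values, fewer than the assumed rank of three, whereas the corresponding Hankel row holds up to 2 × `window` observed values. This is one way to see why the embedding can help when some channels have to be extended backwards in time.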
How to cite: Douch, K., Naylor, P., and Saemian, P.: Hydrological data fusion: Joint gap-filling and back reconstruction via low-rank matrix approximation and completion, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6651, https://doi.org/10.5194/egusphere-egu25-6651, 2025.