EGU General Assembly 2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Partitioning of green-blue water fluxes around the world: ML model explainability and predictability

Daniel Althoff1 and Georgia Destouni2
  • 1Stockholm University, Department of Physical Geography, Stockholm 106 91, Sweden
  • 2Stockholm University, Department of Physical Geography, Stockholm 106 91, Sweden

The consequences of ever-increasing human interference with freshwater systems, for example through land-use and climate change, are already felt in many regions of the world, among others as shifts in freshwater availability and in the partitioning between green (evapotranspiration) and blue (runoff) water fluxes. In this study, we have developed a machine learning (ML) model for possible prediction of green-blue water flux partitioning (WFP) under different climate, land-use, and other landscape and hydrological catchment conditions around the world. ML models have shown relatively high predictive performance compared to more traditional modelling methods for several tasks in the geosciences. However, ML is also rightly criticized for providing theory-free “black-box” models that may fail in predictions under forthcoming non-stationary conditions. Here we address this ML model interpretability gap using Shapley values, an explainable artificial intelligence technique, and we assess ML model predictability using a dissimilarity index (DI). For ML model training and testing, we use different parts of a database compiled for 3482 hydrological catchments with daily runoff data available over at least 25 years. The target variable of the ML model is the blue-water partitioning ratio, i.e., the ratio of average runoff to average precipitation in each catchment; the complementary green-water partitioning ratio follows from the water balance. The predictor variables are hydro-climatic, land-cover/use, and other catchment indices derived from precipitation and temperature time series, land cover maps, and topography data.
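As an illustration of the target variable described above, the following is a minimal sketch, not the authors' code. It assumes that the green-water ratio is obtained by closing the long-term water balance with negligible storage change; the function name and the sample values are hypothetical.

```python
def partition_ratios(runoff, precipitation):
    """Return (blue, green) long-term partitioning ratios for one catchment.

    runoff, precipitation: sequences of values in the same units and over
    the same period (e.g. annual totals in mm).
    """
    if len(runoff) == 0 or len(precipitation) == 0:
        raise ValueError("both series must be non-empty")
    r_avg = sum(runoff) / len(runoff)
    p_avg = sum(precipitation) / len(precipitation)
    if p_avg <= 0:
        raise ValueError("average precipitation must be positive")
    blue = r_avg / p_avg
    # Assumption: long-term storage change is negligible, so the
    # evapotranspiration share is the water-balance residual.
    green = 1.0 - blue
    return blue, green


# Hypothetical annual totals in mm, for illustration only.
blue, green = partition_ratios([300.0, 350.0, 250.0], [1000.0, 1100.0, 900.0])
```

With these made-up inputs, average runoff is 300 mm and average precipitation is 1000 mm, so the blue-water ratio is 0.3 and the green-water ratio is 0.7.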
As a basis for the ML modelling, we also investigate and quantify, through data averaging over moving sub-periods of different lengths, the minimum temporal aggregation scale required to reach a stable temporal-average runoff (and evapotranspiration) fraction of precipitation in each catchment, referred to as the flux equilibration time, Teq. For 99% of catchments, Teq is found to be ≤2 years, with longer Teq emerging for catchments estimated to have a higher ratio Rgw/Ravg, i.e., a higher groundwater flow contribution (Rgw) to total average runoff (Ravg). The Cubist model used for the ML modelling yields a Kling-Gupta efficiency of 0.86. The Shapley value analysis indicates mean annual precipitation and temperature as the most important variables in determining the WFP, followed by average slope in each catchment. A DI threshold is further used to label new data points as inside or outside the ML model area of applicability (AoA). Comparison between test data points outside and inside the AoA reveals which catchment characteristics are most responsible for ML model loss of predictability. Predictability is lower for catchments with: larger Teq and Rgw/Ravg; higher phase lag between peak precipitation and peak temperature over the year; lower forest and agricultural land fractions; and aridity index much higher or much lower than 1 (implying major water or energy limitation, respectively). Identifying such predictability limits is crucial for understanding, and facilitating user awareness of, the applicability and forecasting ability of such data-driven ML modelling under different prevailing and changing future hydro-climatic, land-use, and groundwater conditions.
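The Kling-Gupta efficiency reported above can be illustrated with a minimal sketch. It assumes the original three-component formulation (linear correlation, variability ratio, and bias ratio); the abstract does not state which KGE variant was used or how it was computed, and the function name and sample values here are hypothetical.

```python
import math
from statistics import fmean, pstdev


def kling_gupta_efficiency(observed, simulated):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mu_o, mu_s = fmean(observed), fmean(simulated)
    sd_o, sd_s = pstdev(observed), pstdev(simulated)
    if sd_o == 0 or sd_s == 0 or mu_o == 0:
        raise ValueError("KGE is undefined for constant series or zero observed mean")
    # Population covariance, consistent with the population standard deviations.
    cov = fmean([(o - mu_o) * (s - mu_s) for o, s in zip(observed, simulated)])
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical blue-water ratios for four catchments, for illustration only.
obs = [0.20, 0.35, 0.50, 0.65]
kge_perfect = kling_gupta_efficiency(obs, obs)
kge_doubled = kling_gupta_efficiency(obs, [2 * x for x in obs])
```

A perfect prediction gives a KGE of 1. Doubling every predicted value keeps the correlation at 1 but sets both the variability and bias ratios to 2, so the score drops to 1 - sqrt(2).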

How to cite: Althoff, D. and Destouni, G.: Partitioning of green-blue water fluxes around the world: ML model explainability and predictability, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8321, 2022.
