- 1 Middle East Technical University, Department of Environmental Engineering, Türkiye (abdelrahman.habash@metu.edu.tr)
- 2 HeavyFinance, Vilnius, Lithuania (Onur@heavyfinance.eu)
- 3 Middle East Technical University, Department of Environmental Engineering, Türkiye (EmreAlp@metu.edu.tr)
Modeling river discharge is an essential tool for the sustainable management of freshwater systems, as accurate forecasting supports more efficient allocation and distribution of water resources. With suitable datasets and feature engineering, it can also provide accurate backcasting, which improves understanding of the long-term effects of climate change on natural water bodies and offers insight into the historical behavior of freshwater systems. With these objectives in mind, this study evaluates the capability of selected parameters of the ERA5-Land dataset for modeling river discharge using machine learning techniques. ERA5-Land is a widely used global reanalysis climate dataset known for its high temporal and spatial resolution, made available by the Copernicus Climate Change Service (C3S). The study considers six gauging stations across Switzerland that represent a variety of watershed characteristics and catchment sizes.
Two data extraction schemes were applied through Google Earth Engine to process the ERA5-Land data: the Station-based Climate Data (SCD) scheme, which extracts the climate data directly at the location of the station, and the Catchment-based Climate Data (CCD) scheme, which aggregates the climate data over the entire catchment area of each station. Additionally, two feature engineering approaches were investigated. The (Raw features) approach used only the base climate parameters as model features, while the (Windowed features) approach computed sliding windows over various temporal intervals for each parameter and dynamically added those with relatively high Gini importance to the model features. Finally, six machine learning methods were analyzed: Artificial Neural Networks (ANN), Long Short-Term Memory (LSTM), Hybrid Convolutional Neural Network-LSTM (CNN-LSTM), Random Forests (RF), eXtreme Gradient Boosting (XGBoost), and Support Vector Machines (SVM). Together, the two extraction schemes, two feature approaches, and six methods give a total of 24 models for each station.
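The windowed-feature idea can be illustrated with a minimal sketch. The abstract does not state the window lengths, the aggregation used inside each window, or the selection rule, so the trailing mean, the 3/6/12-month windows, and the "at or above the mean importance" threshold below are all assumptions, and every function name is hypothetical. The importance scores are taken as given; in practice they might come from the impurity-based importances of a tree ensemble.

```python
from statistics import mean


def rolling_mean(values, window):
    """Trailing mean over `window` steps; None until enough history exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough past values yet
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out


def build_windowed_features(series_by_param, windows=(3, 6, 12)):
    """Return the raw series plus one trailing-mean feature per window.

    The window lengths are illustrative assumptions, not the study's values.
    """
    features = {}
    for name, values in series_by_param.items():
        features[name] = list(values)
        for w in windows:
            features[f"{name}_mean{w}"] = rolling_mean(values, w)
    return features


def select_by_importance(importances, base_names, ratio=1.0):
    """Keep every base feature, plus windowed features whose importance
    is at least `ratio` times the mean importance (assumed rule)."""
    threshold = ratio * mean(importances.values())
    return [
        name
        for name, score in importances.items()
        if name in base_names or score >= threshold
    ]
```

For example, `rolling_mean([1, 2, 3, 4], 2)` yields `[None, 1.5, 2.5, 3.5]`: the first entry has no full window behind it, which is why models trained on windowed features typically drop or impute the earliest rows.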
The models were evaluated on their ability to capture the monthly average discharge (m³/s) at each gauging station. Promising results were achieved, with testing R² values ranging from 0.80 to 0.92 and MAPE values of 10-17%, demonstrating the strong predictive potential of the ERA5-Land dataset for modeling river discharge.
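For readers less familiar with the two reported metrics, the sketch below shows how they are commonly computed. It assumes R² is the coefficient of determination (the abstract does not say whether it is this or the squared correlation) and that MAPE is the mean absolute percentage error relative to the observed discharge; the function names are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when an observed value is zero; observed discharge is
    assumed to be strictly positive here.
    """
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100 * total / len(observed)
```

A prediction that overshoots every monthly value by 10% gives a MAPE of about 10, which puts the reported 10-17% range in context.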
Key findings include the superior performance of the (CCD) scheme over the (SCD) scheme for ERA5-Land climate data extraction. Additionally, the (Windowed features) approach generally improved model accuracy, though the degree of improvement varied across stations. Among the tested machine learning methods, (CNN-LSTM) was the most consistent and robust: it performed best in most cases and came very close to the best model where it did not. Nevertheless, the (ANN), (LSTM), and (XGBoost) methods are also worth considering, as they achieved the best performance at some stations, depending on the discharge patterns.
This study underscores the applicability of the ERA5-Land dataset for river discharge modeling and offers insights into climate data processing, feature engineering strategies, and machine learning techniques for hydrological modeling. These findings contribute to advancing predictive hydrology and inform future applications in water resource management and climate impact assessment.
*Higher-quality figures and tables are available at: https://drive.google.com/drive/folders/1j8iJR7MsHnGkyD4F5y1hFZskTdwvRJGn
How to cite: Habash, A., Yuzugullu, O., and Alp, E.: Evaluation of ERA5-Land Dataset for Modeling Rivers Discharge using Machine Learning: A Comparative Analysis, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-883, https://doi.org/10.5194/egusphere-egu25-883, 2025.