Evaluation of Various Machine Learning Algorithms for Bias Correction of Satellite-based Precipitation Estimates over Complex Topography

Koray K. Yilmaz; Gökhan Sevinç; Çağdaş Sağır; Orhan Karaman; M. Tugrul Yilmaz; Ismail Yucel

doi:https://doi.org/10.5194/egusphere-egu24-18456

Session HS7.2

EGU24-18456, updated on 11 Mar 2024

EGU General Assembly 2024

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Koray K. Yilmaz¹, Gökhan Sevinç¹, Çağdaş Sağır¹, Orhan Karaman¹, M. Tugrul Yilmaz², and Ismail Yucel²

¹Middle East Technical University, Department of Geological Engineering, Ankara, Türkiye
²Middle East Technical University, Department of Civil Engineering, Ankara, Türkiye

Reliable precipitation estimates are crucial for any hydrologic study. Representing the high spatio-temporal variability of precipitation with rain gauges is challenging over complex terrain. The geography of Türkiye, including its orography, land–sea distribution and the high Anatolian peninsula, strongly controls the climate and produces highly variable climate regimes. The objective of this study is to evaluate tree-based machine learning algorithms (Random Forest and XGBoost) for bias correction of IMERG Late precipitation estimates over complex topography and varied climatic regimes. We used SHAP values to improve the transparency and interpretability of the machine learning models and thus to better understand the factors controlling the bias correction. A total of 301 quality-controlled rain gauges (244 for training and 57 for testing) were used, covering a 600 km wide north-south region from the Black Sea coast to the Mediterranean coast. The explanatory variables were daily IMERG precipitation estimates and probability of liquid precipitation, climate zone, aspect, elevation, distance to coast, effective terrain height, longitude and latitude. Both the Random Forest and XGBoost algorithms significantly improved the precipitation estimates. The Random Forest model yielded better correlations, whereas the XGBoost model better corrected the precipitation distribution. Both models performed well in error correction and had similar Kling-Gupta efficiency. Analysis of the SHAP values showed that the IMERG product, effective terrain height, distance to coast and elevation are the most important variables in the precipitation bias correction process.
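For readers unfamiliar with the Kling-Gupta score mentioned above, the following is a minimal sketch of how it is commonly computed, assuming the widely used three-component form (correlation, variability ratio, mean ratio). The abstract does not state which variant the authors applied, and the function name kge and the toy daily totals are illustrative assumptions, not data from the study.

```python
import math


def kge(sim, obs):
    """Kling-Gupta efficiency in its common three-component form.

    Assumption: this variant is used for illustration only; the abstract
    does not specify the exact formulation. A perfect match scores 1.0.
    """
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("sim and obs must have the same length (>= 2)")

    mean_s = sum(sim) / n
    mean_o = sum(obs) / n

    # Population standard deviations of both series.
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    if std_s == 0 or std_o == 0 or mean_o == 0:
        raise ValueError("KGE is undefined for constant series or zero observed mean")

    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (std_s * std_o)   # linear correlation
    alpha = std_s / std_o       # variability ratio
    beta = mean_s / mean_o      # mean (bias) ratio

    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up daily totals in mm, purely for illustration.
gauge = [0.0, 2.0, 5.0, 0.0, 12.0, 1.0]
raw = [2.0 * g for g in gauge]  # hypothetical estimate that doubles every value

score_raw = kge(raw, gauge)        # penalised for variability and mean bias
score_perfect = kge(gauge, gauge)  # identical series
```

In this toy case the doubled series is perfectly correlated with the gauge values, so the whole penalty comes from the variability and mean ratios. This is why the score can separate timing agreement from distributional bias, the two aspects on which the abstract contrasts Random Forest and XGBoost.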

How to cite: Yilmaz, K. K., Sevinç, G., Sağır, Ç., Karaman, O., Yilmaz, M. T., and Yucel, I.: Evaluation of Various Machine Learning Algorithms for Bias Correction of Satellite-based Precipitation Estimates over Complex Topography, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18456, https://doi.org/10.5194/egusphere-egu24-18456, 2024.