- ¹The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing, China
- ²School of Civil and Environmental Engineering, University of New South Wales, Sydney, Australia
Extreme precipitation events have become increasingly frequent and intense in recent decades, resulting in severe flooding and substantial socio-economic losses. These events are typically associated with intense weather systems that are shaped by numerous meteorological factors and exhibit significant temporal and spatial variability. A comprehensive understanding of the underlying processes and the identification of the key meteorological factors driving extreme precipitation are critical for enhancing the accuracy of extreme rainstorm predictions and flood warnings.
This study utilized cumulative distribution function (CDF) analysis based on ERA5 hourly reanalysis data and employed the eXtreme Gradient Boosting (XGBoost) algorithm to identify the key meteorological factors contributing to 24-hour extreme precipitation across three distinct climatic zones in China. Additionally, forecasting models were developed to predict these events. The results highlighted the efficacy of this methodology and demonstrated its ability to achieve the following key advancements:
- Mapping data into the CDF space effectively addressed the challenges posed by the spatial heterogeneity in the value ranges of meteorological factors in regional system analyses, thereby significantly enhancing the spatial scalability of the predictive model.
- The integration of SHAP (SHapley Additive exPlanations) value interpretation with XGBoost successfully identified the critical meteorological factors influencing extreme precipitation events. This facilitated the construction of classification and regression models to predict both the occurrence and the return periods of these events.
- The application of SHAP values enhanced the interpretability of the "black-box" XGBoost model by incorporating physical insights and elucidating the interactions between different factors, thus providing valuable information for the construction and refinement of the final model.
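The CDF-space mapping in the first point can be illustrated with a minimal sketch. It assumes an empirical CDF with a Weibull plotting position, which may differ from the estimator used in the study. The site names, the sample values, and the helper `empirical_cdf` are hypothetical and serve only to show how two locations with very different value ranges end up on a common 0 to 1 scale.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return a function mapping a value to its empirical non-exceedance
    probability within `sample` (Weibull plotting position, an assumption)."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        # Count of sample values <= x, divided by n + 1 so that
        # in-sample values map strictly inside (0, 1).
        return bisect_right(ordered, x) / (n + 1)

    return cdf


# Hypothetical values of one meteorological factor at two sites whose
# raw ranges differ by an order of magnitude.
humid_site = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0, 110.0, 125.0, 140.0]
arid_site = [2.0, 3.5, 5.0, 6.5, 8.0, 9.5, 11.0, 12.5, 14.0]

# Each site is mapped with its own CDF, so the resulting features
# describe how unusual a value is locally rather than its magnitude.
humid_cdf = empirical_cdf(humid_site)
arid_cdf = empirical_cdf(arid_site)

humid_features = [humid_cdf(v) for v in humid_site]
arid_features = [arid_cdf(v) for v in arid_site]

print(humid_features)
print(arid_features)
```

Because each site is transformed with its own distribution, the local maximum at either site maps to the same probability, which is the property that would let a single model be applied across regions with different climatologies.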
In summary, this study presents a novel and interpretable machine learning framework, based on CDF analysis, for analyzing and predicting extreme precipitation events. By addressing spatial heterogeneity and enhancing model interpretability, the proposed methodology advances the prediction of extreme rainfall and associated flood risks, contributing to improved disaster preparedness and mitigation efforts.
How to cite: Wu, X., Jiang, Z., and Sharma, A.: Predicting extreme precipitation events using machine learning techniques based on cumulative distribution function (CDF) analysis of meteorological factors, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-13959, https://doi.org/10.5194/egusphere-egu25-13959, 2025.