- 1 South China University of Technology, School of Civil Engineering and Transportation, Guangzhou, China (202311082029@mail.scut.edu.cn)
- 2 South China University of Technology, School of Civil Engineering and Transportation, Guangzhou, China (huanggr@scut.edu.cn)
- 3 Department of Hydraulic Engineering & State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing, China (zhengjx@mail.tsinghua.edu.cn)
Urban flooding is emerging as an increasingly severe global challenge due to climate change and urbanization. Although machine learning offers numerous solutions for urban flood forecasting, its application remains constrained. Existing research is limited by the scarcity of traditional hydrological monitoring data, and the absence of systematic comparisons across multiple models leaves the choice of the most suitable algorithms and features uncertain. To address these challenges, this study used social media data as the sole data source to evaluate and compare the performance of seven typical machine learning algorithms in urban flood forecasting. The SHapley Additive exPlanations (SHAP) framework was applied to investigate the adaptability of the selected algorithms on a multidimensional feature system and to elucidate the decision-making mechanisms behind the selection of algorithms and features. The results suggest that: (1) Social media data can serve as the sole source for precise urban flood identification, overcoming the real-time and spatial coverage limitations of traditional methods. (2) Different machine learning models show significant performance heterogeneity; reliance on a single model risks systematic bias, whereas ensemble tree models demonstrate superior predictive performance. (3) Feature importance is highly model-dependent, exhibiting contextual sensitivity and interactive influence mechanisms. Therefore, feature engineering should be based on multi-model consensus, prioritizing features with significant differences such as natural characteristics and risk exposure.
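The idea that feature importance is model-dependent, and that feature selection can therefore rest on multi-model consensus, can be illustrated with a minimal sketch. This is not the authors' pipeline and does not use the `shap` library: it computes exact Shapley values from scratch for two hypothetical toy models. The feature names (`rainfall`, `elevation`, `post_density`), the model coefficients, the sample values, and the average-rank consensus rule are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, chosen only for illustration.
FEATURES = ["rainfall", "elevation", "post_density"]


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def linear_model(z):
    # Hypothetical additive model: every feature contributes independently.
    return 0.6 * z[0] - 0.3 * z[1] + 0.1 * z[2]


def interaction_model(z):
    # Hypothetical model with a rainfall x post_density interaction.
    return 0.2 * z[0] + 0.8 * z[0] * z[2]


def mean_abs_importance(predict, samples, baseline):
    """Global importance: mean absolute Shapley value per feature."""
    totals = [0.0] * len(baseline)
    for x in samples:
        for i, p in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(p)
    return [t / len(samples) for t in totals]


def consensus_ranking(importances):
    """Order features by their average rank across models (rank 0 = most important)."""
    n = len(FEATURES)
    rank_sum = [0] * n
    for scores in importances.values():
        order = sorted(range(n), key=lambda i: -scores[i])
        for rank, i in enumerate(order):
            rank_sum[i] += rank
    return [FEATURES[i] for i in sorted(range(n), key=lambda i: rank_sum[i])]


# Made-up samples and an all-zero baseline, for illustration only.
SAMPLES = [[1.0, 0.2, 0.5], [0.4, 0.9, 1.0], [0.8, 0.5, 0.1]]
BASELINE = [0.0, 0.0, 0.0]

if __name__ == "__main__":
    models = {"linear": linear_model, "interaction": interaction_model}
    importances = {
        name: mean_abs_importance(m, SAMPLES, BASELINE) for name, m in models.items()
    }
    for name, scores in importances.items():
        print(name, dict(zip(FEATURES, scores)))
    print("consensus order:", consensus_ranking(importances))
```

In this sketch the two models rank the same features differently, because the interaction model splits the joint rainfall and post-density effect between those two features and assigns nothing to elevation. Exact enumeration scales exponentially with the number of features, so for a realistic feature set an approximation such as those offered by SHAP tooling would be needed.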
How to cite: Miao, R., Huang, R., and Zheng, J.: Adaptability of Multiple Social Media Data Integrated Machine Learning Algorithms in Urban Flood Forecasting using the SHAP Framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3659, https://doi.org/10.5194/egusphere-egu26-3659, 2026.