Towards a global machine learning based impact model for tropical cyclones
- 1: 510, an initiative of the Netherlands Red Cross
- 2: ISI Foundation
- 3: UN OCHA Centre for Humanitarian Data
Due to its geographical location, the Philippines is prone to tropical cyclones (TCs), which bring strong winds, heavy rain and widespread flooding, causing heavy loss of life and destruction of livelihoods and property. To reduce the humanitarian impact of TCs, the Philippine Red Cross, together with the German Red Cross and 510, an initiative of the Netherlands Red Cross, designed and implemented a machine learning impact-based forecasting model based on XGBoost, which is used operationally to release funding and to trigger early action. The model predicts the percentage of houses that will be completely damaged by a TC, using predictive features for hazard (wind speed, rainfall, storm surge and landslides), exposure (such as ruggedness and population density) and vulnerability (such as housing material and poverty). However, this model is not easily transferable to other countries because it relies on country-specific data from the Philippines.
Here, we build on this line of research around the XGBoost model in three ways. First, we evaluate multiple ML algorithms for classification and regression of impact data from tropical storms. Second, we perform a sensitivity analysis on the predictive features, replacing, where possible, features that depend on Philippines-specific data sources with features available from global open data sources. Third, whereas the XGBoost model provides predictions at the aggregated geographical level of a municipality, we transform it into a grid-based model with a resolution of 0.1° × 0.1° in latitude and longitude. For all experiments, because the training data are scarce and skewed (the algorithms are trained on only 40 historical typhoon events), we pay specific attention to data stratification, sampling and validation techniques.
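As an illustration of why stratification matters for skewed impact data, the following minimal sketch assigns samples to cross-validation folds so that each fold receives a similar share of rare high-damage cases. It is not the authors' procedure: the bin edges, the function names `damage_bin` and `stratified_folds`, and the number of folds are all assumptions made for this example.

```python
import random
from collections import defaultdict


def damage_bin(pct, edges=(0.0, 1.0, 10.0)):
    """Return the index of the damage bin for a percentage of houses damaged.

    The bin edges are hypothetical and chosen only for illustration.
    """
    index = 0
    for edge in edges:
        if pct > edge:
            index += 1
    return index


def stratified_folds(targets, n_folds=5, seed=0):
    """Assign each sample a fold so every damage bin is spread across folds."""
    rng = random.Random(seed)

    # Group sample indices by damage bin.
    by_bin = defaultdict(list)
    for i, target in enumerate(targets):
        by_bin[damage_bin(target)].append(i)

    # Shuffle within each bin, then deal the samples out round-robin.
    folds = [None] * len(targets)
    for bin_index in sorted(by_bin):
        indices = by_bin[bin_index]
        rng.shuffle(indices)
        for position, i in enumerate(indices):
            folds[i] = position % n_folds
    return folds


# Skewed toy data: most samples show no damage, very few show severe damage.
targets = [0.0] * 80 + [0.5] * 10 + [5.0] * 5 + [50.0] * 5
folds = stratified_folds(targets, n_folds=5)
```

With a purely random split, the five severe cases in this toy data could all land in one fold. Dealing them out per bin guarantees that each fold sees one. In practice, one might additionally keep all samples from the same typhoon in the same fold to avoid leakage between training and validation data.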
We find that XGBoost slightly outperforms random forest and that regression is better suited than classification to detecting outliers. Furthermore, we show that the predictive features of the original model can be reduced to a subset of 20 features. The transformation to a grid-based model was made possible by disaggregating the impact data using OpenStreetMap housing data obtained from the Humanitarian Data Exchange. Preliminary results show that ML model performance worsens when going from municipality level to grid level. This is likely because errors vary more between the individual grid cells of a municipality and are averaged out when the cells are aggregated. To conclude, relying on globally available data sources and working at grid level holds potential to render a machine learning based impact model generalisable and transferable to locations outside of the Philippines impacted by TCs. Future research will focus on validation with data for other countries. Ultimately, a transferable model will facilitate the scaling up of anticipatory action for tropical cyclones.
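The disaggregation step and the error-averaging effect described above can be sketched as follows. The abstract does not specify the allocation rule, so this example assumes that municipal damage counts are distributed across grid cells in proportion to their building counts. The function names `disaggregate` and `mean_abs_error`, the cell identifiers and all numbers are hypothetical.

```python
def disaggregate(damaged_total, buildings_per_cell):
    """Split a municipal damage count across grid cells.

    Assumption: damage is allocated in proportion to the number of
    buildings in each cell; the original study may use a different rule.
    """
    total_buildings = sum(buildings_per_cell.values())
    if total_buildings == 0:
        return {cell: 0.0 for cell in buildings_per_cell}
    return {
        cell: damaged_total * count / total_buildings
        for cell, count in buildings_per_cell.items()
    }


def mean_abs_error(predicted, observed):
    """Mean absolute error over the cells present in `observed`."""
    return sum(abs(predicted[c] - observed[c]) for c in observed) / len(observed)


# Hypothetical municipality with two grid cells and 100 damaged houses.
observed_cells = disaggregate(100, {"cell_a": 30, "cell_b": 70})

# Hypothetical grid-level predictions: one cell too high, the other too low.
predicted_cells = {"cell_a": 40, "cell_b": 60}

cell_error = mean_abs_error(predicted_cells, observed_cells)
municipal_error = abs(sum(predicted_cells.values()) - sum(observed_cells.values()))
```

In this toy case the per-cell error is 10 houses while the municipal total is predicted exactly, because the over- and under-estimates cancel when the cells are summed. This mirrors the explanation given for the weaker grid-level performance.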
How to cite: Kooshki, M., van den Homberg, M., Kalimeri, K., Kaltenbrunner, A., Mejova, Y., Milano, L., Ndirangu, P., Paolotti, D., Teklesadik, A., and Turner, M.: Towards a global machine learning based impact model for tropical cyclones, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-14435, https://doi.org/10.5194/egusphere-egu23-14435, 2023.