EGU22-2001
https://doi.org/10.5194/egusphere-egu22-2001
EGU General Assembly 2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Text-mining of natural hazard impacts (TM-Impacts): an application to the 2021 flood in Germany

Mariana Madruga de Brito (1), Jan Sodoge (1), Heidi Kreibich (2), and Christian Kuhlicke (1)
  • (1) Helmholtz-Centre for Environmental Research, Department of Urban and Environmental Sociology, Leipzig, Germany (mariana.brito@ufz.de)
  • (2) German Research Centre for Geosciences, Potsdam, Germany (heidi.kreibich@gfz-potsdam.de)

Natural hazards cause a plethora of impacts on society, ranging from direct impacts such as loss of lives to cascading ones such as power outages and supply shortages. Despite the severe social and economic losses of extreme events, a comprehensive assessment of their impacts remains largely missing. Existing studies tend to focus on impacts that are relatively easy to measure (e.g. financial loss, number of deaths) and commonly break down impact assessments into specific sectors (e.g. forestry, agriculture). Thus, in the absence of multi-sector impact datasets, decision-makers have no baseline information for evaluating whether adaptation measures effectively reduce impacts. This can result in blind spots in adaptation.

In recent years, text data (e.g. newspapers, social media, and Wikipedia entries) have been used to compile impact datasets. However, the manual extraction of impact information by human experts is time-consuming. To develop comprehensive impact datasets, we propose applying text-mining to such documents. We developed a tool termed TM-Impacts (text-mining of natural hazard impacts), which automatically extracts information on impacts by applying natural language processing (NLP) and machine learning (ML) tools to text corpora. TM-Impacts builds upon a previous prototype application (de Brito et al., 2020).

TM-Impacts consists of three complementary modules. The first uses unsupervised topic modelling to identify the main topics covered in the text. These can include not only the disaster impacts but also information on response and recovery. The second module uses hand-crafted rules and pattern matching to extract information on specific impact types (e.g. traffic disruption, power outages). The final module builds upon the second one: it uses the resulting labelled data to train supervised ML algorithms that classify unlabelled text data into impact types.
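The interplay between the second and third modules can be illustrated with a minimal sketch. It is not the TM-Impacts implementation: the two regular-expression rules, the English toy sentences, and the names `RULES`, `label_by_rules`, and `NaiveBayes` are assumptions made for illustration, and a small multinomial naive Bayes classifier stands in for whichever supervised algorithms the tool actually uses.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical hand-crafted rules (impact type -> pattern); illustrative only,
# not the rule set used in TM-Impacts.
RULES = {
    "traffic_disruption": re.compile(
        r"\b(roads?|bridges?|rail\w*|motorways?)\b.*\b(closed|blocked|destroyed)\b", re.I),
    "power_outage": re.compile(
        r"\bwithout (power|electricity)\b|\b(power|electricity) (outage|cut|failure)s?\b", re.I),
}

def label_by_rules(sentence):
    """Module 2 (sketch): return the impact types whose pattern matches."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(sentence))

def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

class NaiveBayes:
    """Module 3 (sketch): multinomial naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented toy corpus; the rules label what they can, the classifier learns from that.
corpus = [
    "The bridge over the river was destroyed by the flood.",
    "Several roads remain closed after the flooding.",
    "Thousands of households are without electricity.",
    "A power outage hit the town overnight.",
    "The mayor thanked the volunteers.",
]
labelled = [(s, label_by_rules(s)[0]) for s in corpus if len(label_by_rules(s)) == 1]
model = NaiveBayes().fit(*zip(*labelled))

# A sentence that no rule matches can still be assigned an impact type.
unseen = "Electricity failed in the town overnight"
prediction = model.predict(unseen)
```

In this sketch only sentences matched by exactly one rule are kept as training data, which is a simplification; a real corpus would also need a class for text that reports no impact.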

We illustrate the application of TM-Impacts using the example of the 2021 flood in Germany. This event led to more than 180 fatalities and the disruption of critical infrastructure that continued for months after the event. We built a text corpus with more than 26,000 newspaper articles published in 200 different news outlets between July and November 2021. By using TM-Impacts, we were able to detect 20 different impact types, which were mapped at the NUTS 3 scale. We also identified temporal patterns. As expected, during the onset of the event, reporting on impacts tended to focus on deaths and missing people, whereas texts published in November focused on long-term impacts such as the disruption of water supply.
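Once each article carries an impact type, a publication date, and a region, the spatial and temporal summaries described above reduce to simple counting. The sketch below assumes this record layout; the records themselves are invented toy data, the NUTS 3 codes serve only as placeholder region identifiers, and the function names are hypothetical.

```python
from collections import Counter
from datetime import date

# Invented toy records: (publication date, NUTS 3 code, impact type).
records = [
    (date(2021, 7, 15), "DEB12", "fatalities"),
    (date(2021, 7, 16), "DEB12", "fatalities"),
    (date(2021, 7, 16), "DEA28", "traffic_disruption"),
    (date(2021, 11, 3), "DEB12", "water_supply_disruption"),
]

def count_by_region(records):
    """Count impact mentions per (NUTS 3 region, impact type), e.g. for mapping."""
    return Counter((nuts3, impact) for _, nuts3, impact in records)

def count_by_month(records):
    """Count impact mentions per (year-month, impact type) to expose temporal patterns."""
    return Counter((day.strftime("%Y-%m"), impact) for day, _, impact in records)

by_region = count_by_region(records)
by_month = count_by_month(records)
```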

In conclusion, we demonstrate that TM-Impacts can scan large amounts of text data to build multi-sector impact datasets with high spatial and temporal resolution. We expect the use of text-mining to become widespread in assessing the impacts of natural hazards.

de Brito, M.M., Kuhlicke, C., Marx, A. (2020) Near-real-time drought impact assessment: a text mining approach on the 2018/19 drought in Germany. Environmental Research Letters. https://doi.org/10.1088/1748-9326/aba4ca

How to cite: Madruga de Brito, M., Sodoge, J., Kreibich, H., and Kuhlicke, C.: Text-mining of natural hazard impacts (TM-Impacts): an application to the 2021 flood in Germany, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2001, https://doi.org/10.5194/egusphere-egu22-2001, 2022.
