EGU24-677, updated on 08 Mar 2024
https://doi.org/10.5194/egusphere-egu24-677
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

A new climate impact database using generative AI

Ni Li1, Wim Thiery1, Jakob Zscheischler2,3, Gabriele Messori4,5,6, Liane Guillou7, Joakim Nivre7,8, Olof Görnerup7, Seppe Lampe1, Clare Flynn4, Mariana Madruga de Brito9, and Aglae Jezequel10,11
  • 1Department of Water and Climate, Vrije Universiteit Brussel, Brussel, Belgium
  • 2Department of Computational Hydrosystems, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
  • 3Faculty of Environmental Sciences, TU Dresden, Dresden, Germany
  • 4Department of Earth Sciences, Uppsala University, Uppsala, Sweden
  • 5Swedish Centre for Impacts of Climate Extremes (CLIMES), Uppsala University, Uppsala, Sweden
  • 6Department of Meteorology and Bolin Centre for Climate Research, Stockholm University, Stockholm, Sweden
  • 7RISE Research Institutes of Sweden, Sweden
  • 8Department of Linguistics and Philology, Uppsala University, Uppsala, Sweden
  • 9Department of Urban and Environmental Sociology, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
  • 10LMD/IPSL, ENS, Université PSL, École Polytechnique, Institut Polytechnique de Paris, Sorbonne Université, CNRS, Paris, France
  • 11Ecole des Ponts, Marne-la-Vallée, France

Storms, heat waves, wildfires, floods, and other extreme weather and climate-related disasters pose a significant threat to society and ecosystems, and in many cases this threat is being aggravated by climate change. Understanding and quantifying the impacts of extreme weather and climate events is thus a crucial scientific and societal challenge. Disaster databases are extremely useful for establishing the link between climate events and socio-economic impacts. However, publicly available data on impacts is generally scarce. Apart from existing open disaster databases such as EM-DAT, robust data on the impacts of climate extremes can also be found in textual documents, such as newspapers, reports and Wikipedia articles. Here we present a new climate impact database built from multiple public textual sources using a pipeline of data cleaning, key information extraction and validation. In particular, we constructed the database using the state-of-the-art generative artificial intelligence language models GPT4 and Llama2, together with other advanced natural language processing techniques. We note that our dataset contains more records than the benchmark database EM-DAT for the early period 1900-1960 and for specific areas. Our research highlights the opportunities of natural language processing to collect data on climate impacts, which can complement existing open impact datasets to provide more robust information on the impacts of weather and climate events.
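
The three pipeline stages named above (data cleaning, key information extraction, validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record fields, the plausibility checks and the sample sentence are all assumptions made for illustration, and a simple regular-expression extractor stands in for the language-model step so that the example runs without any external service.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImpactRecord:
    # Hypothetical schema; the abstract does not specify the database fields.
    event_type: Optional[str]
    year: Optional[int]
    deaths: Optional[int]


# Event keywords taken from the hazards listed in the abstract.
EVENT_TYPES = ("storm", "heat wave", "wildfire", "flood")


def clean(text: str) -> str:
    """Data cleaning: drop bracketed citation markers, collapse whitespace."""
    text = re.sub(r"\[\d+\]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def extract(text: str) -> ImpactRecord:
    """Key information extraction.

    A regex stand-in (assumption) for the generative language-model step.
    """
    lowered = text.lower()
    event = next((e for e in EVENT_TYPES if e in lowered), None)
    year = re.search(r"\b(?:19|20)\d{2}\b", text)
    deaths = re.search(r"(\d[\d,]*)\s+deaths", lowered)
    return ImpactRecord(
        event_type=event,
        year=int(year.group(0)) if year else None,
        deaths=int(deaths.group(1).replace(",", "")) if deaths else None,
    )


def validate(record: ImpactRecord) -> bool:
    """Validation: simple plausibility checks (assumed, not from the abstract)."""
    return (
        record.event_type is not None
        and record.year is not None
        and 1900 <= record.year <= 2024
        and (record.deaths is None or record.deaths >= 0)
    )


def build_record(text: str) -> Optional[ImpactRecord]:
    """Run cleaning, extraction and validation; return None if validation fails."""
    record = extract(clean(text))
    return record if validate(record) else None


# Invented sample sentence, not real impact data.
print(build_record("A flood in 1931 [2] caused  120 deaths."))
```

In a pipeline of the kind the abstract describes, the body of `extract` would presumably be replaced by a prompt to a language model, while the cleaning and validation stages would stay as separate, testable steps around it.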

How to cite: Li, N., Thiery, W., Zscheischler, J., Messori, G., Guillou, L., Nivre, J., Görnerup, O., Lampe, S., Flynn, C., Madruga de Brito, M., and Jezequel, A.: A new climate impact database using generative AI, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-677, https://doi.org/10.5194/egusphere-egu24-677, 2024.