A new climate impact database using generative AI
- 1Department of Water and Climate, Vrije Universiteit Brussel, Brussel, Belgium
- 2Department of Computational Hydrosystems, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- 3Faculty of Environmental Sciences, TU Dresden, Dresden, Germany
- 4Department of Earth Sciences, Uppsala University, Uppsala, Sweden
- 5Swedish Centre for Impacts of Climate Extremes (CLIMES), Uppsala University, Uppsala, Sweden
- 6Department of Meteorology and Bolin Centre for Climate Research, Stockholm University, Stockholm, Sweden
- 7RISE Research Institutes of Sweden, Sweden
- 8Department of Linguistics and Philology, Uppsala University, Uppsala, Sweden
- 9Department of Urban and Environmental Sociology, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- 10LMD/IPSL, ENS, Université PSL, École Polytechnique, Institut Polytechnique de Paris, Sorbonne Université, CNRS, Paris, France
- 11Ecole des Ponts, Marne-la-Vallée, France
Storms, heat waves, wildfires, floods, and other extreme weather and climate-related disasters pose a significant threat to society and ecosystems, and in many cases this threat is being aggravated by climate change. Understanding and quantifying the impacts of extreme weather and climate events is thus a crucial scientific and societal challenge. Disaster databases are extremely useful for establishing the link between climate events and socio-economic impacts. However, publicly available data on impacts are generally scarce. Beyond existing open disaster databases such as EM-DAT, robust data on the impacts of climate extremes can also be found in textual documents, such as newspapers, reports and Wikipedia articles. Here we present a new climate impact database built from multiple public textual sources using a pipeline of data cleaning, key information extraction and validation. In particular, we constructed the database using the state-of-the-art generative artificial intelligence language models GPT-4 and Llama 2, together with other advanced natural language processing techniques. We note that our dataset contains more records than the benchmark database EM-DAT for the early period of 1900-1960 and for specific areas. Our research highlights the opportunities that natural language processing offers for collecting data on climate impacts, which can complement existing open impact datasets and provide more robust information on the impacts of weather and climate events.
How to cite: Li, N., Thiery, W., Zscheischler, J., Messori, G., Guillou, L., Nivre, J., Görnerup, O., Lampe, S., Flynn, C., Madruga de Brito, M., and Jezequel, A.: A new climate impact database using generative AI, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-677, https://doi.org/10.5194/egusphere-egu24-677, 2024.