- 1 Cartography and GIS Research Group, Department of Geography, Vrije Universiteit Brussel, Brussels, Belgium
- 2 Natural Hazards and Cartography Service, Department of Earth Sciences, Royal Museum for Central Africa, Tervuren, Belgium
Environmental change and rapid population growth are altering the impacts of floods, landslides and flash floods. The Global South is disproportionately affected by these changes, resulting in an uneven impact of these geo-hydrological hazards compared to the Global North. Comprehensive global documentation of geo-hydrological hazards is needed to improve our understanding of these hazards, yet this remains challenging. Existing data collection approaches, such as remote sensing, empirical news article screening and field-based surveys, have limitations that constrain our ability to accurately analyze the distribution, impacts and trends of geo-hydrological hazard occurrence. Moreover, most global datasets suffer from various geographical, linguistic and socio-economic biases.
To further address these challenges, we introduce a new global dataset documenting geo-hydrological hazards automatically extracted from online news articles by HazMiner, a text mining algorithm based on a large language model. A total of 6 366 905 news articles published in 58 languages from 2017 until 2025 were analyzed. The resulting dataset includes the location, timing and impact of 21 411 floods, 7 659 landslides and 3 606 flash floods. Compared to EM-DAT, a well-established global disaster dataset, our dataset documents 31 150 more geo-hydrological hazard events over the same period. Among these, 784 events resulted in at least one but fewer than ten fatalities each and therefore do not meet one of EM-DAT's inclusion criteria; collectively, they account for 3 578 fatalities.
Spatially, these impactful hazards occur in densely populated areas, with floods primarily located along rivers, and landslides and flash floods concentrated in mountainous regions. Temporally, floods and flash floods show seasonal trends in both hemispheres. Furthermore, 30 810 geo-hydrological hazard events do not report any fatalities, providing a broader view of these hazards at the global scale than existing global disaster datasets. This dataset offers a new, detailed global view of the hazards and has the potential to improve our understanding of their spatio-temporal occurrence and their associated impacts and risks.
How to cite: Valkenborg, B., Dewitte, O., and Smets, B.: A new global dataset of geo-hydrological hazards and their impacts automatically extracted from online news articles, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4873, https://doi.org/10.5194/egusphere-egu26-4873, 2026.