EGU26-23039, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-23039
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Thursday, 07 May, 10:45–12:30 (CEST), Display time Thursday, 07 May, 08:30–12:30 | Hall A, A.36
Managing Incomplete Urban Reference Data for Risk-Oriented Geoscience Applications: Lessons from the CERES Project
Gracianne Cécile (1), Youssef Fouzai (1,2), Mirga Bokidingo (1), Caterina Negulescu (1), Yves Lucas (2), Gilles Grandjean (1), and Fatima Chamekh (1)
  • (1) BRGM, F45060 Orléans, France
  • (2) Orléans University, PRISME laboratory, 45000 Orléans, France

Assessing exposure and vulnerability to natural hazards increasingly relies on national geospatial reference datasets. However, these datasets are often incomplete, heterogeneous and inconsistent across spatial scales, which limits their direct use in multi-hazard risk analysis. In France, the BD TOPO building database exemplifies these challenges: a large share of buildings lack key attributes such as usage type, even though these attributes are essential for vulnerability assessment.
This contribution presents the approach developed within the CERES project (Cartography and Characterization of Exposed Elements from Satellite Imagery) to address reference data incompleteness and multi-source integration challenges in a geoscience risk context. Focusing on a large study area in the Centre-Val de Loire region, we first quantify and analyse the spatial and semantic gaps in BD TOPO building attributes, showing that more than 40% of buildings are labelled with unknown usage. We then demonstrate how deep learning applied to very high-resolution aerial imagery can be used to infer the missing semantic information probabilistically, significantly reducing uncertainty while explicitly accounting for classification ambiguities.
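The two steps described above (measuring attribute gaps, then filling them probabilistically while keeping track of ambiguity) can be illustrated with a minimal sketch. It is not the CERES implementation: the record layout, the `usage` field name, the use of `None` for unknown values, the class names, and the entropy threshold are all assumptions made for illustration, and the class probabilities stand in for the output of an image classifier.

```python
import math

# Hypothetical building records. The "usage" field and the None-for-unknown
# convention are assumptions, not the actual BD TOPO schema.
buildings = [
    {"id": "b1", "usage": "residential"},
    {"id": "b2", "usage": None},
    {"id": "b3", "usage": "industrial"},
    {"id": "b4", "usage": None},
    {"id": "b5", "usage": "residential"},
]


def unknown_share(records, field="usage"):
    """Return the fraction of records whose attribute is missing."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def normalized_entropy(probs):
    """Shannon entropy of a class distribution, scaled to [0, 1].

    0 means a fully confident prediction, 1 a uniform (maximally
    ambiguous) one.
    """
    k = len(probs)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return h / math.log(k)


def fill_usage(record, class_probs, max_entropy=0.5):
    """Return a copy of the record enriched with an inferred usage.

    The full probability vector is kept, and predictions whose normalized
    entropy exceeds max_entropy (an illustrative threshold) are flagged as
    ambiguous rather than silently accepted.
    """
    enriched = dict(record)
    enriched["usage_probs"] = dict(class_probs)
    enriched["usage_inferred"] = max(class_probs, key=class_probs.get)
    enriched["ambiguous"] = normalized_entropy(class_probs) > max_entropy
    return enriched


share = unknown_share(buildings)
confident = fill_usage(
    buildings[1],
    {"residential": 0.90, "commercial": 0.05, "industrial": 0.05},
)
uncertain = fill_usage(
    buildings[3],
    {"residential": 0.40, "commercial": 0.35, "industrial": 0.25},
)
```

Keeping the probability vector next to the inferred label, instead of overwriting the unknown value, is one way to let downstream vulnerability models weigh the inferred attributes by their reliability.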
Beyond data completion, we highlight the difficulties encountered when jointly exploiting heterogeneous datasets originating from national mapping agencies, land cover products, socio-economic statistics and hazard layers. These include spatial misalignments, inconsistent scales of representation, varying levels of reliability, and the absence of a shared data model. To address these issues, CERES proposes a multi-scale data structuring framework that combines data modelling and processing and is designed to preserve data provenance, uncertainty and semantic traceability across sources.
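As a rough illustration of what preserving provenance, uncertainty and traceability can mean at the record level, the sketch below stores every observation of an attribute together with its source, method, scale and confidence, and selects a preferred value without discarding the alternatives. The class names, field names and confidence scores are hypothetical and do not describe the actual CERES data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    """One value of an attribute as reported by one source (hypothetical schema)."""
    value: str
    source: str        # e.g. "BD TOPO" or "aerial-imagery classifier"
    method: str        # "reference" or "inferred"
    scale: str         # e.g. "building", "block", "municipality"
    confidence: float  # assumed reliability score in [0, 1]


@dataclass
class TracedAttribute:
    """An attribute that keeps every observation instead of a single value."""
    name: str
    observations: List[Observation] = field(default_factory=list)

    def add(self, obs: Observation) -> None:
        if not 0.0 <= obs.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        self.observations.append(obs)

    def best(self) -> Optional[Observation]:
        """Return the most reliable observation, or None if there is none."""
        return max(self.observations, key=lambda o: o.confidence, default=None)

    def sources(self) -> List[str]:
        """List contributing sources in insertion order, for traceability."""
        return [o.source for o in self.observations]


usage = TracedAttribute("usage")
usage.add(Observation("residential", "aerial-imagery classifier",
                      "inferred", "building", 0.82))
usage.add(Observation("mixed", "land cover product",
                      "reference", "block", 0.55))
preferred = usage.best()
```

Because observations are appended rather than merged, a later change in how sources are ranked only affects `best()`, not the stored data.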
By linking reference data analysis, machine-learning-based enrichment and database design, this work provides a concrete illustration of current practices and challenges in managing imperfect geospatial data for geoscience applications. The results underline the need to couple data-driven approaches with explicit data governance and modelling strategies in order to produce robust, transparent and reusable datasets for territorial risk assessment.

How to cite: Cécile, G., Fouzai, Y., Bokidingo, M., Negulescu, C., Lucas, Y., Grandjean, G., and Chamekh, F.: Managing Incomplete Urban Reference Data for Risk-Oriented Geoscience Applications: Lessons from the CERES Project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23039, https://doi.org/10.5194/egusphere-egu26-23039, 2026.