- ¹BRGM, F45060 Orléans, France
- ²Orléans University, PRISME laboratory, 45000 Orléans, France
Assessing exposure and vulnerability to natural hazards increasingly relies on national geospatial reference datasets. However, these datasets are often incomplete, heterogeneous and inconsistent across spatial scales, which limits their direct usability for multi-hazard risk analysis. In France, the BD TOPO building database exemplifies these challenges, with a large share of buildings lacking key attributes such as usage type, despite their importance for vulnerability assessment.
This contribution presents the approach developed within the CERES project (Cartography and Characterization of Exposed Elements from Satellite Imagery) to address reference data incompleteness and multi-source integration challenges in a geoscience risk context. Focusing on a large study area in the Centre-Val de Loire region, we first quantify and analyze the spatial and semantic gaps of BD TOPO building attributes, showing that more than 40% of buildings are labelled with unknown usage. We then demonstrate how deep learning applied to very high-resolution aerial imagery can be used to probabilistically infer missing semantic information, significantly reducing uncertainty while explicitly accounting for classification ambiguities.
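The two steps described above, measuring the share of missing usage attributes and then filling gaps with probabilistic predictions while keeping ambiguity visible, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record layout, the class names, the probability values and the `min_confidence` threshold are hypothetical and are not taken from BD TOPO or from the CERES pipeline.

```python
# Hypothetical building records; "usage" is None where the reference
# dataset carries no usage type. Field names are illustrative only.
buildings = [
    {"id": "b1", "usage": "residential"},
    {"id": "b2", "usage": None},
    {"id": "b3", "usage": "industrial"},
    {"id": "b4", "usage": None},
    {"id": "b5", "usage": "residential"},
]

# Hypothetical class probabilities, standing in for the output of an
# image classifier applied to each building footprint.
predictions = {
    "b2": {"residential": 0.85, "industrial": 0.10, "commercial": 0.05},
    "b4": {"residential": 0.45, "industrial": 0.40, "commercial": 0.15},
}


def unknown_share(records):
    """Return the fraction of records whose usage attribute is missing."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r["usage"] is None)
    return missing / len(records)


def infer_usage(probs, min_confidence=0.6):
    """Pick the most probable class, or flag the case as ambiguous.

    The threshold is an assumed tuning parameter, not a published value.
    """
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return (label if p >= min_confidence else "ambiguous", p)


def enrich(records, preds, min_confidence=0.6):
    """Fill missing usage values and record where each value came from."""
    enriched = []
    for r in records:
        out = dict(r)
        if r["usage"] is not None:
            out.update(usage_source="reference", usage_confidence=None)
        elif r["id"] in preds:
            label, p = infer_usage(preds[r["id"]], min_confidence)
            out.update(usage=label, usage_source="inferred", usage_confidence=p)
        else:
            out.update(usage_source=None, usage_confidence=None)
        enriched.append(out)
    return enriched


share_before = unknown_share(buildings)
enriched = enrich(buildings, predictions)
```

Keeping the predicted probability and a source flag next to each filled value, rather than overwriting the attribute silently, is one way to let downstream risk analyses distinguish reference values from inferred ones.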
Beyond data completion, we highlight the difficulties encountered when jointly exploiting heterogeneous datasets originating from national mapping agencies, land cover products, socio-economic statistics and hazard layers. These include spatial misalignments, inconsistent scales of representation, varying levels of reliability, and the absence of a shared data model. To address these issues, CERES proposes a multi-scale data structuring framework that combines data modelling and processing and is designed to preserve data provenance, uncertainty and semantic traceability across sources.
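As a rough illustration of what preserving provenance and uncertainty across sources can mean in practice, the sketch below stores every source's statement about an attribute instead of a single merged value. It is not the CERES data model: the `Observation` and `Feature` classes, the source names and the reliability scores are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Observation:
    """One source's statement about one attribute (hypothetical structure)."""
    attribute: str
    value: str
    source: str         # illustrative source name, e.g. "mapping_agency"
    reliability: float  # assumed score in [0, 1]


@dataclass
class Feature:
    """A mapped element that keeps every observation made about it."""
    feature_id: str
    observations: list = field(default_factory=list)

    def add(self, obs):
        self.observations.append(obs)

    def best(self, attribute):
        """Return the most reliable observation for an attribute, or None."""
        candidates = [o for o in self.observations if o.attribute == attribute]
        return max(candidates, key=lambda o: o.reliability, default=None)

    def conflicts(self, attribute):
        """Return True when sources disagree on the attribute's value."""
        values = {o.value for o in self.observations if o.attribute == attribute}
        return len(values) > 1


building = Feature("b2")
building.add(Observation("usage", "residential", "image_classifier", 0.85))
building.add(Observation("usage", "commercial", "land_cover_product", 0.50))

preferred = building.best("usage")
disagreement = building.conflicts("usage")
```

Because no observation is discarded, the preferred value can be recomputed if reliability scores change, and disagreements between sources remain queryable.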
By articulating reference data analysis, machine-learning-based enrichment and database design, this work provides a concrete illustration of current practices and challenges in managing imperfect geospatial data for geoscience applications. The results underline the necessity of coupling data-driven approaches with explicit data governance and modelling strategies to produce robust, transparent and reusable datasets for territorial risk assessment.
How to cite: Cécile, G., Fouzai, Y., Bokidingo, M., Negulescu, C., Lucas, Y., Grandjean, G., and Chamekh, F.: Managing Incomplete Urban Reference Data for Risk-Oriented Geoscience Applications: Lessons from the CERES Project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23039, https://doi.org/10.5194/egusphere-egu26-23039, 2026.