- University of Twente, International Institute for Geo-Information Science and Earth Observation, Applied Earth Sciences, Enschede, Netherlands (l.a.b.bonsu@student.utwente.nl)
Earth Observation (EO)-based methods for automated landslide detection have advanced rapidly in recent years, ranging from simple spectral index approaches to complex deep learning models. Despite these developments, systematic and reproducible benchmarking of such methods remains limited. Existing studies often rely on heterogeneous datasets, inconsistent evaluation metrics, and ad hoc preprocessing choices, making it difficult to assess detection performance under realistic operational conditions, particularly in near-real-time post-disaster contexts.
This study proposes a model-agnostic benchmarking framework designed to enable transparent and operationally relevant comparison of EO-based landslide detection algorithms. The framework standardizes data preprocessing, scene characterization, evaluation metrics, and reporting. It is implemented using modular, reproducible computational notebooks. Performance is assessed not only globally but also in a stratified manner, accounting for environmental and atmospheric variability such as land cover type, terrain characteristics, and cloud contamination.
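As an illustration of what stratified assessment could look like in practice, the sketch below computes Intersection-over-Union separately for each stratum of a categorical raster (for example, land cover classes or terrain bins). It is a minimal sketch under stated assumptions, not the framework's implementation: the function name `stratified_iou`, the integer stratum labels, and the toy arrays are all hypothetical.

```python
import numpy as np


def stratified_iou(pred, truth, strata):
    """Return IoU per stratum label (hypothetical helper, not the framework's API).

    pred, truth: binary landslide masks of identical shape.
    strata: integer raster of the same shape (e.g. land cover class per pixel).
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    strata = np.asarray(strata)

    scores = {}
    for label in np.unique(strata):
        sel = strata == label  # pixels belonging to this stratum
        intersection = np.sum(pred[sel] & truth[sel])
        union = np.sum(pred[sel] | truth[sel])
        # IoU is undefined when neither mask has landslide pixels in the stratum
        scores[int(label)] = float(intersection) / float(union) if union > 0 else float("nan")
    return scores


# Toy example: two strata (labels 1 and 2) over a 1x4 scene
pred = np.array([[1, 1, 0, 0]])
truth = np.array([[1, 0, 0, 1]])
strata = np.array([[1, 1, 2, 2]])
scores = stratified_iou(pred, truth, strata)
```

Reporting one score per stratum, rather than a single scene-wide value, is what allows differences between, say, vegetated and bare terrain to become visible.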
The framework is demonstrated using the February 2023 Kahramanmaraş earthquake sequence in Türkiye, which triggered thousands of coseismic landslides across a highly heterogeneous landscape. A high-quality manually mapped landslide inventory serves as ground truth. Two representative detection approaches are used as case studies: (i) an NDVI-based change detection method and (ii) a U-Net deep learning segmentation model, both applied to harmonized Sentinel-2 Level-2A imagery without scene-level cloud filtering to reflect operational constraints.
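The NDVI-based change detection idea can be sketched as follows: compute NDVI = (NIR − Red) / (NIR + Red) before and after the event and flag pixels where vegetation greenness drops sharply. For Sentinel-2, the red and near-infrared bands are B04 and B08. This is a minimal sketch, not the method used in the study: the function names, the drop threshold of 0.2, and the toy reflectance values are illustrative assumptions.

```python
import numpy as np


def ndvi(red, nir, eps=1e-6):
    """Normalized Difference Vegetation Index; eps avoids division by zero."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    return (nir - red) / (nir + red + eps)


def ndvi_change_mask(pre_red, pre_nir, post_red, post_nir, drop_threshold=0.2):
    """Flag pixels whose NDVI decreased by more than drop_threshold.

    drop_threshold is an assumed illustrative value, not one taken from the study.
    """
    delta = ndvi(pre_red, pre_nir) - ndvi(post_red, post_nir)
    return delta > drop_threshold


# Toy 2x2 surface reflectances (hypothetical values)
pre_red = np.array([[0.05, 0.05], [0.05, 0.30]])
pre_nir = np.array([[0.45, 0.45], [0.45, 0.35]])
post_red = np.array([[0.05, 0.25], [0.06, 0.30]])
post_nir = np.array([[0.45, 0.30], [0.44, 0.35]])

mask = ndvi_change_mask(pre_red, pre_nir, post_red, post_nir)
```

In this toy scene only the top-right pixel, where dense vegetation is replaced by a bare-soil-like signal, exceeds the threshold. Because no cloud filtering is applied here, cloudy pixels in either image could produce spurious NDVI changes, which is the kind of effect a stratified benchmark is meant to expose.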
Benchmarking results will be presented using standardized metrics such as Intersection-over-Union, precision, recall, and false positive/negative rates, complemented by scene-level performance summaries. Rather than ranking models, the emphasis is on demonstrating how structured benchmarking can reveal context-dependent strengths and limitations of different approaches. The proposed framework aims to support reproducibility, informed model selection, and future integration into operational platforms, contributing to more reliable EO-based landslide mapping in disaster response settings.
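For reference, the pixel-wise metrics named above follow from the confusion counts of a predicted mask against the inventory mask: IoU = TP / (TP + FP + FN), precision = TP / (TP + FP), recall = TP / (TP + FN), false positive rate = FP / (FP + TN), and false negative rate = FN / (FN + TP). The sketch below is a minimal illustration under those standard definitions; the function name `detection_metrics` and the toy masks are hypothetical and do not represent the framework's code or results.

```python
import numpy as np


def detection_metrics(pred, truth):
    """Compute standard binary segmentation metrics from two masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = np.sum(pred & truth)    # landslide predicted and mapped
    fp = np.sum(pred & ~truth)   # predicted but not mapped
    fn = np.sum(~pred & truth)   # mapped but missed
    tn = np.sum(~pred & ~truth)  # correctly left unflagged

    def ratio(num, den):
        # Return NaN when a metric is undefined (zero denominator)
        return float(num) / float(den) if den > 0 else float("nan")

    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
    }


# Toy 2x2 masks: one pixel each of TP, FP, FN, TN
pred = np.array([[1, 1], [0, 0]])
truth = np.array([[1, 0], [1, 0]])
metrics = detection_metrics(pred, truth)
```

Returning NaN for undefined ratios, instead of zero, keeps scenes without any mapped landslides from silently lowering averaged scores.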
How to cite: Bonsu, L., Tanyaş, H., and Girgin, S.: Towards a Reproducible Benchmarking Framework for EO–Based Automated Landslide Detection Fueled by Landslides Triggered by the 2023 Türkiye Earthquake Sequence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20882, https://doi.org/10.5194/egusphere-egu26-20882, 2026.