Cross-Biome Feature Importance Stability Analysis for SAR-based Flood Mapping with Random Forests

Parisa Havakhor; Paul Hosch; Antara Dasgupta

doi:https://doi.org/10.5194/egusphere-egu26-1266

Session HS6.5

EGU26-1266, updated on 13 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Parisa Havakhor¹, Paul Hosch², and Antara Dasgupta²

¹Department of Civil Engineering and Architecture, University of Pavia, Pavia, Italy
²Department of Data Driven Computing in Civil Engineering, RWTH Aachen University, Aachen, Germany

Flood mapping with machine learning methods such as Random Forests (RF) requires informed feature engineering and selection. Although feature-importance rankings vary substantially across biomes and land covers, the stability of these rankings has not been evaluated specifically for RF-based flood delineation. In this study, we investigate the consistency of RF feature-importance rankings in a binary flood-classification task based primarily on Synthetic Aperture Radar (SAR) imagery. The feature stack comprises 14 variables, including 9 SAR-based features (Sentinel-1 VV and VH polarizations and their temporal-change metrics), which inform flood-extent identification, and 4 contextual features, such as land cover and topographic indices, which provide information on backscatter uncertainties. The classification task was conducted across 18 flood events spanning six distinct biomes: (1) Deserts and Xeric Shrublands, (2) Tropical and Subtropical Moist Broadleaf Forests, (3) Temperate Broadleaf and Mixed Forests, (4) Temperate Coniferous Forests, (5) Mediterranean Forests, Woodlands and Scrub, and (6) Temperate Grasslands, Savannas and Shrublands. Three feature-attribution methods were evaluated: (1) SHapley Additive exPlanations (SHAP), which provides a game-theoretic framework for feature attribution and is widely recognized for its consistency and interpretability; (2) Mean Decrease in Impurity (MDI), which is computed during tree growth and is the most commonly used importance metric for RF models; and (3) permutation feature importance, also known as Mean Decrease in Accuracy (MDA), a model-agnostic approach that assesses importance by measuring the reduction in model accuracy when a feature's values are randomly shuffled. Feature cardinality and feature correlation, which bias the rankings produced by these methods in different ways, were both considered during interpretation. All experiments were repeated across 10 independent iterations to account for random variability.
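As an illustration of the permutation-importance idea described above, the following is a minimal standard-library Python sketch, not the authors' implementation. The toy threshold classifier stands in for a trained RF, and the synthetic data, the -15 dB threshold, and all function names are assumptions made purely for illustration.

```python
import random


def accuracy(predict, X, y):
    """Fraction of samples for which predict(row) equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, breaking its link to the labels.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data (not from the study): column 0 mimics VV backscatter
# in dB, column 1 is random noise standing in for an uninformative feature.
data_rng = random.Random(42)
X = [[data_rng.uniform(-25.0, -5.0), data_rng.random()] for _ in range(200)]
y = [1 if row[0] < -15.0 else 0 for row in X]  # assumed flood threshold


def toy_classifier(row):
    """Stand-in for a trained model; it uses only column 0."""
    return 1 if row[0] < -15.0 else 0


importances = permutation_importance(toy_classifier, X, y)
```

Because the stand-in classifier ignores the noise column, shuffling that column leaves accuracy unchanged, whereas shuffling the backscatter column degrades it. With correlated features, as the abstract notes, such scores need careful interpretation.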
We first examined feature-importance rankings independently across the three sub-sample studies within each biome to establish baseline intra-biome variability, and then quantified inter-biome variability to assess whether feature-importance patterns transfer across different environmental conditions. Preliminary results for selected biomes indicate stable rankings for SAR-based features, with VV and VH event polarizations dominating the decision boundary, while contextual descriptors, particularly terrain indices such as Height Above the Nearest Drainage (HAND), exhibit greater variability both within and between biomes. Understanding how well feature-importance patterns and feature stacks transfer across biomes is critical for developing an RF-based flood-mapping pipeline that operates reliably under diverse environmental conditions worldwide and ultimately builds user trust in the resulting products.
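The abstract does not state which statistic is used to quantify ranking stability. One common option, shown here only as an assumption, is the mean pairwise Spearman rank correlation between the importance rankings of different events or biomes. The importance scores below are invented placeholders for four hypothetical feature labels, not results from the study.

```python
def ranks(scores):
    """Rank features by descending importance (1 = most important).

    Assumes there are no tied scores.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        result[idx] = position
    return result


def spearman(scores_a, scores_b):
    """Spearman rank correlation between two importance vectors (no ties)."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    n = len(ra)
    d2 = sum((a - b) ** 2 for a, b in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def mean_pairwise_spearman(score_sets):
    """Average Spearman correlation over all pairs of importance vectors."""
    pairs = [(i, j) for i in range(len(score_sets))
             for j in range(i + 1, len(score_sets))]
    total = sum(spearman(score_sets[i], score_sets[j]) for i, j in pairs)
    return total / len(pairs)


# Placeholder importance scores (invented for illustration only), ordered as
# the hypothetical feature labels ["VV", "VH", "land_cover", "HAND"].
event_a = [0.40, 0.35, 0.15, 0.10]
event_b = [0.45, 0.30, 0.10, 0.15]
event_c = [0.38, 0.36, 0.14, 0.12]

stability = mean_pairwise_spearman([event_a, event_b, event_c])
```

A value near 1 would indicate that the rankings agree closely across events, while values near 0 would indicate little agreement. Applying the same function to events within one biome and then to events from different biomes gives a simple way to compare intra-biome and inter-biome variability.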

How to cite: Havakhor, P., Hosch, P., and Dasgupta, A.: Cross-Biome Feature Importance Stability Analysis for SAR-based Flood Mapping with Random Forests, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1266, https://doi.org/10.5194/egusphere-egu26-1266, 2026.