EGU25-12889, updated on 15 Mar 2025
https://doi.org/10.5194/egusphere-egu25-12889
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Monday, 28 Apr, 10:05–10:15 (CEST)
Room E2
Decoding Wildfires - Extracting Interpretations and Causal Pathways of Catalysts for Wildfire Occurrence from Machine Learning Models
Hans Korving and Margreet Van Marle
  • Deltares, Netherlands (hans.korving@deltares.nl)

Machine learning (ML) models are widely used to predict wildfire occurrence and susceptibility (Brys et al., 2025). However, while these models excel at prediction, they often provide little insight into their inner workings and do not uncover the causal pathways driving wildfires. This study addresses this limitation by extending ML models beyond prediction to explore the drivers and causal pathways underlying wildfire occurrence. Our primary aim is to identify meaningful, interpretable patterns in wildfire data.

We developed a novel multi-stage clustering methodology inspired by Cooper et al. (2021) and Cohen et al. (2024). This approach integrates feature attribution (SHAP values), dimensionality reduction (UMAP), hierarchical density-based clustering (HDBSCAN), and three causal discovery methods: PC and FCI (Spirtes et al., 2001) and DirectLiNGAM (Shimizu et al., 2011). The causal methods were supplemented with background knowledge to derive meaningful insights. We applied the approach to wildfire datasets from Italy (Cilli et al., 2022) and the Netherlands.
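
To illustrate the idea behind constraint-based causal discovery methods such as PC and FCI, the following minimal Python sketch runs a single conditional-independence check using partial correlation and a Fisher z-test. It is not the implementation used in this study: the variable names (drought, fuel_dryness, fire), the synthetic linear-Gaussian data, and the helper functions are assumptions made for the example. Background knowledge would typically enter such a search as required or forbidden edges, which is not shown here.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def looks_independent(r, n_samples, n_conditioning, z_crit=1.96):
    """Fisher z-test: True if r is not significantly different from zero."""
    stat = math.sqrt(n_samples - n_conditioning - 3) * abs(math.atanh(r))
    return stat < z_crit


# Hypothetical synthetic chain: drought -> fuel_dryness -> fire (illustration only).
rng = random.Random(0)
n = 2000
drought = [rng.gauss(0, 1) for _ in range(n)]
fuel_dryness = [d + rng.gauss(0, 1) for d in drought]
fire = [f + rng.gauss(0, 1) for f in fuel_dryness]

r_marginal = pearson(drought, fire)
r_partial = partial_corr(drought, fire, fuel_dryness)

# Marginally, drought and fire are dependent, so an edge is kept at first.
marginal_independent = looks_independent(r_marginal, n, 0)
# Given fuel_dryness, the dependence should largely vanish, which is the
# kind of evidence a constraint-based search uses to remove a direct edge.
conditional_independent = looks_independent(r_partial, n, 1)
```

In this toy chain the marginal correlation is strong while the partial correlation is close to zero, so a constraint-based search would tend to drop the direct drought-fire edge and keep the path through fuel_dryness.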

A central feature of our methodology is the use of SHAP values to define subgroups and derive causal pathways. SHAP values reduce noise in the feature space while preserving critical information for clustering. By reducing the multidimensional SHAP values to two dimensions with UMAP, we improved clustering performance and interpretability. The resulting clusters were described using concise, non-overlapping decision rules based on the original variables, eliminating the manual filtering commonly required when clustering in the raw feature space. The identified clusters revealed specific relationships between wildfire drivers and occurrence. For each cluster, we applied the causal discovery techniques to derive probable causal pathways and aligned the findings with the knowledge of stakeholders and domain experts. The resulting explanations are interpretable and can be acted on in practice.
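
As a minimal sketch of what a SHAP-style attribution expresses, the Python example below computes exact Shapley values for a three-feature toy model by enumerating all feature coalitions. This is an illustration only, not the SHAP library and not the model used in this study: the scoring function, the feature names (temperature, dryness, distance_to_road), and the all-zero baseline are invented for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to one baseline point."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their value from x, the rest from the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    """Hypothetical susceptibility score with one interaction term."""
    temperature, dryness, distance_to_road = z
    return 0.5 * temperature + 2.0 * dryness * temperature - 0.3 * distance_to_road


x = [1.0, 1.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)

# The attributions sum to the difference between the prediction at x
# and the prediction at the baseline (the efficiency property).
total = sum(phi)
```

Each observation is thereby mapped to one attribution per feature, expressed in the units of the model output. Observations with similar attribution vectors are driven by similar factors, which is what makes this space a natural input for the subsequent dimensionality reduction and clustering.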

Findings from the case studies demonstrate that supervised clustering effectively characterizes wildfire occurrence by linking it to influencing factors. Furthermore, the approach provides valuable insights into cluster-specific causal pathways. The methodology translates complex relationships into simple causal logic, offering stakeholders and domain experts the necessary context to understand the model's behavior.

Brys, C., La Red Martínez, D.L. & Marinelli, M. (2025). Machine learning methods for wildfire risk assessment. Earth Science Informatics, 18, 148. https://doi.org/10.1007/s12145-024-01690-z

Cilli, R., Elia, M., D’Este, M., Giannico, V., Amoroso, N., Sanesi, G., Lombardi, A., Pantaleo, E., Monaco, A., Tangaro, S., Bellotti, R. & Lafortezza, R. (2022). Explainable artificial intelligence (XAI) detects wildfire occurrence in the Mediterranean countries of Southern Europe. Scientific Reports, 12, 16349. https://doi.org/10.1038/s41598-022-20347-9

Cohen, J., Huan, X. & Ni, J. (2024). Shapley-based explainable AI for clustering applications in fault diagnosis and prognosis. Journal of Intelligent Manufacturing, 35, 4071–4086. https://doi.org/10.1007/s10845-024-02468-2

Cooper, A., Doyle, O. & Bourke, A. (2021). Supervised clustering for subgroup discovery: An application to covid-19 symptomatology. Communications in Computer and Information Science, 1525, 408–422. https://doi.org/10.1007/978-3-030-93733-1_29

Shimizu, S., Inazumi, T., Sogawa, Y., Hyvärinen, A., Kawahara, Y., Washio, T., Hoyer, P. O. & Bollen, K. (2011). DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. The Journal of Machine Learning Research, 12, 1225–1248. https://doi.org/10.48550/arXiv.1101.2489

Spirtes, P., Glymour, C. & Scheines, R. (2001). Causation, Prediction, and Search. Second Edition. MIT Press. https://doi.org/10.7551/mitpress/1754.001.0001

How to cite: Korving, H. and Van Marle, M.: Decoding Wildfires - Extracting Interpretations and Causal Pathways of Catalysts for Wildfire Occurrence from Machine Learning Models, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-12889, https://doi.org/10.5194/egusphere-egu25-12889, 2025.