Comparing machine learning and statistical models for quantification of heat-attributable mortality across Europe

Sarah Wilson Kemsley; Jowan Fromentin; Bikem Pastine; Xiaowen Dong; Yuming Guo; Tom Matthews; Katrin Meissner; Sarah Perkins-Kirkpatrick; Louise Slater

doi:https://doi.org/10.5194/egusphere-egu26-5812

Session ITS4.19/CL0.10

EGU26-5812, updated on 13 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Sarah Wilson Kemsley¹, Jowan Fromentin², Bikem Pastine¹, Xiaowen Dong², Yuming Guo³, Tom Matthews⁴, Katrin Meissner⁵, Sarah Perkins-Kirkpatrick⁶, and Louise Slater¹

¹School of Geography and the Environment, University of Oxford, Oxford, United Kingdom
²Department of Engineering Science, University of Oxford, Oxford, United Kingdom
³Climate, Air Quality Research Unit, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
⁴Department of Geography, King’s College London, United Kingdom
⁵Climate Change Research Centre, University of New South Wales, Sydney, Australia
⁶Fenner School of Environment and Society, The Australian National University, Canberra, Australia

Extreme heat poses a major and growing risk to human health, yet accurately predicting its impacts on mortality remains challenging. In this study, we compared established non-linear statistical models, including the epidemiological standard, the distributed lag non-linear model (DLNM), with machine learning (ML) approaches for both predicting excess mortality and quantifying heat-attributable mortality across Europe. We evaluated random forest regressions (RFs) and neural networks (NNs) trained on pooled European data, contrasting their performance with two-stage DLNMs and locally fitted generalized additive models. In each model, we included the lagged effect of temperature, and we additionally explored the inclusion of multiple environmental exposure variables (such as air pollution and humidity).
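As an illustration of what including the lagged effect of temperature can mean in practice for an ML model, the following minimal sketch builds a matrix of lagged temperature predictors from a daily series. It is not the authors' pipeline: the function name `lagged_features`, the choice of a two-day maximum lag, and the temperature values are all hypothetical.

```python
from typing import List, Sequence


def lagged_features(temps: Sequence[float], max_lag: int) -> List[List[float]]:
    """Return one row per day t >= max_lag: [T_t, T_(t-1), ..., T_(t-max_lag)].

    Hypothetical helper for illustration; the abstract does not say how
    the lagged predictors were constructed.
    """
    if max_lag < 0:
        raise ValueError("max_lag must be non-negative")
    rows = []
    # The first max_lag days are skipped because their lag history is incomplete.
    for t in range(max_lag, len(temps)):
        rows.append([temps[t - lag] for lag in range(max_lag + 1)])
    return rows


# Made-up daily mean temperatures in degrees Celsius.
daily_temps = [18.0, 21.5, 25.0, 31.0, 33.5, 29.0]
X = lagged_features(daily_temps, max_lag=2)
```

Each row of `X` could then be passed, together with any other exposure variables, to a regression model such as a random forest.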

We assessed each model’s out-of-sample skill for predicting excess mortality. Our preliminary findings suggest that the ML frameworks tend to improve skill across Europe. Notably, we found evidence that pooled ML models improve predictive performance for countries with fewer observations, suggesting that they are better able to learn from shared, diverse regional information. We also compared the spatial patterns and magnitudes of heat-attributable mortality estimated by the ML models with those from the DLNM, which served as the benchmark. Together, our findings highlight the potential for ML-based frameworks to inform future heat-health impact assessments.
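The abstract does not state how heat-attributable mortality is derived from the ML models. One common option is a counterfactual comparison: predict mortality under the observed temperature and again with temperature capped at a reference threshold, then sum the positive differences. The sketch below assumes that approach; `heat_attributable_deaths`, `toy_model`, the 25 °C threshold, and all numbers are hypothetical.

```python
from typing import Callable, Sequence


def heat_attributable_deaths(
    predict: Callable[[float], float],
    temps: Sequence[float],
    threshold: float,
) -> float:
    """Sum predicted deaths above a counterfactual with temperature capped.

    Assumed approach for illustration, not the method of the study.
    """
    total = 0.0
    for temp in temps:
        observed = predict(temp)
        # Counterfactual: the same day, with temperature capped at the threshold.
        counterfactual = predict(min(temp, threshold))
        total += max(observed - counterfactual, 0.0)
    return total


def toy_model(temp: float) -> float:
    """Hypothetical stand-in for a fitted model: 2 extra deaths per degree above 25."""
    return 100.0 + 2.0 * max(temp - 25.0, 0.0)


deaths = heat_attributable_deaths(toy_model, [20.0, 27.0, 30.0], threshold=25.0)
```

With this toy model, the three days contribute 0, 4, and 10 deaths respectively, so `deaths` is 14.0.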

How to cite: Wilson Kemsley, S., Fromentin, J., Pastine, B., Dong, X., Guo, Y., Matthews, T., Meissner, K., Perkins-Kirkpatrick, S., and Slater, L.: Comparing machine learning and statistical models for quantification of heat-attributable mortality across Europe, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5812, https://doi.org/10.5194/egusphere-egu26-5812, 2026.