- 1Laboratoire des Sciences du Climat et de l'Environnement, CNRS-CEA-UVSQ, Gif sur Yvette, France (adrien.burq@lsce.ipsl.fr)
- 2Descartes Underwriting, Paris, France
- 3London Mathematical Laboratory, 8 Margravine Gardens, London, W6 8RH, United Kingdom
- 4LMD-IPSL, Ecole Polytechnique, Institut Polytechnique de Paris, ENS, PSL Research University, Sorbonne Université, CNRS, Palaiseau, France
Thunderstorms are associated with several hazards such as lightning, hail, wind gusts, tornadoes and heavy rain. Because their processes act on small spatial and temporal scales, studying the convective events associated with thunderstorms is challenging and requires high-resolution observations and models.
Traditional methods for studying thunderstorm climatology and associated hazards often rely on analyzing long-term trends of key environmental variables. These variables are derived from low-resolution reanalysis datasets such as ERA5 and are aggregated over extended time periods (Taszarek et al., 2021). These methods also derive the probability of occurrence of convective events, providing useful insights into large-scale climatological trends (Battaglioli et al., 2023). However, they struggle to capture the fine-scale characteristics of individual events. Their representations tend to suffer from probabilistic smoothing effects, such as overestimating hazard occurrence in regions without activity and underestimating it in areas of high activity, which leads to unrealistic distributions of hazards in space, time, and intensity.
Machine learning approaches, including random forests, gradient boosting and, more recently, deep learning models, have improved short-term lightning nowcasting (McGovern et al., 2023). However, they rely on high-resolution inputs such as radar and satellite data, whose limited spatial and temporal coverage restricts where and when they can be applied. To address this limitation, we develop a deep learning model tailored to reanalysis data, which allows us to apply it on a global scale and over a much longer period of time.
Given the thermodynamic state of the atmosphere, we generate an ensemble forecast of lightning at hourly temporal resolution and 0.25° spatial resolution. In contrast to previous models (Battaglioli et al., 2023), our model takes 3D inputs and directly outputs a lightning map with statistically coherent spatial structures. In addition, we produce ensemble predictions to capture a wide range of realistic scenarios for a given set of thermodynamic variables. Finally, we include additional input variables to leverage deep learning's ability to automatically capture complex dependencies between them.
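As an illustration only, the relationship between individual ensemble members and a per-cell probability map can be sketched in a few lines of Python. This is not the model described in this abstract: `toy_member` is a hypothetical stand-in that places one compact lightning cluster at random, and the function names, the 6 x 6 grid and the ensemble size of 20 are all assumptions chosen for the example. The only fixed fact used is that a regular global 0.25° grid including both poles has 721 x 1440 points.

```python
import random


def grid_shape(resolution_deg=0.25):
    """Return (n_lat, n_lon) for a regular global grid that includes both poles."""
    n_lat = int(round(180 / resolution_deg)) + 1
    n_lon = int(round(360 / resolution_deg))
    return n_lat, n_lon


def toy_member(rng, rows, cols, size=2):
    """Hypothetical scenario generator (NOT the abstract's model):
    a binary map with one size x size lightning cluster at a random position."""
    top = rng.randrange(rows - size + 1)
    left = rng.randrange(cols - size + 1)
    return [
        [1 if top <= i < top + size and left <= j < left + size else 0
         for j in range(cols)]
        for i in range(rows)
    ]


def ensemble_probability(members):
    """Per-cell fraction of ensemble members (0/1 maps) that contain lightning."""
    n = len(members)
    rows, cols = len(members[0]), len(members[0][0])
    return [
        [sum(m[i][j] for m in members) / n for j in range(cols)]
        for i in range(rows)
    ]


rng = random.Random(0)  # fixed seed so the toy ensemble is reproducible
members = [toy_member(rng, rows=6, cols=6) for _ in range(20)]
prob = ensemble_probability(members)
```

In this toy setting every member is a sharp, spatially coherent map, whereas `prob` spreads the same total activity over many cells at lower values. This is the kind of smoothing that a single probability map introduces and that an ensemble of scenarios is meant to avoid.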
Because ERA5 data cover a long time period, our model will later enable us to perform attribution studies for specific events and to create a stochastic catalogue of lightning events.
How to cite: Burq, A., Cazzaniga, G., Faranda, D., Vrac, M., and Xing, V.: A novel Deep Learning framework for lightning probabilistic prediction based on ERA5 reanalysis data and lightning observations, EMS Annual Meeting 2025, Ljubljana, Slovenia, 7–12 Sep 2025, EMS2025-272, https://doi.org/10.5194/ems2025-272, 2025.