ECSS2025-229, updated on 05 Oct 2025
https://doi.org/10.5194/ecss2025-229
12th European Conference on Severe Storms
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Exploring Tree-Based Machine Learning Methods for Estimation of Hail Sizes
Amruta Vurakaranam1, Christian Berndt2, Katharina Lengfeld2, Lukas Josipovic2, Markus Schultze2, and Katharina Schröer1
  • 1Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
  • 2Deutscher Wetterdienst, Offenbach, Germany

Hail remains one of the least understood severe weather hazards in Germany. It is difficult to forecast and contributes to substantial economic losses, particularly in agriculture, infrastructure, and the related insurance sectors. While the occurrence and probability of hail have been studied, estimating hail size remains a key open research question from both a forecasting and a climatological perspective.

This study is part of the HAIPI project (Hailstorm Analysis, Impact, and Prediction Initiative), funded by Deutscher Wetterdienst (DWD), the German weather service. The project aims to improve hail size estimation by leveraging several newly developed datasets: advanced polarimetric radar products, numerical weather prediction (NWP) outputs, lightning data, and crowd-sourced observations from platforms such as the European Severe Weather Database (ESWD) and the DWD WarnWetter app.

We present first results from a set of tree-based machine learning approaches, including Random Forests and Gradient Boosting methods. These models incorporate atmospheric variables such as convective available potential energy (CAPE) and wind shear, together with radar products from the DWD’s KONRAD3D forecast system. Model performance is compared for binary classification (distinguishing severe from non-severe hail under various threshold definitions) and for multiclass classification into three hail size categories: Category 1 (<2 cm), Category 2 (2–5 cm), and Category 3 (≥5 cm).
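
The two labelling schemes described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the example diameters, and the binary thresholds (2, 3, and 5 cm) are hypothetical, since the abstract only mentions "various threshold definitions". The sketch also assumes that a diameter of exactly 5 cm falls into Category 3, which the category bounds leave open.

```python
def multiclass_label(diameter_cm: float) -> int:
    """Map a hail diameter in cm to Category 1, 2, or 3."""
    if diameter_cm < 0:
        raise ValueError("diameter must be non-negative")
    if diameter_cm < 2.0:
        return 1  # Category 1: < 2 cm
    if diameter_cm < 5.0:
        return 2  # Category 2: 2 cm up to (assumed) just below 5 cm
    return 3      # Category 3: >= 5 cm (assumption: 5 cm itself lands here)


def binary_label(diameter_cm: float, threshold_cm: float) -> int:
    """Return 1 (severe) if the diameter reaches the threshold, else 0."""
    return int(diameter_cm >= threshold_cm)


# Made-up report diameters in cm, for illustration only.
reports_cm = [0.5, 1.9, 2.0, 3.5, 5.0, 7.0]

multiclass = [multiclass_label(d) for d in reports_cm]

# Hypothetical severity thresholds; the abstract does not list the ones used.
binary_by_threshold = {
    t: [binary_label(d, t) for d in reports_cm] for t in (2.0, 3.0, 5.0)
}

print(multiclass)
print(binary_by_threshold)
```

Changing the severity threshold shifts the class balance of the binary task, which is one reason a comparison across threshold definitions is informative.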

A preliminary model achieves around 70% accuracy, with balanced performance across the hail size classes, demonstrating strong potential for operational forecasting. Feature importance analysis identifies radar-derived vertical extent features (e.g., vertical_extent, echo_top_threshold_55dBZ) and model-based reflectivity metrics (e.g., cell_based_VIL) as key predictors. These initial findings highlight the value of integrating radar, model-based, and crowd-sourced data to improve hail size prediction.
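
To make the distinction between overall accuracy and balanced per-class performance concrete, the following sketch computes both from a list of true and predicted categories. It is an illustration under stated assumptions, not the evaluation used in the study: the labels are made up, the helper names are hypothetical, and "balanced accuracy" is taken here to mean the unweighted mean of per-class recall.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted category matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per category: correct predictions divided by true members."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in sorted(totals)}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Made-up, imbalanced example: small hail (Category 1) dominates the sample.
y_true = [1] * 8 + [2] * 4 + [3] * 2
y_pred = [1] * 7 + [2] + [2, 2, 1, 3] + [3, 2]

print("accuracy:", accuracy(y_true, y_pred))
print("recall per category:", per_class_recall(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

In this toy case the overall accuracy is higher than the balanced accuracy because the most frequent category is also the easiest one, which is why per-class results matter when large hail is rare.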

How to cite: Vurakaranam, A., Berndt, C., Lengfeld, K., Josipovic, L., Schultze, M., and Schröer, K.: Exploring Tree-Based Machine Learning Methods for Estimation of Hail Sizes, 12th European Conference on Severe Storms, Utrecht, The Netherlands, 17–21 Nov 2025, ECSS2025-229, https://doi.org/10.5194/ecss2025-229, 2025.
