ECSS2023-6
https://doi.org/10.5194/ecss2023-6
11th European Conference on Severe Storms
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

A machine learning approach to mitigate problems with estimated winds in severe thunderstorm wind damage reports

William Gallus1, Elizabeth Tirone1, Subrata Pal2, Somak Dutta2, Ranjan Maitra2, Jennifer Newman3, and Eric Weber3
  • 1Iowa State University, Geological and Atmospheric Sciences, Ames, IA, United States of America (wgallus@iastate.edu)
  • 2Iowa State University, Statistics, Ames, IA, United States of America
  • 3Iowa State University, Mathematics, Ames, IA, United States of America

In the United States, the official database of severe thunderstorm wind reports arguably has more serious deficiencies than those for tornadoes and hail. Roughly 90% of the thunderstorm wind reports in the Storm Events database during the period 2007-2021 are estimates without any nearby measurement. Of those estimates, 40% have a value of exactly 50 knots, compared to only 13% of the measurements. Because 50 knots is the minimum threshold for a wind to be considered severe, this strongly suggests that many of the estimates may be overestimates. The problems in the database negatively impact the development of new forecasting tools and their verification. We have tested six different machine learning approaches, training on roughly 20,000 measured reports during 2007-2017, to create a tool that assigns a probability that any severe thunderstorm wind report is due to winds of 50 knots or greater. Training is based on date, time, location, and episode and event narrative data from the Storm Events database, along with 31 near-storm weather parameters from the Storm Prediction Center mesoanalysis output. In addition, population density and elevation are used. Land use and radar reflectivity were also tested but were found not to improve performance. The best-performing algorithm, the Stacked Generalized Linear Model, has been found to show high skill, with areas under the ROC curve as high as 0.90 and Brier Scores around 0.1. When a supplemental sub-severe database is added for testing, reliability is shown to be good. Subjective evaluations from testing during three years of NOAA Hazardous Weather Testbed Spring Forecast Experiments have been favorable and will be discussed, along with implications for forecasters. A recent test found that the average probability for estimated 50 knot wind reports is only 57%, whereas it is 81% for measured 50 knot reports, supporting the view of many forecasters that overestimates are a large problem among the estimated reports in the database.
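The two verification scores quoted above can be illustrated with a minimal sketch. This is not the authors' code: the function names and the six toy probability/outcome pairs are assumptions invented for illustration, with an outcome of 1 meaning the wind was 50 knots or greater and 0 meaning sub-severe. The Brier Score is the mean squared difference between the assigned probability and the observed outcome (lower is better), and the area under the ROC curve equals the chance that a randomly chosen severe report receives a higher probability than a randomly chosen sub-severe one (higher is better).

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probability and 0/1 outcome."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def roc_auc(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    severe = [p for p, o in zip(probs, outcomes) if o == 1]
    sub_severe = [p for p, o in zip(probs, outcomes) if o == 0]
    if not severe or not sub_severe:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for s in severe:
        for n in sub_severe:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(severe) * len(sub_severe))


# Hypothetical toy data, NOT taken from the study:
# probability assigned to each report, and whether the wind was truly >= 50 knots.
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_outcomes = [1, 1, 0, 1, 0, 0]

print(f"Brier Score: {brier_score(toy_probs, toy_outcomes):.3f}")
print(f"ROC AUC:     {roc_auc(toy_probs, toy_outcomes):.3f}")
```

The pairwise AUC computation is quadratic in the number of reports, which is fine for a small illustration; for a data set of the size described in the abstract, a rank-based or library implementation would be the more practical choice.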

How to cite: Gallus, W., Tirone, E., Pal, S., Dutta, S., Maitra, R., Newman, J., and Weber, E.: A machine learning approach to mitigate problems with estimated winds in severe thunderstorm wind damage reports, 11th European Conference on Severe Storms, Bucharest, Romania, 8–12 May 2023, ECSS2023-6, https://doi.org/10.5194/ecss2023-6, 2023.