4-9 September 2022, Bonn, Germany
EMS Annual Meeting Abstracts
Vol. 19, EMS2022-413, 2022, updated on 26 Mar 2024
https://doi.org/10.5194/ems2022-413
EMS Annual Meeting 2022
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Ensemble dispersion improvement auto-tune (EDIT), a generic post processing step for Machine Learning regression results with quantified uncertainty

Jouke de Baar, Cees de Valk, and Gerard van der Schrier
  • Royal Netherlands Meteorological Institute (KNMI), De Bilt, The Netherlands (jouke.de.baar@knmi.nl)

Machine learning (ML) approaches are increasingly expected to provide a reliable quantification of the uncertainty of their results. Without quantified uncertainties, it is difficult to compare ML-based results or predictions, and users of ML results are increasingly inclined to take uncertainties into account in their decision-making. Besides the results themselves, it is therefore becoming important to provide a quantified statement of accuracy; one might even argue that an ML result without quantified uncertainty is incomplete. An important generic question is then: how do we ensure that the reported uncertainties are indeed reliable?

We approach this question by developing and applying ensemble dispersion improvement auto-tune (EDIT) for spatial regression. Cross-validation is often used to check the quality of an ensemble prediction a posteriori, for example by constructing rank histograms; in EDIT, we instead use cross-validation to optimize the ensemble spread. The correction is made as a function of EDIT proxies, that is, covariates that may indicate the magnitude of the correction required. In our case, we use a multi-objective optimization that targets both the flatness of the regression rank histogram and an accurate dependence of the regression uncertainty on the proxies.
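
To make the idea of tuning the spread against a rank histogram concrete, the following is a minimal sketch and not the EDIT implementation itself. It assumes a single global scaling factor (no proxies), a simple chi-square-like flatness cost, a plain grid search instead of a multi-objective optimization, and synthetic data in which `obs` stands in for held-out cross-validation observations. All function and variable names are hypothetical.

```python
import random

def rank_histogram(obs, ens):
    """Count how often each observation rank (0..m) occurs among m members."""
    m = len(ens[0])
    counts = [0] * (m + 1)
    for y, members in zip(obs, ens):
        rank = sum(1 for x in members if x < y)  # ties ignored in this sketch
        counts[rank] += 1
    return counts

def flatness_cost(counts):
    """Chi-square-like distance of the histogram from a uniform one."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 for c in counts) / expected

def scale_spread(ens, factor):
    """Stretch each ensemble about its own mean by `factor`."""
    scaled = []
    for members in ens:
        mean = sum(members) / len(members)
        scaled.append([mean + factor * (x - mean) for x in members])
    return scaled

def tune_factor(obs, ens, candidates):
    """Grid search for the factor that gives the flattest rank histogram."""
    return min(
        candidates,
        key=lambda f: flatness_cost(rank_histogram(obs, scale_spread(ens, f))),
    )

# Synthetic, deliberately under-dispersive ensemble (assumption for illustration):
# the observation error has standard deviation 1.0, the ensemble spread only 0.5.
rng = random.Random(0)
n, m = 2000, 20
centre = [rng.gauss(5.0, 1.0) for _ in range(n)]
obs = [c + rng.gauss(0.0, 1.0) for c in centre]
ens = [[c + rng.gauss(0.0, 0.5) for _ in range(m)] for c in centre]

candidates = [0.5 + 0.1 * k for k in range(36)]  # factors 0.5 ... 4.0
best = tune_factor(obs, ens, candidates)
cost_before = flatness_cost(rank_histogram(obs, ens))
cost_after = flatness_cost(rank_histogram(obs, scale_spread(ens, best)))
print(f"best factor: {best:.1f}, cost before: {cost_before:.1f}, after: {cost_after:.1f}")
```

In this toy setting the selected factor should lie near 2, since the ensemble spread was set to half the observation error. A proxy-dependent correction, as described above, would replace the single factor by a function of the covariates.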

In this work, we consider the example of spatial regression of in situ observations of daily mean wind speed in Europe for the period 1980–2021, as part of the E-OBS data set. In such products, we provide maps of historical wind speed and communicate the uncertainty in our results by providing an ensemble of maps. Important EDIT proxies for the ensemble dispersion correction are the distance to the nearest station and the complexity of the terrain. Using EDIT, we see a significant improvement in the reliability of the ensemble dispersion.
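
As an illustration of one of the proxies named above, the distance to the nearest station could be computed per grid point as sketched below. The abstract does not state how this distance is defined; the great-circle (haversine) distance on a spherical Earth is an assumption, and the station coordinates are made-up placeholders, not E-OBS stations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_to_nearest_station(point, stations):
    """Proxy value for one grid point: distance to the closest station in km."""
    lat, lon = point
    return min(haversine_km(lat, lon, s_lat, s_lon) for s_lat, s_lon in stations)

# Hypothetical station locations (lat, lon), for illustration only.
stations = [(52.10, 5.18), (50.73, 7.10)]
grid_points = [(52.10, 5.18), (51.50, 6.00), (48.00, 10.00)]

proxy = [distance_to_nearest_station(p, stations) for p in grid_points]
for p, d in zip(grid_points, proxy):
    print(p, f"{d:.1f} km")
```

For a large grid and many stations, a spatial index would be more efficient than this brute-force minimum; the sketch only shows what the proxy represents.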

How to cite: de Baar, J., de Valk, C., and van der Schrier, G.: Ensemble dispersion improvement auto-tune (EDIT), a generic post processing step for Machine Learning regression results with quantified uncertainty, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-413, https://doi.org/10.5194/ems2022-413, 2022.
