Statistical post-processing techniques for weather, climate, and hydrological forecasts are powerful approaches to compensate for effects of errors in model structure or initial conditions, and to calibrate inaccurately dispersed ensembles. These techniques are now an integral part of many forecasting suites and are used in many end-user applications such as wind energy production or flood warning systems. Many of these techniques are flourishing in the statistical, meteorological, climatological, hydrological, and engineering communities. The methods range in complexity from simple bias correction up to very sophisticated distribution-adjusting techniques that take into account correlations among the prognostic variables.
At the same time, considerable effort is being put into combining multiple forecasting sources in order to obtain reliable and seamless forecasts on time ranges from minutes to weeks. Such blending techniques are currently being developed in many meteorological centers.
In this session, we invite presentations dealing with both theoretical developments in statistical post-processing and evaluation of their performances in different practical applications oriented toward environmental predictions, and new developments dealing with the problem of combining or blending different types of forecasts in order to improve reliability from very short to long time scales.
vPICO presentations: Wed, 28 Apr
Sequential aggregation is a theoretically grounded means of combining several forecasts of a quantity to achieve better forecast performance as evaluated by a loss function. This theory has mainly been applied to point forecasts of a scalar quantity, but rarely to forecasts expressed in probabilistic form. In this work, we show how this theory can be readily adapted to forecasts expressed as step-wise cumulative distribution functions (CDFs), with the continuous ranked probability score (CRPS) as performance measure.
Ensemble weather forecasts estimate the outcome of future observed quantities in a way that can be interpreted as a step-wise CDF. Since those forecast CDFs are biased, statistical postprocessing methods are used to improve their statistical coherency with the observed quantity. Since many ensembles and many postprocessing methods exist, one can combine raw and post-processed ensembles in order to get even better forecast performance. To illustrate this point and the advantages of blending CDFs, sequential aggregation is applied to wind-speed ensemble weather forecasts, with the CRPS as a performance measure alongside the Jolliffe-Primo test to assess the reliability of the various (raw, post-processed or aggregated) forecasts.
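As an illustration of the idea described above, the following is a minimal sketch of sequential aggregation of step-wise CDFs under the CRPS. It is not the authors' implementation: the exponentially weighted average rule, the learning rate `eta`, and all function names are assumptions chosen for illustration. Each expert is an ensemble, the aggregated forecast is the weighted mixture of the experts' step-wise CDFs, and expert weights are updated from cumulative CRPS.

```python
import math

def crps_weighted_ensemble(members, weights, obs):
    """CRPS of a step-wise CDF that jumps by weights[i] at members[i]."""
    term1 = sum(w * abs(x - obs) for x, w in zip(members, weights))
    term2 = sum(wi * wj * abs(xi - xj)
                for xi, wi in zip(members, weights)
                for xj, wj in zip(members, weights))
    return term1 - 0.5 * term2

def pool(experts, expert_weights):
    """Mix several ensembles into one weighted ensemble (a mixture of CDFs)."""
    members, weights = [], []
    for ens, w in zip(experts, expert_weights):
        for x in ens:
            members.append(x)
            weights.append(w / len(ens))
    return members, weights

def ewa_weights(cum_losses, eta):
    """Exponentially weighted average weights (assumed aggregation rule)."""
    best = min(cum_losses)  # subtract the minimum for numerical stability
    raw = [math.exp(-eta * (loss - best)) for loss in cum_losses]
    total = sum(raw)
    return [r / total for r in raw]

def aggregate(forecast_stream, observations, eta=1.0):
    """Return the mean CRPS of the aggregated forecast and the final weights."""
    cum = [0.0] * len(forecast_stream[0])
    total = 0.0
    for experts, y in zip(forecast_stream, observations):
        # Weights use only past losses, so the procedure is truly sequential.
        w = ewa_weights(cum, eta)
        members, weights = pool(experts, w)
        total += crps_weighted_ensemble(members, weights, y)
        for k, ens in enumerate(experts):
            uniform = [1.0 / len(ens)] * len(ens)
            cum[k] += crps_weighted_ensemble(ens, uniform, y)
    return total / len(observations), ewa_weights(cum, eta)
```

For example, `aggregate([([1.0, 1.2], [5.0, 6.0])] * 3, [1.1] * 3)` treats two hypothetical two-member ensembles as experts; the weight of the first, which is much closer to the observations, approaches one after a few steps.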
How to cite: Zamo, M., Bel, L., and Mestre, O.: Sequential Aggregation of Probabilistic Forecasts - Applicaton to Wind Speed Ensemble Forecasts, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-11193, https://doi.org/10.5194/egusphere-egu21-11193, 2021.
To account for uncertainty in numerical weather prediction (NWP) models, it has become common practice to employ ensembles of NWP forecasts. However, forecast ensembles often exhibit forecast biases and dispersion errors, and thus require statistical postprocessing to improve the reliability of the ensemble forecasts.
This work proposes an extension of a recently developed postprocessing model for temperature utilizing autoregressive information present in the forecast error of the raw ensemble members. The original approach is modified to let the variance parameter additionally depend on the ensemble spread, yielding a two-fold heteroscedastic model. Furthermore, a high-resolution forecast is included in the postprocessing model, yielding improved predictive performance. Finally, it is outlined how the autoregressive model can be utilized to postprocess ensemble forecasts with longer forecast horizons, without the necessity of making fundamental changes to the original model. To illustrate the performance of the heteroscedastic extension of the autoregressive model and its use for longer forecast horizons, we present a case study for a data set containing 12 years of temperature forecasts and observations over Germany. The case study indicates that the autoregressive model yields particularly strong improvements for forecast horizons beyond 24 hours.
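To make the notion of "autoregressive information in the forecast error" concrete, here is a heavily simplified sketch. It assumes an AR(1) process for the error of a single ensemble member and corrects only the mean; the heteroscedastic variance model and the high-resolution forecast described above are not represented. Function names and the least-squares estimator are illustrative assumptions, not the authors' method.

```python
def fit_ar1(errors):
    """Estimate the mean and lag-1 coefficient of an error series (least squares)."""
    mu = sum(errors) / len(errors)
    dev = [e - mu for e in errors]
    num = sum(a * b for a, b in zip(dev[1:], dev[:-1]))
    den = sum(d * d for d in dev[:-1])
    coef = num / den if den > 0 else 0.0  # no variation: fall back to pure bias
    return mu, coef

def ar_corrected_forecast(past_forecasts, past_obs, new_forecast):
    """Add the AR(1) prediction of the next error to the new raw forecast."""
    errors = [o - f for f, o in zip(past_forecasts, past_obs)]
    mu, coef = fit_ar1(errors)
    predicted_error = mu + coef * (errors[-1] - mu)
    return new_forecast + predicted_error
```

With a constant error the sketch reduces to a plain bias correction: `ar_corrected_forecast([10.0, 10.0, 10.0], [12.0, 12.0, 12.0], 15.0)` returns `17.0`.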
How to cite: Möller, A. and Groß, J.: Probabilistic Temperature Forecasting with a Heteroscedastic Autoregressive Ensemble Postprocessing model, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-415, https://doi.org/10.5194/egusphere-egu21-415, 2021.
We conduct a systematic and comprehensive comparison of state-of-the-art postprocessing methods for ensemble forecasts of wind gusts. The compared approaches range from well-established techniques to novel neural network-based methods. Our study is based on a 6-year dataset of forecasts from the convection‐permitting COSMO‐DE ensemble prediction system, with hourly lead times up to 21 hours and forecasts of 57 meteorological variables, and corresponding observations from 175 weather stations over Germany. We find that simpler methods such as ensemble model output statistics (EMOS), member-by-member postprocessing and a novel isotonic distributional regression approach, which use ensemble forecasts of wind gusts as sole inputs, already yield improvements in the mean CRPS of up to 40% compared to the raw ensemble predictions. This can be substantially improved upon by more complex machine learning methods such as gradient boosting-based extensions of EMOS, quantile regression forests, and variants of neural network-based approaches that are capable of incorporating additional information from the large variety of available predictor variables.
How to cite: Schulz, B. and Lerch, S.: Statistical and machine learning methods for postprocessing ensemble forecasts of wind gusts, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-1326, https://doi.org/10.5194/egusphere-egu21-1326, 2021.
Over the last decades, the use of climate models in the projection and assessment of future climate conditions, both on global and regional scales, has become common practice. However, inevitable biases between the simulated model output and observed conditions remain, mainly due to the variable nature of the atmospheric system, and limitations in representing sub-grid-scale processes that need to be parameterized. The present study aims to test a new approach for increasing the accuracy of daily climate model output. We apply the recently introduced TIN-Copula statistical method to the results of a state-of-the-art global Earth System Model (Hadley Centre Global Environmental Model version 3 - HadGEM3). The TIN-Copula approach is a combination of Triangular Irregular Networks and Copulas that focuses on modeling the whole dependence structure of the studied variables. The study area of the current application is the Middle East and North Africa (MENA) region, a prominent global climate change hot-spot. Considering the lack of accurate and consistent observational records in the MENA, we used the ERA5 reanalysis dataset as a reference. The results of the study reveal that the TIN-Copula method significantly improves the simulation of maximum temperature, both on annual and seasonal time scales. Specifically, the HadGEM3 model tends to overestimate the ERA5 temperature data in the major part of the MENA region. This overestimation is mainly evident for the lower values of the studied data sets during all seasons, while in summer the overestimation is found in the whole data set. However, after applying the TIN-Copula method, the differences between the simulated maximum temperature and the ERA5 data were minimized in more than 85% of the studied grid cells.
How to cite: Lazoglou, G., Zittis, G., Hadjinicolaou, P., and Lelieveld, J.: TIN-Copula bias correction of climate modeled daily maximum temperature in the MENA region, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-1988, https://doi.org/10.5194/egusphere-egu21-1988, 2021.
Statistical post-processing of ensemble weather forecasts has become an essential step in the forecasting chain as it enables the correction of biases and reliable uncertainty estimates of ensembles (Gneiting, 2014). One algorithm recently proposed to perform the correction of ensemble weather forecasts is a linear member-by-member (MBM) Model Output Statistics (MOS) system, post-processing each member of the ECMWF ensemble (Van Schaeybroeck & Vannitsem, 2015). This method consists in correcting the mean and variability of the ensemble members in line with the observed climatology. At the same time, it calibrates the ensemble spread such as to match, on average, the mean square error of the ensemble mean. The MBM method calibrates the ensemble forecasts based on the station observations by minimizing the continuous ranked probability score (CRPS).
Using this method, the Royal Meteorological Institute of Belgium started its new postprocessing program in 2020 by developing an operational application to calibrate the ECMWF ensemble forecasts at station points for minimum and maximum temperature, and for wind gusts. In this report, we will first briefly describe the postprocessing methods being used and the architecture of the application. We will then present the results over the first few months of operation. Finally, we will discuss the future developments of this application and of the program.
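The linear member-by-member correction described above can be sketched as follows. This is an illustrative simplification, not the operational code: the coefficient names `alpha`, `beta` and `tau` are assumptions, `tau` is taken as a constant, and the coefficients are supplied by hand, whereas in the method described they are estimated, for instance by minimizing the CRPS against station observations.

```python
def mbm_correct(ensemble, alpha, beta, tau):
    """Shift and scale the ensemble mean, and rescale each member's deviation."""
    mean = sum(ensemble) / len(ensemble)
    # alpha and beta correct the ensemble mean; tau inflates or deflates the
    # spread. Every member is corrected individually, so for tau > 0 the rank
    # order of the members (and hence their dependence structure) is preserved.
    return [alpha + beta * mean + tau * (x - mean) for x in ensemble]
```

For instance, `mbm_correct([1.0, 2.0, 3.0], alpha=0.5, beta=1.0, tau=2.0)` returns `[0.5, 2.5, 4.5]`: the mean moves from 2.0 to 2.5 and the spread doubles.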
Gneiting, T., 2014: Calibration of medium-range weather forecasts. ECMWF Technical Memorandum No. 719
Van Schaeybroeck, B. & Vannitsem, S., 2015: Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141, 807–818.
How to cite: Demaeyer, J., Van schaeybroeck, B., and Vannitsem, S.: Statistical post-processing of ensemble forecasts at the Belgian met service, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-2495, https://doi.org/10.5194/egusphere-egu21-2495, 2021.
A hydrological forecasting system coupled with precipitation forecasts can extend the lead time of early warning information, but this comes with higher uncertainty. With the improvement of hydrological models, the precipitation forecast may be the largest source of uncertainty. Therefore, before incorporating it into the hydrological model, the precipitation forecast needs post-processing to reduce its uncertainty. Meteorological post-processing corrects the bias of future precipitation forecasts by establishing a linear or non-linear relationship between historical observations and simulations. Machine learning (ML) can fit this relationship and process higher-dimensional predictor features, which makes it a promising method to improve the accuracy of precipitation forecasts. In this study, we selected the Yalong River basin of China as the case study and compared the performance of 20 different machine learning algorithms (e.g., ridge regression, random forest, and artificial neural network). The daily hindcast data (1985-2018) from NOAA’s Global Ensemble Forecast System and corresponding observations from the China Meteorological Administration were selected to construct our data set. To improve the accuracy of the precipitation forecasts, we also screened different combinations of predictors to optimize the model configuration of machine learning, including space, time, and ensemble members. Comparative experiments show that all ML models can improve the accuracy of the raw precipitation forecast, but their performance differs. The extra-trees model has the best results, followed by LightGBM. However, linear regression models perform relatively poorly. The predictor combination of 11 ensemble members and a 2-day time window achieves the best precipitation forecast.
The post-processing of precipitation forecasts based on ML can significantly improve the accuracy of the raw forecasts, and it can also help us build a more advanced hydrological forecast system. In addition, the conclusions of this study and experimental design methods can provide references for the same type of research.
How to cite: Zhang, Y. and Ye, A.: Improve short-term precipitation forecasts using numerical weather prediction model output and machine learning, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-4373, https://doi.org/10.5194/egusphere-egu21-4373, 2021.
Despite considerable improvements over the last few decades, numerical weather prediction (NWP) models still tend to exhibit bias and dispersion errors. Statistical postprocessing reduces these errors and allows quantifying predictive uncertainty. However, classical postprocessing approaches such as ensemble model output statistics (EMOS) destroy any physical dependence structure of the NWP raw ensemble forecasts. Ensemble copula coupling (ECC) is a commonly used state-of-the-art method to map the spatio-temporal dependence structure of the raw ensemble to the postprocessed predictive distributions. However, if the variable of interest exhibits many ties, ECC may not be optimal. Here, the variable investigated is hourly cloud cover over Switzerland. The climatological distribution of cloud cover shows considerable point masses at both zero and one, hence ties are a major issue when it comes to applying ECC.
We compare a variant of ECC, which is tailored to variables with many ties, applied to postprocessed forecast ensembles obtained by either EMOS or a dense neural network (dense NN) with postprocessed scenarios generated by a conditional generative adversarial network (cGAN). In particular, cGANs are appealing as they directly generate maps of postprocessed cloud cover forecast scenarios without the need of any dependence template. We trained the postprocessing models for COSMO-E and ECMWF IFS raw ensemble forecasts against hourly EUMETSAT CM SAF satellite data with a spatial resolution of around 2 km over Switzerland. For all the approaches, EMOS, dense NN, and cGANs, basic setups with a minimal set of raw ensemble predictors already allowed us to obtain a significantly better univariate performance (in terms of continuous ranked probability score) than the raw NWP ensembles. We present and discuss the advantages and drawbacks of EMOS+ECC, dense NN+ECC, and cGANs with respect to both univariate forecast skill and the ability to produce realistic cloud cover forecast scenario maps.
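For readers unfamiliar with ECC, the standard reordering step can be sketched as below. This shows plain ECC, not the tie-aware variant used in the study: ties in the raw ensemble are broken here by member index, which is an arbitrary assumption and is exactly where variables with many ties, such as cloud cover, become problematic. The function name is illustrative.

```python
def ecc_reorder(raw_ensemble, calibrated_samples):
    """Give calibrated samples the same rank order as the raw ensemble members."""
    # Member indices sorted by raw value; sorted() is stable, so tied raw
    # members keep their original index order (an arbitrary tie-breaking rule).
    order = sorted(range(len(raw_ensemble)), key=lambda m: raw_ensemble[m])
    sorted_cal = sorted(calibrated_samples)
    out = [None] * len(raw_ensemble)
    for rank, member in enumerate(order):
        out[member] = sorted_cal[rank]
    return out
```

Applied independently at each grid point and lead time with the same member indexing, this transfers the raw ensemble's spatio-temporal rank structure to the postprocessed samples. For example, `ecc_reorder([0.3, 0.1, 0.2], [10, 30, 20])` returns `[30, 10, 20]`.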
How to cite: Dai, Y. and Hemri, S.: Spatially coherent postprocessing of cloud cover forecasts using generative adversarial networks, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-4374, https://doi.org/10.5194/egusphere-egu21-4374, 2021.
A rapidly updating nowcasting system, Smartmet nowcast, has been developed at the Finnish Meteorological Institute (FMI). The system combines information from multiple sources to operationally produce accurate and timely short-range forecasts and a detailed description of the present weather for end-users. The information sources combined are 1) the rapidly updating high-resolution numerical weather prediction (NWP) MetCoOp nowcast (MNWC) forecast, 2) a radar-based nowcast, and 3) the 10-day operational forecast. The Smartmet nowcast is currently produced for the parameters 2-m temperature, 10-m wind speed, relative humidity, total cloud cover and accumulated 1-hour precipitation.
The system produces hourly updating nowcast information over the Scandinavian forecast domain and combines it seamlessly with the 10-day operational forecast information. Prior to the combination, a simple bias correction scheme based on recent forecast error information is applied to the MNWC model analysis and forecast fields of 2-m temperature, relative humidity and 10-m wind speed. The blending of the nowcast and the 10-day operational forecast information is done using an optical-flow-based image morphing method, which provides visually seamless forecasts for each forecast variable.
FMI has operationally produced Smartmet nowcast forecasts since September 2020. The validation of the data is in progress. The available results show that the Smartmet nowcast is improving the quality of short range forecasts and producing seamless and consistent forecasts. The method is also reducing the delay of forecast production. The Smartmet nowcast method will be automated in FMI forecast production in the near future.
How to cite: Hieta, L., Partio, M., Laine, M., Tuomola, M.-L., Hohti, H., Perttula, T., Gregow, E., and Ylhäisi, J.: New operational nowcasting system at Finnish Meteorological Institute, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-7283, https://doi.org/10.5194/egusphere-egu21-7283, 2021.
Ensemble forecast approaches have become state-of-the-art for the quantification of weather forecast uncertainty. However, ensemble forecasts from numerical weather prediction models (NWPs) still tend to be biased and underdispersed, hence justifying the use of statistical post-processing techniques to improve forecast skill.
In this study, ensemble forecasts are post-processed using a convolutional neural network (CNN). CNNs are the most popular machine learning tool for dealing with images. In our case, CNNs make it possible to integrate information from spatial patterns contained in NWP outputs.
We focus on solar radiation forecasts for 48 hours ahead over Europe from the 35-member ARPEGE (Météo-France global NWP) and a 512-member WRF (Weather Research and Forecasting) ensemble. We used a U-Net (a special kind of CNN) designed to produce a probabilistic forecast (quantiles), using as ground truth the CAMS (Copernicus Atmosphere Monitoring Service) radiation service dataset with a spatial resolution of 0.2°.
How to cite: Dupuy, F., Lu, Y.-S., Good, G., and Zamo, M.: Calibration of solar radiation ensemble forecasts using convolutional neural network, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-7359, https://doi.org/10.5194/egusphere-egu21-7359, 2021.
MeteoSwiss is developing and implementing a post-processing suite of multi-model ensemble forecasts to produce seamless probabilistic calibrated forecasts at arbitrary locations in Switzerland (i.e. also for unobserved locations). With the complex topography of Switzerland, the raw output of the numerical model is subject to particularly strong biases and conditional errors. Here, we present results for hourly temperature and precipitation predictions.
We apply a global ensemble model output statistics (gEMOS) framework. It extends the classical EMOS approach by incorporating static predictor variables describing relevant topographical features, and it is trained for all stations together using a 4-year multi-model numerical weather prediction (NWP) archive. As NWP sources, we combine data from the COSMO model suites (1.1 and 2.2 km horizontal grid spacing) and from the ECMWF IFS medium-range forecasting system. Note that the three NWP suites have different forecast horizons.
We show that gEMOS is able to improve forecasts for both variables. Depending on the selection of predictors, lead time, hour of day and season, we find improvements of up to 30% in terms of CRPS for both variables, with the most pronounced improvements in mountainous regions. Particularly for temperature, the multi-model combination further increases the forecast skill compared to postprocessing using high-resolution simulations of COSMO only.
While locally optimized approaches show better performance in terms of skill at the observing sites, the advantage of gEMOS lies in the ability to generate calibrated predictions for arbitrary locations in a consistent way. Its computational efficiency makes it a particularly attractive method for operationalization in a realtime context.
How to cite: Rajczak, J., Regula, K., Jonas, B., Stephan, H., Lionel, M., Christoph, S., and Mark A., L.: A global EMOS postprocessing for temperature and precipitation forecasts for any location in Switzerland, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9487, https://doi.org/10.5194/egusphere-egu21-9487, 2021.
Changes in the North Atlantic Oscillation (NAO) heavily influence the weather across the UK and the rest of Europe. Due to an imperfect reconstruction of the polar jet stream and associated pressure systems, there is reason to believe that errors in numerical weather prediction models may also depend on the prevailing behaviour of the NAO. To address this, information regarding the NAO is incorporated into statistical post-processing methods through a regime-dependent mixture model, which is then applied to wind speed forecasts from the Met Office's global ensemble prediction system, MOGREPS-G. The mixture model offers substantial improvements upon conventional post-processing methods when the wind speed depends strongly on the NAO, but the additional complexity of the model can hinder forecast performance in other instances. A measure of regime-dependency is thus defined that can be used to differentiate between situations when the numerical model output is, and is not, expected to benefit from regime-dependent post-processing. Implementing the regime-dependent mixture model only when this measure exceeds a certain threshold is found to further improve predictive performance, while also producing more accurate forecasts of extreme wind speeds.
How to cite: Allen, S., Evans, G., Buchanan, P., and Kwasniok, F.: Incorporating the North Atlantic Oscillation into the post-processing of MOGREPS-G wind speed forecasts, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9661, https://doi.org/10.5194/egusphere-egu21-9661, 2021.
To obtain reliable joint probability forecasts, multivariate postprocessing of numerical weather predictions (NWPs) must take into account dependencies among the univariate forecast errors—across different forecast horizons, locations or atmospheric quantities. We develop a framework for multivariate Gaussian regression (MGR), a flexible multivariate postprocessing technique with advantages over state-of-the-art methods.
In MGR both mean forecasts and parameters describing their error covariance matrix may be modeled simultaneously on NWP-derived predictor variables. The bivariate case is straightforward and has been used to postprocess horizontal wind vector forecasts, but higher dimensions present two major difficulties: ensuring the estimated error covariance matrix is positive definite and regularizing the high model complexity.
We tackle these problems by parameterizing the covariance through the entries of its basic and modified Cholesky decompositions. This ensures its positive definiteness and is the crucial fact making it possible to link parameters with predictors in a regression. When there is a natural order to the variables, we can also sensibly reduce complexity through a priori restrictions of the parameter space.
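The reason a Cholesky parameterization guarantees positive definiteness can be illustrated with a short sketch. It covers only the basic Cholesky decomposition (not the modified one), and the exponential link on the diagonal is an assumption made here to keep the diagonal entries positive; function names are hypothetical. Any vector of unconstrained real parameters, such as the output of a regression on predictors, then maps to a valid covariance matrix.

```python
import math

def covariance_from_cholesky(params, dim):
    """Build Sigma = L L^T from unconstrained parameters of lower-triangular L.

    params lists the lower triangle row by row; diagonal entries pass through
    exp() so that they are strictly positive (assumed link function).
    """
    L = [[0.0] * dim for _ in range(dim)]
    k = 0
    for i in range(dim):
        for j in range(i + 1):
            L[i][j] = math.exp(params[k]) if i == j else params[k]
            k += 1
    return [[sum(L[i][m] * L[j][m] for m in range(dim)) for j in range(dim)]
            for i in range(dim)]

def is_positive_definite(S):
    """Try a Cholesky factorization; it succeeds only for positive definite S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True
```

By contrast, modelling the entries of the covariance matrix directly offers no such guarantee: `[[1.0, 2.0], [2.0, 1.0]]` is symmetric but not positive definite.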
MGR forecasts take the form of full joint parametric distributions—in contrast to ensemble copula coupling (ECC) that obtains samples from the joint distribution. This has the advantage that joint probabilities or quantiles can be easily derived.
Our novel method is applied to postprocess NWPs of surface temperature at an Alpine valley station for ten distinct lead times more than one week in the future. All the mean forecasts and their full error covariance matrix are modelled on NWP-derived variables in one step. MGR outperforms ECC in combination with nonhomogeneous Gaussian regression.
How to cite: Muschinski, T., Mayr, G. J., Simon, T., and Zeileis, A.: Multivariate postprocessing using Cholesky-based multivariate Gaussian regression, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9840, https://doi.org/10.5194/egusphere-egu21-9840, 2021.
To improve and automate the quality of weather forecasts to the public, MeteoSwiss is redesigning its statistical postprocessing suite. The effort aims at producing calibrated probabilistic predictions to any arbitrary point in space and up to a 15-day lead time, by seamlessly integrating multiple numerical weather prediction models into a unique consensus forecast.
For hourly wind forecasts (mean, gust, and direction), the task is formulated as a regression problem in a supervised machine learning framework, where station measurements are used as labels, and co-located NWP forecasts as features. To improve the estimates at ungauged locations, additional static topographical features are derived from a 50m digital elevation model. The probabilistic component is included by training the neural network not to produce a deterministic prediction, but the parameters of a conditional probability function. To this end, the Continuous Ranked Probability Score (CRPS) is used as a loss function.
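To show what using the CRPS as a loss function for distribution parameters can look like, the sketch below evaluates the well-known closed-form CRPS of a Gaussian predictive distribution. The Gaussian family is an assumption made purely for illustration; the abstract does not state which conditional distribution the network predicts, and wind speed would in practice call for a distribution on the positive reals. Function names are hypothetical.

```python
import math

def crps_gaussian(mu, sigma, obs):
    """Closed-form CRPS of a normal distribution N(mu, sigma^2) at obs."""
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

def mean_crps_loss(mus, sigmas, observations):
    """Average CRPS over a batch: the quantity a network would minimize."""
    scores = [crps_gaussian(m, s, y) for m, s, y in zip(mus, sigmas, observations)]
    return sum(scores) / len(scores)
```

Because the expression is differentiable in `mu` and `sigma`, the same formula written in an automatic differentiation framework can serve directly as a training loss for a network whose outputs are the distribution parameters.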
The dataset includes a range of surface parameters at hourly resolution produced by the operational forecasts from three NWP models (the deterministic COSMO-1 model, at 1 km horizontal resolution; the 21-member COSMO-E, 2 km; and the 51-member ECMWF IFS ENS at about 18 km). The data cover the whole of Switzerland over a period spanning more than four years (mid 2016 to end of 2020). Wind measurements from over 500 surface weather stations are included as reference dataset. The study uses a train-validation-test split in both space and time to assess the ability of the postprocessing model to generalize to unseen locations and times.
The results indicate that, despite the challenging nature of the problem, the postprocessing model can improve over the baseline NWP forecasts in terms of CRPS on the test set. In particular, the model effectively corrects for biases relating to altitude error and other misrepresentations in the NWP topography. The results show that it is feasible to downscale numerical predictions to a substantially higher spatial resolution. Moreover, the conditional probabilities show consistent improvements in terms of calibration, although a significant portion of peak events (positive outliers) remains undetected, possibly related to unpredictable phenomena (e.g., thunderstorm gusts). Finally, first results seem to suggest that the gain in prediction skill is mainly driven by better statistical reliability rather than higher statistical resolution.
How to cite: Nerini, D., Bhend, J., Spirig, C., Moret, L., and Liniger, M.: Postprocessing wind forecasts with deep learning in complex terrain, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9849, https://doi.org/10.5194/egusphere-egu21-9849, 2021.
Good short-term predictions of rainfall over a few hours can be made by advecting the current radar image. Numerical Weather Prediction (NWP) extrapolates high-resolution models of the atmosphere. Advection forecasts are useful for a range of 0-3 hours, and NWP forecasts are generated up to days in advance. The question is how to combine the two to optimize the forecast for the 3-24 hour period, when information from the initial radar field may still usefully correct the NWP.
To achieve this blending, several questions need to be addressed. Firstly, the reliability of both types of forecast needs to be estimated. The reliability of advection forecasts is, to some degree, addressed by Short-Term Ensemble Prediction Systems (STEPS) through creating ensembles of forecasts. This can also be applied to NWP forecasts, though the size of the datasets involved makes it unwieldy.
Furthermore, NWP forecast rainfall has systematic biases, underestimating the area of rainfall and skewing the probability distribution of rainfall rates at each pixel to the right, overestimating the maxima. Post-processing of the NWP rainfall is done so that its structure more accurately represents real rain fields.
Even with a post-processed NWP there remains the smoothing issue: if the advection and NWP forecasts locate the storm front at different places, then the blend is smoother than either, decreasing the variance in rainfall across the domain. Thus, we also consider how the real-time radar image may be used to correct the NWP forecast in space and time to mitigate this smoothing effect.
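The smoothing issue can be reproduced with a toy example. The one-dimensional "rain fields" and the equal blending weights below are invented for illustration only: both forecasts contain the same storm, but at different locations.

```python
from statistics import pvariance

def blend(field_a, field_b, weight_a):
    """Pixel-wise linear blend of two fields on the same grid."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(field_a, field_b)]

# Hypothetical rain rates along a transect: same storm, displaced by two pixels.
advection = [0.0, 0.0, 10.0, 0.0, 0.0, 0.0]
nwp = [0.0, 0.0, 0.0, 0.0, 10.0, 0.0]

blended = blend(advection, nwp, 0.5)

# The blend has two weak peaks instead of one strong one: the maximum is
# halved and the spatial variance drops below that of either input.
variance_inputs = (pvariance(advection), pvariance(nwp))
variance_blend = pvariance(blended)
```

Here the blended field is `[0.0, 0.0, 5.0, 0.0, 5.0, 0.0]`, whose variance is well below that of either input even though the total rainfall is unchanged. Correcting the position of the NWP storm before blending, as discussed above, avoids this loss of variance.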
How to cite: Brier, R., Yu, B., and Seed, A.: Post-processing of NWP rainfall to facilitate blending with advection forecasts, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-10610, https://doi.org/10.5194/egusphere-egu21-10610, 2021.
Statistical post-processing of ensemble forecasts is increasingly being implemented by national weather services. The so-called Ensemble Model Output Statistics (EMOS) method, which consists in generating a given distribution whose parameters depend on the raw ensemble, leads to significant improvements in forecast performance at a low computational cost, and so is particularly appealing for reduced-performance computing architectures. However, the choice of a parametric distribution has to be sufficiently consistent so as not to lose information on predictability such as multimodalities or asymmetries.
Different distributions are applied to the post-processing of the ECMWF ensemble forecast of surface temperature. More precisely, mixtures of Gaussian and skew-normal distributions are tried for lead times from 3 h up to 360 h. For this work, analytical formulas of the continuous ranked probability score have been derived. We will discuss the first results obtained, judging both overall performance and tolerance to misspecification.
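As an indication of what such an analytical formula looks like, the sketch below implements the closed-form CRPS commonly used for a mixture of Gaussians. It is not the derivation from this work, which also covers skew-normal components; the function names are illustrative.

```python
import math

def _a(mu, var):
    """Expected absolute value E|X| for X ~ N(mu, var)."""
    sigma = math.sqrt(var)
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 2.0 * sigma * pdf + mu * (2.0 * cdf - 1.0)

def crps_gaussian_mixture(weights, means, sigmas, obs):
    """CRPS = E|X - obs| - 0.5 * E|X - X'| for a Gaussian mixture X."""
    components = list(zip(weights, means, sigmas))
    term1 = sum(w * _a(obs - m, s * s) for w, m, s in components)
    term2 = sum(wi * wj * _a(mi - mj, si * si + sj * sj)
                for wi, mi, si in components
                for wj, mj, sj in components)
    return term1 - 0.5 * term2
```

With a single component the expression reduces to the usual Gaussian CRPS, which gives a convenient sanity check; a bimodal forecast such as `crps_gaussian_mixture([0.5, 0.5], [-2.0, 2.0], [1.0, 1.0], 2.0)` is scored with the same function.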
How to cite: Taillardat, M.: Mixtures of (skewed) Gaussian distributions for statistical post-processing, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-12949, https://doi.org/10.5194/egusphere-egu21-12949, 2021.