4-9 September 2022, Bonn, Germany
OSA1.9
Machine Learning and Computer Vision in Weather and Climate

Conveners: Peter Düben, Gordon Pipa, Bernhard Reichert, Dennis Schulze, Gert-Jan Steeneveld, Roope Tervo
Orals
| Tue, 06 Sep, 09:00–10:30 (CEST), 11:00–17:15 (CEST) | Room HS 2
Posters
| Attendance Wed, 07 Sep, 11:00–13:00 (CEST) | Display Wed, 07 Sep, 08:00–18:00 | b-IT poster area

Orals: Tue, 6 Sep | Room HS 2

Chairperson: Bernhard Reichert
Applications and Methods of Machine Learning
09:00–09:15
|
EMS2022-211
|
CC
|
solicited
|
Onsite presentation
|
Daniele Nerini, Gabriela Aznar, and Jonas Bhend

In weather applications, machine learning is emerging as an innovative technology with the potential to address many of the shortcomings of traditional modelling procedures. The trend is fostered by the growing availability of observational data, computational resources, and high-level software libraries. However, to ensure that machine learning can deliver on its promises, build trust, and eventually transition to become a reliable technology for production, it is also important to consider the technical and engineering challenges that arise when introducing machine learning in an operational environment.

MLOps is the set of practices that aims at deploying and maintaining machine-learning models in production continuously, reliably, and efficiently. At the core of MLOps is the idea that models can easily change, while the underlying workflows remain. In this sense, the emphasis is shifted from training a specific ML model to building an integrated ML system and to continuously operate it in production.

We will present our first experiences with MLOps for weather forecasting applications at MeteoSwiss. As an example, we will use an ML-based model for postprocessing NWP surface wind forecasts, as it covers the most common and relevant challenges, including the need for efficient data loading and manipulation, the monitoring and visualization of prediction quality, and the automation of model training and deployment pipelines. With this contribution, we aim to share our efforts towards MLOps best practices in the applied context of a national meteorological service, in the hope of fostering discussion and exchange on machine learning operations for meteorological applications and their transition to the cloud.

How to cite: Nerini, D., Aznar, G., and Bhend, J.: Machine learning operations for weather applications, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-211, https://doi.org/10.5194/ems2022-211, 2022.

09:15–09:30
|
EMS2022-195
|
solicited
|
Onsite presentation
|
Sebastian Lerch and Jieyu Chen

Ensemble weather predictions typically show systematic errors that need to be corrected via post-processing. While much research interest has been focused on univariate approaches, many practical applications such as energy forecasting, hydrological applications and air traffic management require accurate modeling of spatial, temporal, and inter-variable dependencies. Over the past years, a variety of two-step approaches have been proposed to address this need, in which ensemble predictions are first post-processed separately in each margin and multivariate dependencies are then restored via copula functions [1]. However, these approaches share a common limitation: predictor variables beyond forecasts of the variable of interest cannot be incorporated in a straightforward manner when specifying the copula functions that govern the multivariate dependence structure. This makes it challenging to exploit the substantial benefits of such additional predictors that have recently been demonstrated in the context of univariate post-processing [2].

To address this challenge, we propose a novel data-driven one-step approach to multivariate ensemble post-processing based on conditional generative machine learning, which yields multivariate probabilistic forecasts directly as the output of a generative neural network while incorporating additional exogenous variables as predictors. In case studies on multivariate probabilistic forecasts of surface temperature and wind speed at observation stations in Germany, our conditional generative models show state-of-the-art forecast performance and advantages over benchmark approaches, for example by allowing an arbitrary number of samples to be generated from the multivariate forecast distributions.

References
[1] Lerch, S. et al. (2020) Simulation-based comparison of multivariate ensemble post-processing methods. Nonlinear Processes in Geophysics, 27: 349–371
[2] Rasp, S. and Lerch, S. (2018) Neural networks for post-processing ensemble weather forecasts. Monthly Weather Review, 146(11): 3885–3900

How to cite: Lerch, S. and Chen, J.: Generative machine learning methods for multivariate ensemble post-processing, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-195, https://doi.org/10.5194/ems2022-195, 2022.

Precipitation Applications
09:30–09:45
|
EMS2022-427
|
Online presentation
|
Stephan Hemri, Jonas Bhend, Christoph Spirig, Daniele Nerini, Lionel Moret, Reinhard Furrer, and Mark A. Liniger

Probabilistic predictions of precipitation call for rather sophisticated postprocessing approaches because of the low predictability, high spatio-temporal variability and strongly positive skewness of precipitation. Moreover, the large number of zeros makes it rather difficult to generate physically realistic postprocessed forecast scenarios using standard approaches like ensemble copula coupling (ECC). In addition to classical statistical approaches, machine learning based methods have recently gained popularity in the postprocessing of probabilistic weather forecasts.

In this study, we compare conditional generative adversarial network (cGAN) based postprocessing of daily precipitation with a quantile regression based approach. Recent publications on applying cGAN to precipitation forecasts have shown its potential to generate forecast scenarios that improve forecast skill and cannot be distinguished from observed data in terms of spatial structure (Harris et al., 2022; Price and Rasp, 2022). While we use ECC to generate physically realistic forecast scenarios from quantile regression, cGAN does not need any additional ECC steps. For training and verification, we use COSMO-E ensemble forecasts with a grid resolution of about 2 km over Switzerland and the corresponding CombiPrecip observations, which are a gridded blend of radar and gauge observations. Preliminary results obtained with loss functions tailored to precipitation postprocessing, as proposed by Harris et al. (2022), confirm the potential of cGAN for precipitation for our study domain as well. Using cGAN, we aim to generate realistic-looking forecast scenarios while also increasing forecast skill compared to COSMO-E. Furthermore, we provide a comparison of multivariate verification measures between COSMO-E, cGAN and quantile regression; the latter does increase forecast skill, but at the expense of relying on an additional ECC step to generate forecast scenarios.
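
The ECC step used with quantile regression can be sketched as follows. This is a minimal illustration rather than the authors' implementation: at each grid point, the calibrated quantiles are reordered so that they follow the rank order of the raw ensemble members, which carries the ensemble's spatial dependence over to the postprocessed scenarios. The function name `ecc_reorder` and all numbers are hypothetical.

```python
def ecc_reorder(raw_ensemble, calibrated_quantiles):
    """Reorder calibrated values to follow the rank order of the raw ensemble.

    raw_ensemble: list of m raw member values at one grid point.
    calibrated_quantiles: list of m postprocessed values (e.g. from quantile
    regression) at the same grid point, in any order.
    Member i receives the calibrated value whose rank equals the rank of
    raw member i.
    """
    if len(raw_ensemble) != len(calibrated_quantiles):
        raise ValueError("need one calibrated value per ensemble member")
    sorted_calibrated = sorted(calibrated_quantiles)
    # Member indices ordered by raw value; ties are broken by member index,
    # which is an arbitrary choice made for this sketch.
    order = sorted(range(len(raw_ensemble)), key=lambda i: (raw_ensemble[i], i))
    reordered = [0.0] * len(raw_ensemble)
    for rank, member in enumerate(order):
        reordered[member] = sorted_calibrated[rank]
    return reordered


# Toy example: 4 members at 2 grid points (made-up values in mm/day).
raw = {"A": [0.0, 5.0, 1.0, 12.0], "B": [0.0, 7.0, 0.5, 9.0]}
calibrated = {"A": [0.0, 0.4, 3.0, 8.0], "B": [0.0, 0.2, 4.0, 6.0]}
scenarios = {p: ecc_reorder(raw[p], calibrated[p]) for p in raw}
print(scenarios)
```

With many zero-precipitation members, the tie-breaking rule decides which member receives which calibrated value, so the arbitrary choice made here would matter in practice.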

References: 

  • Harris, L., McRae, A. T., Chantry, M., Dueben, P. D., & Palmer, T. N. (2022). A Generative Deep Learning Approach to Stochastic Downscaling of Precipitation Forecasts. arXiv preprint arXiv:2204.02028.
  • Price, I., & Rasp, S. (2022). Increasing the accuracy and resolution of precipitation forecasts using deep generative models. arXiv preprint arXiv:2203.12297.

How to cite: Hemri, S., Bhend, J., Spirig, C., Nerini, D., Moret, L., Furrer, R., and Liniger, M. A.: Postprocessing of gridded precipitation forecasts using conditional generative adversarial networks and quantile regression, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-427, https://doi.org/10.5194/ems2022-427, 2022.

09:45–10:00
|
EMS2022-467
|
Online presentation
Peter Lünenschloß, David Schäfer, Florian Gransee, Antje Claußnitzer, Thomas Schartner, and Jan Bumberger

To mitigate climate change and to understand the effects of anthropogenic interventions on environmental systems, monitoring these systems is a fundamental requirement, and it relies heavily on the availability of extensive yet consistent data sets.

Quality control tests and consistency routines that generate those datasets from available sensor data will inevitably produce data gaps, where measurement data does not pass tests or is simply not available.

However, most further data utilization will need those data gaps to be filled (imputed) in a consistent way. This consistency usually is assured by having a good set of predictors, together with a suitable method for predicting the variable that is to be imputed.

This is also true for precipitation, a meteorological variable that is fundamental to the understanding of hydrological system dynamics but notoriously hard to predict at the microclimatic scale and at sampling rates higher than once per hour.

We conducted an imputation study with machine learning methods on precipitation time series collected at a reference set of gauging stations, a subset of the wider network of the German Meteorological Service (DWD), where precipitation and other meteorological data are available at a 10-minute sampling rate.

We trained an Extreme Gradient Boosted Tree classifier and a Deep Neural Network regressor on a 10-year record of those data. We selected several distinct sets of predictors available in the surroundings of the reference station based on temporal and spatial proximity and evaluated the feature importance at different proximity levels.

Assuming that the imputation does not have to be performed in real time but serves as a post-processing step, we could extend the set of boundary conditions to measurements obtained after the gap to be imputed, and could thus improve on results obtained in regular forecasting scenarios.

To further improve the imputation results, especially for the matching of singular and erratic rainfall events, we aligned spatio-temporally separated measurements of the same (traveling) rainfall events by including a non-linear time series stretching algorithm (dynamic time warping) in the sample preprocessing.
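
The alignment idea behind dynamic time warping can be illustrated with the textbook dynamic-programming formulation below. It is a sketch only: the abstract does not say which DTW variant, local cost or window constraint was used, and the function name `dtw_path` and the station series are made up.

```python
def dtw_path(a, b):
    """Classic dynamic time warping between two 1-D sequences.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    aligning a[i] with b[j]; the local cost is the absolute difference.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover one optimal alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Toy 10-minute rain series (mm): the same shower reaches station B later.
station_a = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
station_b = [0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0]
total_cost, alignment = dtw_path(station_a, station_b)
print(total_cost, alignment)
```

In this toy case the warping path pairs the rainfall peak at index 2 of station A with the peak at index 4 of station B, whereas a point-by-point comparison would treat the two series as badly mismatched.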

We observed that meteorological variables such as wind and humidity, which are useful for predicting precipitation at lower sampling rates, cannot compensate for the noise that their inclusion in the set of predictors introduces when imputing precipitation sampled at a 10-minute rate.

However, with precipitation collected at neighboring stations used as predictors and the preprocessing measures taken, we were able to achieve a solid correlation score and could thus show that ML-driven post-processing routines enable imputations at high temporal resolutions, providing the end user with consistent precipitation data sets.

How to cite: Lünenschloß, P., Schäfer, D., Gransee, F., Claußnitzer, A., Schartner, T., and Bumberger, J.: ML Driven Imputation of Precipitation Data Collected at High Sampling Rates, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-467, https://doi.org/10.5194/ems2022-467, 2022.

10:00–10:15
|
EMS2022-541
|
Onsite presentation
|
Edgar Espitia, Fatemeh Heidari, Qing Lin, Marc Vischer, and Elena Xoplaki

The problem of hydrologic modelling in large catchments has traditionally been addressed with conceptual, physically based models. Deep learning techniques such as Long Short-Term Memory (LSTM) networks have also proven to be effective for rainfall-runoff modelling. They offer a promising way to include a diversity of inputs and capture complex processes without the complexity, cost and difficulty that including them in conventional models entails. However, LSTM has not been extensively tested on large catchments and for seasonal forecasts in countries such as Germany. When a large catchment needs to be modelled, several questions arise: whether the performance of LSTM is suitable for rainfall-runoff modelling without human intervention, to what extent we can rely on the results, and whether the physical processes involved are appropriately represented also at higher spatial resolutions. The proposed study addresses these questions by developing a framework that employs daily meteorological observations and ancillary geographical information. First, a single LSTM is trained to generate surface runoff in the study area. Second, to assess model performance, LISFLOOD is used as a benchmark spatially distributed semi-physical rainfall-runoff model. Finally, the performance of the model is evaluated against the Nash-Sutcliffe (NSE) and Kling-Gupta (KGE) efficiency criteria. The framework is illustrated using the Weser River Basin in Germany with a spatial resolution of 1 km and a daily time step. The proposed framework is expected to outperform calibrated physically based models, to be suitable for seasonal forecasting, and to provide information for understanding the capabilities and limitations of physics-based models and how machine learning techniques take into account, or can be forced to follow, physical laws.
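
For reference, the two efficiency criteria named above can be computed as in the following sketch. It uses the standard NSE definition and the original three-component form of KGE (correlation, variability ratio, bias ratio); whether the study uses this or a modified KGE variant is not stated, and the discharge values are made up.

```python
import math
from statistics import mean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 is as good as the observed mean."""
    obs_mean = mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - num / den


def kge(obs, sim):
    """Kling-Gupta efficiency in its three-component form: 1 is perfect."""
    mu_o, mu_s = mean(obs), mean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    cov = mean((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim))
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up daily discharge values (m^3/s).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.2, 1.9, 3.3, 3.8]
print(nse(observed, simulated), kge(observed, simulated))
```

A simulation that always returns the observed mean scores NSE = 0, which is why positive NSE values are commonly read as "better than climatology".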

How to cite: Espitia, E., Heidari, F., Lin, Q., Vischer, M., and Xoplaki, E.: Evaluating the performance of Long Short-Term Memory (LSTM) Networks for rainfall–runoff modelling in large catchments, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-541, https://doi.org/10.5194/ems2022-541, 2022.

10:15–10:30
|
EMS2022-245
|
Onsite presentation
Fatemeh Heidari, Qing Lin, Edgar Fabián Espitia Sarmiento, Muralidhar Adakudlu, Marc Vischer, and Elena Xoplaki

The project DAKI-FWS (Data and AI-supported Early Warning System to stabilize the German Economy), funded by the Federal Ministry of Economic Affairs and Climate Action (Germany), develops an innovative early warning system with a seasonal time horizon to protect and support lives, jobs, land and infrastructure. Skilful, innovative time- and space-dependent bias correction and high-resolution downscaling approaches based on artificial intelligence, such as deep learning and reinforcement learning techniques, are designed and applied to ensemble seasonal forecast data. A fundamental challenge in bias correction is to preserve climate trends and a plausible representation of the physical properties (variables) of the climate data. Thus, in this work, a trend-preserving AI-based correction approach is implemented. The high-quality bias-corrected data can be fed into the various climate-related practical applications of the overall project, such as the detection of extreme events, but also the evolution of pandemics or subtropical/tropical diseases, and hydrological models. State-of-the-art AI techniques are applied not only for preprocessing and preparation of the climate and sectoral data but also for the analysis and post-processing phases. Weather and climate extremes, such as heatwaves, storms and droughts, and concurrent extremes are identified from the large pool of meteorological and climatological reference datasets, seasonal forecasts as well as event lists. Such a comprehensive early warning system with a seasonal horizon, which contributes to the estimation of the outbreak and development of climate and health crises and supports disaster management, risk reduction and mitigation, does not yet exist for Germany, illustrating the importance and potential of this work.

How to cite: Heidari, F., Lin, Q., Espitia Sarmiento, E. F., Adakudlu, M., Vischer, M., and Xoplaki, E.: An AI-based approach for bias correction of temperature and precipitation forecasts to develop an early warning system, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-245, https://doi.org/10.5194/ems2022-245, 2022.

Coffee break
Chairperson: Roope Tervo
Nowcasting Applications
11:00–11:15
|
EMS2022-391
|
Onsite presentation
|
Matej Choma, Jakub Bartel, and Petr Šimánek

During last year's summer storm season, we introduced the precipitation nowcasting neural network MWNet and deployed it to operational use. The network tackles the nowcasting problem as a sequence-to-sequence prediction of radar echo, emphasizing high resolution and accuracy. We have conducted two quantitative experiments comparing MWNet 60 min forecasts to other available precipitation nowcasting models, using the metrics CSI and MSE. Both evaluations, over the domain of Denmark for the years 2018–2020 and over the Czech Republic for the summer storm season of 2021, concluded in favor of our approach. However, we aim to improve MWNet capabilities further by focusing on severe weather nowcasting, the physical soundness of the predictions, and lead times longer than 60 min. Building on the advances in deep learning and its use in spatio-temporal forecasting, MWNet is based on the idea of disentangling physical dynamics from the residual factors. In this contribution, we consider improvements to the physical part of the network, its incorporation into the whole model, and the loss function used during training. Mainly, we are exploring the effect of implementing non-linear partial differential equations in the physical part, with various levels of hand-engineered equation terms. We analyze the impact on the dynamics learned by each part of the network and on prediction quality for each setting. MWNet v1.2, based on the proposed architecture, will be operationally used and evaluated by meteorologists at Meteopress during the summer of 2022. This work aims to contribute to bridging the gap between machine learning and physical modeling in weather forecasting, alongside improving precipitation prediction.
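
The two verification metrics mentioned above can be sketched as follows. The functions use the usual definitions of the critical success index (CSI) and the mean squared error (MSE) on flattened pixel values; the threshold and the sample values are hypothetical and not taken from the MWNet evaluation.

```python
def csi(forecast, observed, threshold):
    """Critical success index: hits / (hits + misses + false alarms).

    An "event" is a value at or above `threshold`.
    """
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    total = hits + misses + false_alarms
    # CSI is undefined when no event is forecast or observed.
    return hits / total if total else float("nan")


def mse(forecast, observed):
    """Mean squared error over all pixels."""
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(forecast)


# Made-up flattened radar fields (e.g. rain rate in mm/h).
forecast = [0.0, 2.0, 5.0, 0.0, 3.0, 0.0]
observed = [0.0, 3.0, 0.0, 0.0, 4.0, 1.0]
print(csi(forecast, observed, threshold=1.0), mse(forecast, observed))
```

CSI ignores correct negatives, which makes it informative for rare events such as heavy precipitation, while MSE penalises amplitude errors everywhere in the field.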

How to cite: Choma, M., Bartel, J., and Šimánek, P.: Precipitation Nowcasting by Deep Physics-Constrained Neural Networks, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-391, https://doi.org/10.5194/ems2022-391, 2022.

11:15–11:30
|
EMS2022-343
|
Onsite presentation
|
Çağlar Küçük, Apostolos Giannakos, Stefan Schneider, and Alexander Jann

Nowcasting severe weather is crucial not only to mitigate the effects of extreme weather events like storms and flash floods but also to support decision-makers in weather-dependent operations like aviation and outdoor events. The rapid pace of developments in Artificial Intelligence (AI) and the increasing availability of high-resolution data from different sensors motivate using AI in weather prediction, particularly in nowcasting, where the dynamics change rapidly on short timescales.

Meteosat Third Generation (MTG) will greatly improve our nowcasting capacity with its high-resolution sensors on board. However, data-driven algorithms are needed to use the information stored within large volumes of MTG data for nowcasting in an efficient way. Therefore, we are developing AI-based nowcasting algorithms that fuse remote sensing data with ground-based radar mosaics for nowcasting severe weather.

As MTG data are not available yet (the launch is expected in late 2022), we use data from GOES satellites, which carry sensors comparable to those of MTG, as the primary data source. Specifically, in the initial phase of the study we use the Storm Event ImageRy (SEVIR) dataset, which contains more than 10,000 image sequences, 20% of which contain storm events reported by NOAA. Each of these image sequences covers 384 × 384 km in space and 4 hours in time, and contains three bands from the visible and infrared spectra and the lightning mapper data from GOES-16, together with Vertically Integrated Liquid (VIL) mosaics derived from ground-based radar. We obtained promising results in reproducing spatial variations of VIL with a Generative Adversarial Network (GAN) as a baseline. We are currently developing Recurrent Neural Network (RNN) based models to reproduce temporal variations of VIL, in order to incorporate temporal information into the GAN models. Furthermore, we are using the European Weather Cloud, which not only provides a strong computing infrastructure but also fosters collaboration across projects. The development of efficient AI algorithms for nowcasting severe weather using GOES data will make it possible to fully use MTG data for nowcasting severe weather.

How to cite: Küçük, Ç., Giannakos, A., Schneider, S., and Jann, A.: Towards a data-driven nowcasting of severe weather based on geostationary satellite data, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-343, https://doi.org/10.5194/ems2022-343, 2022.

Applications for Temperature, Wind, Renewable Energies
11:30–11:45
|
EMS2022-552
|
CC
|
Onsite presentation
Statistical downscaling of the 2m temperature with a generative adversarial network (GAN)
(withdrawn)
Michael Langguth, Bing Gong, Yan Ji, Amirpasha Mozaffari, and Martin G. Schultz
11:45–12:00
|
EMS2022-324
|
Online presentation
Graph neural networks for solar energy nowcasting and intra-day prediction in Central Europe
(withdrawn)
Irene Schicker and Petrina Papazek
12:00–12:15
|
EMS2022-327
|
Online presentation
An adapted deep convolutional RNN model for spatio-temporal prediction of wind speed extremes in the short-to-medium range for wind energy applications
(withdrawn)
Daan Scheepens, Irene Schicker, Petrina Papazek, Katerina Hlavackova-Schindler, and Claudia Plant
Clouds
12:15–12:30
|
EMS2022-39
|
Onsite presentation
Maria Reinhardt, Frederik Kurzrock, Walter Acevedo, and Roland Potthast

We present an innovative way of assimilating visible and infrared observations of clouds into the regional-scale weather forecasting model ICON-D2 (ICOsahedral Nonhydrostatic), which is operated by the German Weather Service (Deutscher Wetterdienst, DWD). For the visible camera photographs, a convolutional neural network is trained to detect clouds in pictures. The result is a greyscale picture in which each pixel has a value between 0 and 1, describing the probability of the pixel belonging to a cloud. By averaging over a certain section of the picture, one obtains a value for the cloud cover of that region. To build the forward operator, which maps an ICON model state into the observation space, a three-dimensional grid in space from the camera's point of view had to be constructed, and the ICON model variables were interpolated onto this grid. The pixels of the picture are modeled as rays originating at the camera location, and the maximum interpolated cloud cover (CLC) along each ray is taken as the model equivalent for each pixel. CLC is a diagnostic variable of an ICON model state describing the probability of cloud coverage within the respective grid box. After superobbing, monitoring experiments were conducted to compare the observations and model equivalents over time. The results of these experiments look promising, with RMSE values below 0.32, and we continued by performing single assimilation steps as well as longer experiments.
For assimilating the infrared camera pictures, we use a forward operator created by Leonhard Scheck at LMU Munich, which provides a fast solution of the radiative transfer equations. Monitoring experiments as well as data assimilation experiments were conducted and will be presented.
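
The two quantities compared in the visible-camera monitoring experiments can be sketched as below: an observed cloud cover obtained by averaging the network's per-pixel probabilities over an image section, and a model equivalent taken as the maximum CLC along a pixel's ray. This is an illustrative reduction; the function names, the image values and the ray samples are hypothetical, and the interpolation of ICON fields onto the camera grid is not shown.

```python
def region_cloud_cover(prob_image, rows, cols):
    """Mean cloud probability over a rectangular section of the image.

    prob_image: 2-D list of per-pixel cloud probabilities in [0, 1]
    rows, cols: (start, stop) index pairs delimiting the section
    """
    values = [prob_image[r][c]
              for r in range(*rows)
              for c in range(*cols)]
    return sum(values) / len(values)


def pixel_model_equivalent(clc_along_ray):
    """Model equivalent for one pixel: maximum interpolated CLC along its ray."""
    return max(clc_along_ray)


# Made-up CNN output (2 x 3 pixels) and CLC samples along one ray.
prob_image = [[0.0, 0.2, 0.9],
              [0.1, 0.8, 1.0]]
observed_cover = region_cloud_cover(prob_image, rows=(0, 2), cols=(1, 3))
model_cover = pixel_model_equivalent([0.0, 0.1, 0.6, 0.3])
departure = observed_cover - model_cover
print(observed_cover, model_cover, departure)
```

Taking the maximum along the ray reflects that a single opaque layer anywhere along the line of sight is enough for the camera to see cloud in that pixel.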

How to cite: Reinhardt, M., Kurzrock, F., Acevedo, W., and Potthast, R.: Data Assimilation of visible and infrared cloud observations from pictures, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-39, https://doi.org/10.5194/ems2022-39, 2022.

12:30–12:45
|
EMS2022-157
|
Onsite presentation
|
Moritz Löffler, Christine Knist, Ulrich Görsdorf, Jasmin Vural, and Ulrich Löhnert

Microwave radiometers (MWR) are moving into the focus of national and trans-national meteorological agencies which already operate or which intend to deploy MWR in network setups. The centralized processing of MWR data products within ACTRIS and the imminent integration of MWR into the EUMETNET E-Profile network are two prominent examples of this development. The developments within E-Profile correspond to efforts made by weather services towards directly assimilating MWR brightness temperature (TB) data.

At DWD we are evaluating data availability, quality, observation impact and operational sustainability of an MWR in a testbed setup, the so-called “Pilotstation”. In this framework, the first assimilation experiments of MWR TB at DWD were successful. The data assimilation (DA) requires a priori quality checks and an a priori detection of liquid water clouds. Currently the most frequent reason for rejecting data from DA is the suspected presence of clouds. Consequently, reliably identifying clouds without excessively rejecting clear-sky data is especially effective for increasing the availability of suitable data.

We will present a new approach for detecting the presence of liquid water clouds from the observed zenith TB. We employ a machine learning (ML) based algorithm which exploits the spectral signature and the variability of the observation. Using the CloudNet target classification as a reference we will demonstrate the overall performance. We will also highlight some issues faced and methods used during the development of this ML application.

TB observations at lower elevation angles increase the information content of the overall MWR observation. To ensure a proper interpretation of the TB, the cloud detection scheme must be enhanced so that it also applies to off-zenith observations. We will present our approaches and first results in addressing this matter.

How to cite: Löffler, M., Knist, C., Görsdorf, U., Vural, J., and Löhnert, U.: New cloud detection method for a stand-alone ground based microwave radiometer, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-157, https://doi.org/10.5194/ems2022-157, 2022.

12:45–13:00
|
EMS2022-171
|
Onsite presentation
|
Mads Emil Marker Jungersen, Thomas Lykke Rasmussen, Andreas Holm Nielsen, and Henrik Karstoft

The transition to renewable energy sources such as solar energy has increased the interest in predicting cloud masks from remote sensing data. Even though deep learning methods have achieved great success on multiple meteorological tasks, only limited research has been conducted on nowcasting cloud masks based on high temporal and spatial resolution satellite data.

This study investigates forecasting cloud masks over Germany six frames into the future based on satellite images. We compare predictions from three deep learning architectures (ConvLSTM, U-Net, and MetNet) with those of two baseline models (optical flow and persistence). We train and evaluate our models using two years of the ICARE SAFNWC Cloud Mask dataset¹, with a temporal resolution of 15 minutes per frame and a spatial resolution of 3 × 3 km per pixel. In our experiments we use a larger area of 256 × 256 pixels to predict the target area of 128 × 128 pixels, which leads to better overall performance than using an input size equal to the output size. Besides comparing different network architectures, we also investigate the effect of varying the temporal input size and output size for ConvLSTM. Finally, we examine the effect of adding more features (land/sea mask and elevation map) and changing the loss function.
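
The persistence baseline and the larger-input, smaller-target setup can be sketched as follows. This is a minimal illustration with hypothetical function names and a toy 4 × 4 frame standing in for the 256 × 256 input; it assumes the target area is the centre of the input area, which the abstract does not state explicitly.

```python
def center_crop(frame, size):
    """Return the central size x size window of a square 2-D frame (list of rows)."""
    offset = (len(frame) - size) // 2
    return [row[offset:offset + size] for row in frame[offset:offset + size]]


def persistence_forecast(input_frames, n_future=6, target_size=None):
    """Persistence baseline: repeat the most recent cloud mask for every lead time."""
    last = input_frames[-1]
    if target_size is not None:
        last = center_crop(last, target_size)
    # Copy the rows so that each forecast frame is an independent object.
    return [[row[:] for row in last] for _ in range(n_future)]


# Toy sequence of two 4 x 4 binary cloud masks (1 = cloudy).
frames = [
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
]
forecast = persistence_forecast(frames, n_future=6, target_size=2)
print(forecast[0])
```

At 15 minutes per frame, six forecast frames correspond to a lead time of 90 minutes, so any learned model has to beat this "nothing changes" forecast over that horizon to be useful.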

In summary, we have performed a comprehensive study investigating cloud mask nowcasting using 70,000 spatially and temporally aligned data frames, examining three loss functions, six evaluation metrics, and three deep learning models.
During the presentation, we will highlight the main results from the study and present details of the model architectures, datasets, and how space and time affect the performance of the models.

1) Kniffka, Stengel, and Hollmann, “SEVIRI Cloud Mask Dataset - Edition 1.”

How to cite: Jungersen, M. E. M., Rasmussen, T. L., Nielsen, A. H., and Karstoft, H.: Cloud Mask Nowcasting over Germany Using Deep Learning, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-171, https://doi.org/10.5194/ems2022-171, 2022.

Lunch break
Chairperson: Dennis Schulze
Machine Learning in Numerical Weather Prediction
14:00–14:15
|
EMS2022-574
|
Onsite presentation
Walter Acevedo Valencia, Frederik Kurzrock, Maria Reinhardt, and Roland Potthast

Ground-based remote sensing of wind is currently dominated by radar profilers and wind lidars, which deliver profiles of excellent quality at high update rates. Unfortunately, the relatively high costs of these devices have so far strongly limited their geographical coverage. Infrared all-sky imagers, on the other hand, are more affordable instruments that can provide valuable information by day and by night, not only about cloud cover but also, via computer vision techniques, about wind. In this work we investigate for the first time whether wind observations derived in this way can be used for data assimilation. A Reuniwatt thermal-infrared all-sky imager “Sky InSight”©, installed at the Lindenberg Meteorological Observatory – Richard-Assmann-Observatory (MOL-RAO) in Germany, a ceilometer at the same location and the computer vision algorithm “Optical Flow” (OF) were used to retrieve atmospheric wind vectors at cloud base height. Successive brightness temperature images delivered by the imager were geometry-corrected and then analysed by the OF procedure, yielding a set of atmospheric wind vectors in the surroundings of the camera. These vectors were finally rescaled and averaged to generate one overall wind observation, valid at the cloud base height retrieved by the ceilometer, for the time period when the images were taken. These derived wind observations were then assimilated into the German regional weather prediction system, which uses the limited-area version of the ICON (ICOsahedral Nonhydrostatic) model and the Local Ensemble Transform Kalman Filter (LETKF). We evaluate the quality of these observations as well as their data assimilation impact for a set of monitoring experiments.
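
The "rescale and average" step can be illustrated with the following sketch. It is not the authors' retrieval: it assumes geometry-corrected images with +x pointing east and +y pointing north, a constant angular pixel size near zenith, and a small-angle conversion from pixels to metres at cloud base height. All parameter names and numbers are hypothetical.

```python
import math


def mean_wind_from_flow(flow_px, pixel_angle_rad, cloud_base_m, dt_s):
    """Average optical-flow displacements and rescale them to a wind vector.

    flow_px: list of (dx, dy) pixel displacements between two frames,
             with +x pointing east and +y pointing north (assumed convention)
    pixel_angle_rad: angular size of one pixel near zenith (assumed constant)
    cloud_base_m: cloud base height from the ceilometer, in metres
    dt_s: time between the two frames, in seconds
    Returns (u, v, speed, direction_deg); the direction is the meteorological
    one, i.e. the direction the wind blows from.
    """
    metres_per_pixel = cloud_base_m * pixel_angle_rad  # small-angle approximation
    scale = metres_per_pixel / dt_s
    u = scale * sum(dx for dx, _ in flow_px) / len(flow_px)
    v = scale * sum(dy for _, dy in flow_px) / len(flow_px)
    speed = math.hypot(u, v)
    direction_deg = math.degrees(math.atan2(-u, -v)) % 360.0
    return u, v, speed, direction_deg


# Made-up example: clouds drift about 50 pixels eastward within 30 s.
flow = [(48.0, 0.0), (52.0, 0.0)]
u, v, speed, direction = mean_wind_from_flow(
    flow, pixel_angle_rad=0.002, cloud_base_m=1500.0, dt_s=30.0)
print(u, v, speed, direction)
```

The cloud base height enters linearly, so an error in the ceilometer height translates directly into a proportional error in the retrieved wind speed under these assumptions.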

How to cite: Acevedo Valencia, W., Kurzrock, F., Reinhardt, M., and Potthast, R.: Assimilation of atmospheric wind vectors retrieved via Optical flow algorithm and a thermal all-sky imager, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-574, https://doi.org/10.5194/ems2022-574, 2022.

14:15–14:30
|
EMS2022-277
|
Onsite presentation
Leonhard Scheck, Florian Baur, Christina Stumpf, and Christina Köpken-Watts

Solar satellite channels of instruments onboard geostationary or polar orbiting satellites provide high-resolution information on clouds and aerosols that is valuable for numerical weather prediction. The solar channels are sensitive to the microphysical properties of cloud and aerosol particles and contain better information on water content than the thermal channels. The direct assimilation of solar satellite images or their application for the evaluation of numerical weather prediction (NWP) models requires sufficiently fast and accurate forward operators, which solve radiative transfer (RT) problems to compute synthetic images from the NWP model output. As multiple scattering complicates the solution of radiative transfer problems in the solar spectral range, standard RT methods are too slow for this purpose. Faster methods have been developed for cloud-affected visible channels, but they are limited to non-absorbing channels and do not take aerosols into account. Machine learning methods provide a promising way to accelerate the complex radiative transfer computations in satellite forward operators and to overcome the limitations of previous approaches. Here we report on experiments based on deep feed-forward neural networks. We demonstrate that, with neural networks, the amount of training data that has to be computed with standard radiative transfer methods can be reduced by several orders of magnitude compared to previous approaches, while speed increases by an order of magnitude and accuracy improves. Moreover, tangent linear and adjoint versions required for variational data assimilation can easily be implemented and do not have to be adapted when the network structure or training data are changed. We discuss optimizations to reduce the computational effort and provide examples of applications that require more input parameters than cloud-affected visible channels and have only become feasible with the new approach.
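
Why tangent linear and adjoint versions of a feed-forward network are easy to write can be seen in the toy sketch below. It is a generic one-hidden-layer tanh network with made-up weights, not the authors' operator: the tangent linear code propagates an input perturbation forward through the layers, and the adjoint code applies the transposed operations in reverse order. Neither routine depends on the layer sizes or weight values.

```python
import math


def forward(x, W1, b1, W2, b2):
    """y = W2 tanh(W1 x + b1) + b2; also returns the hidden activations h."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]
    return y, h


def tangent_linear(dx, h, W1, W2):
    """Propagate an input perturbation dx to the output: dy = J dx."""
    dh = [(1.0 - hi * hi) * sum(w * d for w, d in zip(row, dx)) for row, hi in zip(W1, h)]
    return [sum(w * d for w, d in zip(row, dh)) for row in W2]


def adjoint(y_ad, h, W1, W2):
    """Propagate an output sensitivity y_ad back to the input: x_ad = J^T y_ad."""
    h_ad = [sum(W2[k][j] * y_ad[k] for k in range(len(W2))) for j in range(len(h))]
    z_ad = [(1.0 - hi * hi) * a for hi, a in zip(h, h_ad)]
    return [sum(W1[j][i] * z_ad[j] for j in range(len(W1))) for i in range(len(W1[0]))]


# Made-up weights: 2 inputs, 3 hidden units, 2 outputs.
W1 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
b1 = [0.0, 0.1, -0.1]
W2 = [[1.0, -0.5, 0.2], [0.3, 0.7, -0.6]]
b2 = [0.05, -0.05]

x, dx, y_ad = [0.3, 0.6], [1.0, -2.0], [0.5, 1.5]
y, h = forward(x, W1, b1, W2, b2)
dy = tangent_linear(dx, h, W1, W2)
x_ad = adjoint(y_ad, h, W1, W2)

# Dot-product test: <J dx, y_ad> must equal <dx, J^T y_ad>.
lhs = sum(a * b for a, b in zip(dy, y_ad))
rhs = sum(a * b for a, b in zip(dx, x_ad))
print(lhs, rhs)
```

The dot-product test at the end is the usual consistency check for an adjoint: both inner products should agree to rounding error.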

How to cite: Scheck, L., Baur, F., Stumpf, C., and Köpken-Watts, C.: Neural network-based methods for generating synthetic satellite images in the solar spectral range, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-277, https://doi.org/10.5194/ems2022-277, 2022.

14:30–14:45
|
EMS2022-419
|
Onsite presentation
|
Guillaume Bertoli, Sebastian Schemm, Firat Ozdemir, Eniko Szekely, and Fernando Perez Cruz

Atmospheric radiative transfer, which describes the evolution of radiation emitted by the Sun, the Earth's surface, clouds, and greenhouse gases, is an essential component of climate and weather modeling. In climate models, the transfer of radiation is approximated by parameterizations. In theory, the electromagnetic radiation equations could be solved directly given sufficient computing power, but in practice this is out of reach. The current operational radiative transfer solver in the Icosahedral Nonhydrostatic Weather and Climate Model (ICON) is ecRad, developed at ECMWF, which is one of the most advanced radiative transfer parameterizations available. It considers surface optics, gas optics, aerosol optics and cloud optics [1]. It is an accurate radiation parameterization but remains computationally expensive. Therefore, the radiation solver is usually not invoked at every time step and only runs on a reduced spatial grid, which can affect prediction accuracy, or only in a 1D setting without 3D transfer.

In this project, we aim to develop an ecRad solver improved by machine learning that speeds up the computation without loss of accuracy. In general, machine learning-based parameterizations, once trained from data, would allow existing sub-grid scale parameterizations to be fully replaced. However, such parameterizations do not necessarily preserve essential physical quantities, which can lead to instabilities, model drifts or unphysical behavior, as observed in [2] and [3].

We present here an emulation strategy, composed of three steps. First, we continue to call ecRad on a significantly coarser grid to predict the clear-sky radiation. We thereby use ecRad as a regularizer while reducing computation costs. Then, we interpolate the data on the full spatial grid using Gaussian processes. Finally, we predict the effect of the clouds on the radiation with random forests. The underlying idea is to avoid unphysical climate drifts and to support the generalization capabilities of the ML method.
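The second step, interpolating coarse-grid clear-sky values onto the full grid with a Gaussian process, can be illustrated with a minimal one-dimensional sketch. Everything specific here is an assumption made for illustration and not taken from the abstract: the squared-exponential kernel, the fixed length scale, the constant prior mean, and the names `gp_interpolate`, `x_coarse` and `x_fine`. An actual implementation would operate on the ICON grid with fitted hyperparameters.

```python
import math


def rbf(a, b, length_scale):
    """Squared-exponential covariance between two 1D locations."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_interpolate(x_coarse, y_coarse, x_fine, length_scale=1.5, noise=1e-6):
    """Posterior mean of a GP fitted to coarse samples, evaluated at x_fine."""
    mean = sum(y_coarse) / len(y_coarse)           # constant prior mean
    anomalies = [y - mean for y in y_coarse]
    cov = [
        [rbf(a, b, length_scale) + (noise if i == j else 0.0)
         for j, b in enumerate(x_coarse)]
        for i, a in enumerate(x_coarse)
    ]
    weights = solve(cov, anomalies)                # (K + noise * I)^-1 y
    return [
        mean + sum(rbf(x, b, length_scale) * w for b, w in zip(x_coarse, weights))
        for x in x_fine
    ]


# Hypothetical clear-sky fluxes on every second grid point
x_coarse = [0.0, 2.0, 4.0, 6.0, 8.0]
y_coarse = [300.0, 310.0, 305.0, 295.0, 290.0]
x_fine = [float(i) for i in range(9)]
flux_fine = gp_interpolate(x_coarse, y_coarse, x_fine)
```

The interpolated field reproduces the coarse values at the coarse locations (up to the small noise term) and fills the intermediate points smoothly. In the strategy described above, the cloud effect would then be predicted separately on top of this field.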

Our first numerical experiments on an aqua planet simulation are promising. We hope to obtain a valuable outcome when considering more complex datasets with seasonality and realistic topography. Our final goal is to run a full ICON simulation with a machine learning enhanced ecRad parameterization, though the online performance remains open. There, the low-resolution clear-sky radiation field, computed with ecRad in the first step of our strategy, is expected to play a central role in model stability.

[1] Hogan, R. J., & Bozzo, A. (2018). A flexible and efficient radiation scheme for the ECMWF model. Journal of Advances in Modeling Earth Systems, 10, 1990-2008.

[2] Brenowitz, N. D., & Bretherton, C. S. (2018). Prognostic validation of a neural network unified physics parametrization. Geophysical Research Letters, 45, 6289–6298.

[3] Brenowitz, N. D., and Bretherton, C. S. (2019). Spatially Extended Tests of a Neural Network Parametrization Trained by Coarse‐Graining, J. Adv. Model. Earth Syst., 11, 2728–2744

How to cite: Bertoli, G., Schemm, S., Ozdemir, F., Szekely, E., and Perez Cruz, F.: Building a physics-constrained, fast and stable machine learning-based radiation emulator, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-419, https://doi.org/10.5194/ems2022-419, 2022.

14:45–15:00
|
EMS2022-531
|
CC
|
Onsite presentation
Kameswarrao Modali, Dominik Sander, Sebastian Brune, Philip Rupp, Hella Garny, Johanna Baehr, and Marc Rautenhaus

Ensemble forecasting has become a standard means to obtain information about forecast uncertainties in meteorological centres across the world. The large datasets generated by ensemble prediction systems carry much information that is difficult to analyse manually – here, techniques from the field of artificial intelligence can be beneficial to aid the analysis. Cluster analysis is one commonly used (unsupervised machine learning) approach to automatically determine distinct scenarios in numerical weather forecasting ensembles, both in atmospheric research and operational forecasting. Typically, a cluster analysis focusses on a selected meteorological forecast variable, a specific region, and time (or a time window). The dimensionality of the data is reduced by techniques like principal component analysis, and a clustering algorithm – typically k-means – is applied to the reduced data set. Challenges with such an approach arise through the determined clusters often being sensitive to factors including the selected region, forecast variable, and algorithm parameters, and also through the employed algorithms often appearing as a “black box” to the user. In our work, we attempt to make the clustering process more transparent by providing a visual analysis framework to analyse the sensitivity of generated clusters with respect to various factors. The presented framework is coupled to the open-source meteorological ensemble visualization software Met.3D, allowing for interactive specification of clustering parameters and for interactive visual analysis, including 3-D elements. A case study using ensemble prediction data of sudden stratospheric warmings (SSWs) is presented, demonstrating how visualizing similarity between clusterings with different parameters can aid the interpretation of the data.
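One simple way to quantify the similarity between two clusterings of the same ensemble members is the Rand index, the fraction of member pairs on which both clusterings agree. The sketch below is an illustration only: the abstract does not state which similarity measure the framework uses, and the function name `rand_index` and the example labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of member pairs grouped consistently by both clusterings.

    A pair counts as consistent if both clusterings put the two members in
    the same cluster, or both put them in different clusters. Requires at
    least two members.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same members")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreeing = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreeing / len(pairs)


# Hypothetical cluster labels of six ensemble members for two parameter
# settings (e.g. two different regions or numbers of principal components)
clustering_region_1 = [0, 0, 0, 1, 1, 2]
clustering_region_2 = [1, 1, 0, 0, 0, 2]
similarity = rand_index(clustering_region_1, clustering_region_2)
```

The index depends only on which members share a cluster, so it is unaffected by how the clusters are numbered. A value of 1 means the two parameter settings yield identical scenarios.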

How to cite: Modali, K., Sander, D., Brune, S., Rupp, P., Garny, H., Baehr, J., and Rautenhaus, M.: A framework for comparative cluster analysis of ensemble weather prediction data, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-531, https://doi.org/10.5194/ems2022-531, 2022.

Seasonal and climate applications, urban heat
15:00–15:15
|
EMS2022-167
|
Onsite presentation
|
Qing Lin, Fatemeh Heidari, Edgar Fabián Espitia Sarmiento, Marc Vischer, and Elena Xoplaki

The seasonal climate forecasts at the Copernicus Climate Change Service (C3S) [1] provide longer-term predictions with several variables and forecast systems, serving as an outlook of weather statistics several weeks to months ahead. Hydrological models can assess natural hazards at regional scales when given high-resolution inputs of kilometer scales. Thus, to use the seasonal forecast ensembles for regional damage estimation, it is essential to increase the spatial resolution of these large sets of climate variables. Furthermore, the high-resolution data are adjusted to be physically consistent with the driving model in weather processes. Thanks to the latest advances in artificial intelligence, downscaling to high-resolution local data has become feasible. This study focuses on the comparison of three AI downscaling methods: multiple linear regression [2], artificial neural networks [3], and support vector machines [4]. In AI downscaling, the gridded data is used for model training and hyperparameter tuning. First, climate variables, including temperature, wind speed, precipitation, and solar radiation, are downscaled from 1° to 1 km. The downscaling results are then evaluated using statistical indicators compared with the historical daily station observations. Finally, the best-performing AI downscaling method is implemented to develop an early warning system to detect future climate extreme risks and their impacts on diverse economic activities in Germany.
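The evaluation step against station observations can be sketched as follows. The abstract does not name the statistical indicators used, so bias, root-mean-square error and Pearson correlation are assumed here as common choices; the function name `verification_scores` and the sample values are hypothetical.

```python
import math


def verification_scores(downscaled, observed):
    """Bias, RMSE and Pearson correlation of downscaled values vs. observations."""
    n = len(observed)
    errors = [d - o for d, o in zip(downscaled, observed)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_d = sum(downscaled) / n
    mean_o = sum(observed) / n
    cov = sum((d - mean_d) * (o - mean_o) for d, o in zip(downscaled, observed))
    var_d = sum((d - mean_d) ** 2 for d in downscaled)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    corr = cov / math.sqrt(var_d * var_o)          # undefined for constant series
    return {"bias": bias, "rmse": rmse, "corr": corr}


# Hypothetical daily temperatures (deg C) at one station
station_obs = [10.0, 12.0, 15.0, 11.0]
downscaled_values = [11.0, 13.0, 16.0, 12.0]
scores = verification_scores(downscaled_values, station_obs)
```

Computing the same scores for each of the three methods at every station would allow the comparison described above.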

[1] Seasonal forecast daily and subdaily data on single levels, Copernicus Climate Change Service, (2018). DOI: 10.24381/cds.181d637e.
[2] J. Bedia, J. Baño-Medina, M. N. Legasa, M. Iturbide, R. Manzanas, S. Herrera, A. Casanueva, D. San-Martín, A. S. Cofiño, J. M. Gutiérrez, Statistical downscaling with the downscaleR package (v3.1.0): Contribution to the VALUE intercomparison experiment, Geosci. Model Dev. 13 (3), 1711-1735 (2020). DOI: 10.5194/gmd-13-1711-2020.
[3] K. Ahmed, S. Shahid, S. B. Haroon, X. J. Wang, Multilayer perceptron neural network for downscaling rainfall in arid region: A case study of Baluchistan, Pakistan, J. Earth Syst. Sci. 124 (6), 1325-1341 (2015). DOI: 10.1007/s12040-015-0602-9.
[4] A. Goly, R. S. V. Teegavarapu, A. Mondal, Development and evaluation of statistical downscaling models for monthly precipitation, Earth Interact. 18 (18), 1-28 (2014). DOI: 10.1175/EI-D-14-0024.1.

How to cite: Lin, Q., Heidari, F., Espitia Sarmiento, E. F., Vischer, M., and Xoplaki, E.: Comparison of AI Downscaling Methods on C3S Seasonal Forecasts for Early Warning System Development, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-167, https://doi.org/10.5194/ems2022-167, 2022.

15:15–15:30
|
EMS2022-54
|
Online presentation
Tsz Kin Lau, Yu Cheng Chen, and Tzu Ping Lin

The Urban Heat Island (UHI) is an evident phenomenon in Taiwan, especially in the capital, Taipei City, owing to its dense development with numerous high-rise buildings and its basin terrain. A method to estimate the possibility of thermal stress from the urban pattern is therefore necessary. In recent years, machine learning and deep learning have developed quickly and achieved success in many fields, such as image recognition, machine translation, and the prediction of environmental characteristics. Building on these developments, this work presents a novel method for estimating the possibility of thermal stress. The average temperature distribution in Taipei at 1 pm in August 2021 was calculated from measurement points of the high-density street-level air temperature observation network (HiSAN) and the Central Weather Bureau (CWB). The thermal condition was classified into three classes, low, medium, and high possibility of thermal stress, based on the median, upper quartile, and top 90% of the data. Taipei City was also classified into Local Climate Zones (LCZ) based on Landsat 8 satellite imagery, and terrain data were collected to correct for the effect of altitude on temperature. The LCZ map and terrain data were resampled to 100 m spatial resolution for the subsequent work. An Artificial Neural Network (ANN) and a Deep Neural Network (DNN) were used, and their performance compared, for estimating the thermal stress possibilities in Taipei City. The LCZ map and terrain data were reshaped into images of 64 pixels to quantize pattern features within areas of 800 m² and fed into the models for training and prediction. Both models used the same learning rate (0.001) and number of training epochs (500).
After training, the ANN and DNN models were able to estimate the possibility of thermal stress for each area from the LCZ map and terrain data. The results demonstrate the feasibility of using machine learning and computer vision algorithms to assess microclimate conditions. The method can also help estimate the possibility of thermal stress in areas without measurement points.
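The reshaping of the 100 m maps into 64-pixel images can be sketched as cutting the grid into tiles. The abstract gives only the pixel count, so the 8 × 8 square layout (800 m per side at 100 m resolution), the non-overlapping tiling, and the function name `tile_grid` are assumptions made for illustration.

```python
def tile_grid(grid, size=8):
    """Cut a 2D grid (list of rows) into non-overlapping size x size tiles.

    Incomplete tiles at the right and bottom edges are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    tiles = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            tiles.append([row[c:c + size] for row in grid[r:r + size]])
    return tiles


# Hypothetical 16 x 24 map of LCZ class codes (cell value encodes its position)
lcz_map = [[r * 100 + c for c in range(24)] for r in range(16)]
lcz_tiles = tile_grid(lcz_map)          # 2 x 3 = 6 tiles of 8 x 8 pixels
flat_inputs = [[v for row in tile for v in row] for tile in lcz_tiles]
```

Each flattened tile has 64 values and could be stacked with the corresponding terrain tile as model input.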

How to cite: Lau, T. K., Chen, Y. C., and Lin, T. P.: Estimating the possibility of thermal stress with computer vision and neural networks based on Local Climate Zone and terrain., EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-54, https://doi.org/10.5194/ems2022-54, 2022.

Coffee break
Chairperson: Gert-Jan Steeneveld
16:00–16:15
|
EMS2022-233
|
CC
|
Online presentation
|
Elizabeth Weirich Benet, Maria Pyrina, Bernat Jiménez Esteve, Ernest Fraenkel, Judah Cohen, and Daniela Domeisen

Heatwaves are extreme near-surface temperature events that can have substantial impacts on society and biodiversity. Moreover, the intensity, duration, and frequency of heatwaves are increasing at an accelerating rate as a consequence of climate change. Early Warning Systems can help to reduce the impact of heatwaves, as part of climate adaptation programs. However, state-of-the-art prediction systems often cannot make accurate predictions of heatwaves more than two weeks in advance, which is required to take action and mitigate the impact of heatwaves. Here, we investigate the forecasting of central European summer heatwaves on sub-seasonal timescales of several weeks using statistical and machine learning methods. As a first step, we select a set of atmospheric and surface predictors which are thought to have the largest impact on heatwave prediction based on previous studies and supported by a correlation analysis. Our findings show that at short lead times (1 week) near-surface temperature, 500 hPa geopotential, precipitation, and surface soil moisture in central Europe are the most important predictors. At longer lead times (2–6 weeks), Mediterranean and North Atlantic sea surface temperatures and the North Atlantic jet stream become the most relevant predictors. Secondly, we apply machine learning methods based on these predictors to forecast (1) summer temperature anomalies and (2) the probability of heatwaves for lead times of 1–6 weeks at weekly resolution. For each of these two types of forecasts (1) and (2) we use both a linear model and a Random Forest model. The performance of these models decays with lead time, as expected, but outperforms persistence and climatology at all lead times. Our machine learning models beat the European Centre for Medium-Range Weather Forecasts (ECMWF) model for lead times of 3 weeks and longer.
We show that machine learning models can help extend the forecasting lead time of summer temperature anomalies and heatwaves to sub-seasonal scales.
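The comparison against persistence and climatology can be illustrated with a mean-squared-error skill score. This is a generic sketch: the abstract does not state which verification metric was used, and the function names and the weekly anomaly values below are hypothetical.

```python
def mse(forecast, observed):
    """Mean squared error between two equally long series."""
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)


def skill_score(forecast, reference, observed):
    """1 for a perfect forecast, 0 for no gain over the reference, < 0 if worse."""
    return 1.0 - mse(forecast, observed) / mse(reference, observed)


# Hypothetical weekly temperature anomalies (K)
series = [0.5, 1.0, -0.5, 2.0, 0.0]
target = series[1:]                      # weeks to be predicted
persistence = series[:-1]                # previous week's anomaly (1-week lead)
climatology = [0.0] * len(target)        # anomaly of zero by definition
model = [0.8, -0.2, 1.5, 0.1]            # output of some statistical model

skill_vs_climatology = skill_score(model, climatology, target)
skill_vs_persistence = skill_score(model, persistence, target)
```

A positive score against both references at a given lead time corresponds to the statement that the models outperform persistence and climatology.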

How to cite: Weirich Benet, E., Pyrina, M., Jiménez Esteve, B., Fraenkel, E., Cohen, J., and Domeisen, D.: Predicting Central European summer heatwaves with Machine Learning, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-233, https://doi.org/10.5194/ems2022-233, 2022.

16:15–16:30
|
EMS2022-74
|
Online presentation
Shiang Yu Wang, Kuo An Hung, and Tzu Ping Lin

Taiwan has a hot and humid subtropical climate, and the Urban Heat Island (UHI) effect, in which temperatures rise within the city, has many causes. For example, radiative heat, heat storage in buildings, and artificial heat dissipation alter the microclimate and form a vicious cycle, which has a profound impact on climate and high temperatures.

In this study, Taichung City was selected as the study area. After excluding high-altitude jurisdictions and suburban areas, the characteristic factors of each administrative area were fed into the analysis software. Previous studies on the relationship between urban climatic characteristics and temperature relied on monitoring networks, building model analysis, and environmental climate simulations, with an emphasis on data collection and simulation. Because the environmental characteristic factors are diverse and complex, and the target was difficult to predict with past analyses, this study analyzes data collected by detection instruments using Decision Trees, which can effectively reduce excessive information collection, improve research efficiency, and reduce the cost of equipment construction. Among the environmental characteristics, the factors with the highest correlation to the target values were selected, standardized, and then analyzed with Artificial Neural Network (ANN)-K Nearest Neighbor (KNN) to predict the target values.

This study is dedicated to the analysis of the causes of UHI formation. The Decision Tree analysis reveals that the core area of Taichung City is affected by the surface factors of the high-temperature area and the surrounding topographic conditions, resulting in microclimate hyperthermia. The results showed that the percentages of the factors affecting the high temperature of the administrative areas were as follows: Normalized Difference Vegetation Index (NDVI) was 24%, impervious area was 22%, building area density was 19%, and average building height and surface roughness were 16%. For future application of these results, an ANN model can be developed for prediction from the selected high-correlation factors, which can serve as a reference for the subsequent formulation of environmental regulations and the analysis of urban energy saving strategies.

How to cite: Wang, S. Y., Hung, K. A., and Lin, T. P.: Analysis of the association between environmental features and temperature using Decision Tree and Artificial Neural Network, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-74, https://doi.org/10.5194/ems2022-74, 2022.

Other applications of Machine Learning
16:30–16:45
|
EMS2022-420
|
Onsite presentation
Hae Soo Jung, Sungmin Oh, and Seon Ki Park

The need for high-resolution meteorological data to identify and understand abnormal weather phenomena is increasing as extreme events are predicted to become more intense and destructive over the coming decades. This is particularly the case for extreme precipitation events, which have high spatial variability and observational uncertainty. While dynamical and statistical downscaling models are often used to provide finer resolution data, recent studies have focused on deep learning (DL)-based downscaling techniques due to their ability to extract spatial features from large spatio-temporal data. High-quality and high-resolution target data are crucial for training DL-based models; however, most of the available data are based on land observations and therefore have significant missing values. Here, we investigate the sensitivity of a DL-based downscaling model to the choice of missing value imputation (MVI) method in order to improve the accuracy of DL-based downscaling. We use precipitation data over the European continent (44°N–54°N, 5°E–15°E) from 2017 to 2020; ERA5 reanalysis data with 0.5° × 0.5° spatial resolution and E-OBS observational gridded data with 0.1° × 0.1° resolution are used as the predictor and target, respectively. To build the DL model, we combine ResNet and upsampling features. We find that the DL model describes spatial features of precipitation events better than simple regridding of the coarser data. Moreover, the model trained on target data gap-filled with regridded ERA5 values shows improved results compared with the model trained on target data in which missing values were replaced with 0. Our results highlight the importance of MVI methods for DL-based downscaling and the potential of deep learning for precipitation. Our future work will test a wider range of MVI methods, including a method based on the Expectation-Maximization (EM) algorithm and the extension of coastal target data, and examine the performance of DL models focusing on extreme events.
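The two gap-filling variants compared above can be sketched on plain nested lists. This is an illustration under stated assumptions: nearest-neighbour repetition stands in for whatever regridding the authors used, the factor of 5 follows from the 0.5° and 0.1° resolutions, and the function names and values are hypothetical.

```python
import math


def regrid_nearest(coarse, factor):
    """Upsample a 2D grid by repeating each cell factor x factor times."""
    fine = []
    for row in coarse:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(fine_row[:] for _ in range(factor))
    return fine


def fill_missing(target, fallback=None):
    """Replace NaN cells with 0, or with the matching cell of `fallback`."""
    filled = []
    for r, row in enumerate(target):
        filled.append([
            (0.0 if fallback is None else fallback[r][c]) if math.isnan(v) else v
            for c, v in enumerate(row)
        ])
    return filled


nan = float("nan")
era5_coarse = [[1.0, 2.0]]                       # hypothetical 0.5 deg cells (mm/day)
era5_fine = regrid_nearest(era5_coarse, 5)       # 5 x 10 grid at 0.1 deg

# Hypothetical high-resolution target with two missing (e.g. sea) cells
eobs_target = [[v + 0.5 for v in row] for row in era5_fine]
eobs_target[0][0] = nan
eobs_target[4][9] = nan

target_zero_filled = fill_missing(eobs_target)             # variant 1: zeros
target_era5_filled = fill_missing(eobs_target, era5_fine)  # variant 2: regridded ERA5
```

Training the same network once on each filled target would isolate the effect of the imputation choice.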

How to cite: Jung, H. S., Oh, S., and Park, S. K.: Daily Precipitation Downscaling Using Deep Learning Techniques: The Impact of Missing Value Imputation Methods, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-420, https://doi.org/10.5194/ems2022-420, 2022.

16:45–17:00
|
EMS2022-548
|
Onsite presentation
Christoph Knigge, Ole Kouker, Daniel Koser, Björn-Rüdiger Beckmann, Dirk Zinkhan, Hermin Beumer-Aftahi, Benedikt Müller, Felix Garcia Funk, Alexandra Melzer, Iris Breitruck, Martin Gehmayr, Matthias Beckmann, Stefan Seitz, Niklas Jost, Helen Estrella, and Johannes Knöferle

Met4Airports is a research and development project funded by the German Federal Ministry for Digital and Transport (BMDV), aiming at the prediction of relevant planning and control parameters of air traffic management (ATM) by means of artificial intelligence (AI). It focusses on the effects of selected weather phenomena such as thunderstorms, significant wind events, fog, and winter weather events like snowfall, as they pose a significant disturbance for air traffic, causing capacity constraints for airports and en-route and approach sectors. The predicted quantities are mainly capacity values, delays of individual flights as well as average delay values for varying timespans with forecast lead times of up to 24 hours. Predictions of sufficient forecast qualities could be utilized to optimize decision-making processes in ATM and enhance the situational awareness of decision makers.

Throughout the development process, various machine learning models are examined, relying on both meteorological forecast products of Deutscher Wetterdienst and air traffic data of airport operators (Flughafen München GmbH and Fraport) and air traffic control (Deutsche Flugsicherung). Presently, the applied meteorological data include Terminal Aerodrome Forecasts (TAF) and NowCastMix-Aviation for short-term thunderstorm prediction. The integration of additional meteorological data types, such as forecasts from the numerical weather forecasting system ICON-D2, is currently in progress. The applied air traffic data comprise flight lists, including estimated and target times from the Airport Collaborative Decision Making (A-CDM) process with an archiving period starting from 2016, as well as their historic updates. In the iterative development process, artificial neural networks with varying topologies and hyperparameters are trained on different combinations of data types in order to identify a suitable AI model. In addition, other machine learning techniques, including gradient boosting, will be examined in the upcoming months.

A preliminary study revealed a significant enhancement of AI-model performance in predicting flight delays by integration of meteorological data. Thereafter, applicable forecasting parameters were identified by means of correlation analyses. Interim results indicate the capability of preliminary models to surpass the prediction quality of estimation timestamps (EOBT) currently used in the A-CDM process. Ultimately, delay forecasting results tend to be considerably more precise for the prediction of average values than for single flight delays, as the corresponding results are far less sensitive to short-term effects affecting individual flight operations.

How to cite: Knigge, C., Kouker, O., Koser, D., Beckmann, B.-R., Zinkhan, D., Beumer-Aftahi, H., Müller, B., Garcia Funk, F., Melzer, A., Breitruck, I., Gehmayr, M., Beckmann, M., Seitz, S., Jost, N., Estrella, H., and Knöferle, J.: Met4Airports - Prediction of weather-induced operating restrictions at German international airports by means of artificial intelligence, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-548, https://doi.org/10.5194/ems2022-548, 2022.

17:00–17:15
|
EMS2022-570
|
Presentation form not yet defined
Machine learning in a probabilistic framework can improve the prediction of lightning ignited fires
(withdrawn)
Francesca Di Giuseppe
Display time: Wed, 7 Sep, 08:00–Wed, 7 Sep, 18:00

Posters: Wed, 7 Sep, 11:00–13:00 | b-IT poster area

Chairperson: Bernhard Reichert
P5
|
EMS2022-350
|
Onsite presentation
|
Sojung An, Tae-Jin Oh, Inchae Na, Jiyeon Jang, Wooyeon Park, Sang-Wook Kim, Ilseok Noh, and Junghan Kim

To capture the spatio-temporal characteristics of precipitation processes in a machine learning context, many studies have applied convolutional and recurrent neural networks. Many state-of-the-art approaches focus on learning a single latent representation for quantitative precipitation forecasting (QPF). Describing reflectivity echoes with a single latent variable may be an overly restrictive assumption that impedes effective learning of precipitation features. We therefore propose a conditioned forecasting model based on self-supervised learning (SSL) that generalizes across diverse precipitation types by enabling various latent representations. Our method trains each latent variable according to a condition that is approximated with a generative adversarial network (GAN). Specifically, the model is pre-trained with the same condition to retain consistency of the latent space while training the generator features. The feature matrix of the generator is clustered every 100 epochs using Principal Component Analysis (PCA) and k-means clustering. The dataset is Korean summer (JJA) precipitation at 4 km resolution from 2012 to 2021, converted from Constant Altitude Plan Position Indicator (CAPPI) radar reflectivity to rainfall intensity. A sample consists of a sequence of 18 time steps at 10-minute intervals. The dataset is split into a training set (2012 to 2020) and a test set (2021), containing 9,048 and 729 sequences, respectively. Results show that our method improves the generalization of precipitation features, with comparable or better performance than previous studies in terms of the critical success index (CSI) for predictions up to 2 hours ahead. Our SSL method can learn useful representations from unlabeled precipitation data and effectively predicts complicated echo patterns.
We also found that training GANs while clustering the generator features into more than sixteen condition types makes it much easier to overcome mode collapse, from which many GAN models suffer.
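The critical success index used for evaluation is the number of hits divided by the sum of hits, misses and false alarms for a given rainfall threshold. A minimal sketch follows; the threshold of 1.0 mm/h and the sample values are hypothetical, since the abstract does not state which thresholds were used.

```python
def critical_success_index(forecast, observed, threshold):
    """CSI = hits / (hits + misses + false alarms) for events >= threshold."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        forecast_event = f >= threshold
        observed_event = o >= threshold
        if forecast_event and observed_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif forecast_event:
            false_alarms += 1
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")   # undefined without events


# Hypothetical rainfall intensities (mm/h) at six grid points
forecast_rain = [0.0, 1.2, 3.0, 0.4, 2.5, 0.0]
observed_rain = [0.0, 0.8, 2.0, 1.5, 0.0, 0.0]
csi = critical_success_index(forecast_rain, observed_rain, threshold=1.0)
```

Correct negatives do not enter the score, which makes the CSI suitable for rare events such as heavy precipitation.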

How to cite: An, S., Oh, T.-J., Na, I., Jang, J., Park, W., Kim, S.-W., Noh, I., and Kim, J.: Conditioned Forecasting Model based on Self-Supervised Learning, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-350, https://doi.org/10.5194/ems2022-350, 2022.

P6
|
EMS2022-413
|
Onsite presentation
|
Jouke de Baar, Cees de Valk, and Gerard van der Schrier

In various machine learning (ML) approaches, we are moving towards reliable quantification of the uncertainty of results. It is becoming increasingly clear that without quantified uncertainties it is difficult to compare ML-based results or predictions. In addition, users of ML results are increasingly inclined to consider uncertainties in the decision-making process. Apart from providing results, it is becoming important to provide a quantified statement of accuracy; in fact, one might argue that an ML result without quantified uncertainty is actually an incomplete result. Therefore, an important generic question is: how do we ensure that the reported uncertainties are indeed reliable?

Presently, we approach this question by developing and applying ensemble dispersion improvement auto-tune (EDIT) for spatial regression. Where cross-validation is often used to check the quality of an ensemble prediction a posteriori, for example by constructing rank histograms, in EDIT we use cross-validation to optimize the ensemble spread. This correction is made as a function of EDIT proxies, that is, covariates that might be important indicators of the magnitude of the correction we should make. In our case, we use a multi-objective optimization, which targets both the flatness of the regression rank histogram and the accurate dependency of the regression uncertainty on proxies. 
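The rank histogram mentioned above counts, for each verification case, how many ensemble members fall below the observed value. A minimal sketch is given here; it ignores ties, and the ensembles and observations are hypothetical. It shows only the diagnostic, not the EDIT optimization itself.

```python
def rank_histogram(ensembles, observations):
    """Count the rank of each observation within its ensemble.

    With n members there are n + 1 possible ranks, from 0 (observation
    below all members) to n (observation above all members).
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


# Hypothetical 3-member wind speed ensembles (m/s) at four held-out stations
ensembles = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 1.0, 5.0], [3.0, 3.5, 4.0]]
observations = [2.5, 1.0, 7.0, 3.2]
histogram = rank_histogram(ensembles, observations)
```

A flat histogram indicates that the ensemble spread is consistent with the errors. A U-shaped histogram points to an under-dispersive ensemble, and a dome-shaped one to an over-dispersive ensemble.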

In this work, we consider the example of spatial regression of in situ observations of daily mean wind speed in Europe for the period 1980 – 2021 as part of the E-OBS data set. In such products, we provide maps of historical wind speed, and communicate the uncertainty in our results by providing an ensemble of maps. Important EDIT proxies for ensemble dispersion correction are the distance to the nearest station and complexity of terrain. Using EDIT, we see a significant improvement of the reliability of the ensemble dispersion. 

How to cite: de Baar, J., de Valk, C., and van der Schrier, G.: Ensemble dispersion improvement auto-tune (EDIT), a generic post processing step for Machine Learning regression results with quantified uncertainty, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-413, https://doi.org/10.5194/ems2022-413, 2022.

P7
|
EMS2022-352
|
Onsite presentation
|
Clément Bouvier, Joona Cornér, and Victoria Sinclair

A majority of insured losses over Europe are related to Extra-Tropical Cyclones (ETCs), which are characterised by strong winds, heavy precipitation and powerful ocean waves. Baroclinic wave simulations (BWS) are used to study ETCs by varying their background state and measuring their different intensities. However, two main issues limit an exhaustive exploration of the relationship between ETC intensity and background state: 1) the dimensionality of the feature space, and 2) the large number of intensity measures. To alleviate these issues, this study proposes to use a wrapper Feature Selection Algorithm (wFSA) combined with a non-linear regressor applied to an intensity measure. The selected subsets are analysed through time.

The BWS was performed in the moist case using OpenIFS version Cy43r3v2 configured as an aqua planet with full physics and the radiation scheme deactivated. The atmospheric state proposed by Jablonowski and Williamson was used. The spatial resolution of the simulation was set to TL319/L137 and the time resolution to 20 minutes for 15 days. The initial perturbation was located at 40°N, 20°E. A total of 55 measures, called features, were extracted from the BWS, and the 10-m wind gust was selected as the intensity measure. A stable wFSA was performed using a weighted Random Forest Regressor in the framework proposed by Meinshausen and Bühlmann. The regression was run 10 times on 60% of randomly selected points in the northern hemisphere to infer the 10-m wind gust. Finally, the average feature importance and its variance were computed for each feature every 12 hours.

The forecast surface roughness and the specific humidity were the most important features for the first 2 days. Afterwards, mean sea level pressure became predominant for 5 days. For the remaining days, forecast surface roughness, specific humidity and large scale precipitation were the most important features to infer 10-m wind gust. Further work will aim at increasing the number of BWS by modifying the average temperature of the background state. All results will be compared to propose an efficient dimension reduction to study BWS and their evolution.

How to cite: Bouvier, C., Cornér, J., and Sinclair, V.: Temporal evolution of features that control 10-m wind gusts in moist baroclinic wave simulations identified using non-linear regression, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-352, https://doi.org/10.5194/ems2022-352, 2022.

P8
|
EMS2022-80
|
Onsite presentation
Gregor Skok, Doruntina Hoxha, and Žiga Zaplotnik

This study investigates the potential of direct prediction of daily extremes (maximum and minimum) of 2-m temperature from a radiosonde measurement using neural networks (NNs). The analysis is based on 3800 daily profiles measured in the period 2004-2019. Various setups of dense sequential NNs are trained to predict the daily extremes at different lead times. The purpose of the analysis was not to develop a model that would be better than operational numerical weather prediction models but primarily to investigate the capabilities of neural networks for such forecasts. Specifically, our goal was to understand how neural network-based models use different types of input data and how network design and its complexity affect their behavior. The data utilization and behavior of the network depend on whether the NNs are used to do short-term or long-term forecasts - this is why the analysis was performed for a wide range of forecast lead times ranging from 0 to 500 days into the future. The analysis of very simple NNs, consisting of only a few neurons, used to predict same-day extremes highlighted how the nonlinear behavior of the NN increases with the number of neurons. It also showed how different training realizations of the same network could result in different behaviors of the NN. The behavior in the part of the predictor phase space with the highest density of training cases was usually quite similar for all training realizations, while the behavior elsewhere was more variable and more frequently exhibited unusual nonlinearities. We also analyzed more complex NN setups that were used for the short- and long-term forecasts of temperature extremes. Besides the profile measurements, some setups used additional predictors such as the previous-day measurements and climatological values of extremes. The behavior of the setups was also analyzed via two XAI methods, which help determine which input parameters have a more significant influence on the forecasted value.

How to cite: Skok, G., Hoxha, D., and Zaplotnik, Ž.: Forecasting the Daily Temperature Extremes from Radiosonde Measurements Using Neural Networks, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-80, https://doi.org/10.5194/ems2022-80, 2022.

P9
|
EMS2022-593
|
Online presentation
|
Santiago Gaztelumendi, Pedro Liria, Aurelio Diaz de Arcaya, Raul Ruiz, Joseba Egaña, Javier Moreno, Irati Epelde, Artzai Picon, Jone Echazarrra, and Jose Antonio Aranda

The analysis of coastal environments using video images has proven to be an efficient methodology. In the Basque case, the so-called “KOSTASystem”, developed by AZTI in collaboration with different institutional agents, allows different aspects of coastal dynamics to be monitored. This coastal videometry system consists of cameras installed on the coast that allow images to be captured and spatially referenced. Such systems have been used worldwide to monitor the morphological configuration of beaches, dunes, channels and bars, and in recent years also to obtain reliable information on sea conditions and to characterize waves on beaches, among other applications.

Here we present a local application for impact weather monitoring, particularly in the field of “maritime-coastal risk: impact on the coast” as defined by the operational Basque severe weather protocol. Using KOSTASystem video imagery, image processing and computer vision algorithms, we have developed an automatic real-time wave run-up impact monitoring system at selected representative locations on the Basque shoreline.

In this paper, we present the main technical characteristics of the system and its operational implementation at the Basque Meteorology Agency (Euskalmet) for analysis, surveillance and monitoring purposes during and after coastal-maritime risk episodes. We also summarize our experience and some conclusions from the development and operational perspectives.

The KOSTASystem videometry system covers different points of the Basque coast, including port and promenade areas. For instance, in Zarautz it is possible to characterize flooding of the promenade due to waves, and in Bermeo the impact of waves in the port area. The sequences of images captured by the cameras are used to generate both instantaneous images and time-integrated images (timestacks). A timestack is built by accumulating, over a certain period of time, the rows of pixels along a defined transect on the image, previously converted from an oblique view to a planar metric image. As long as these transects are properly determined, they reflect the temporal evolution of the wave run-up and of the flooding or overtopping of the predefined dry area by the body of water, for different positions in the image. Automatic identification of run-up and of partial and total overtopping is then performed by applying computer vision algorithms, such as Otsu's optimal thresholding and the Radon transform, to the different images.
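The timestack construction and the thresholding step can be sketched as follows, in plain Python. This is only an illustration under stated assumptions, not the KOSTASystem implementation: the function names, the synthetic 8-bit frames, and the idea that water and foam appear brighter than the dry area are all hypothetical, and the Radon transform step is omitted.

```python
def build_timestack(frames, transect):
    """Stack the pixel values along one transect from every frame.

    frames: list of 2D grayscale images (lists of rows), assumed already rectified.
    transect: list of (row, col) pixel coordinates along the chosen line.
    Returns one row per frame (time) and one column per transect pixel.
    """
    return [[frame[r][c] for r, c in transect] for frame in frames]


def otsu_threshold(values, levels=256):
    """Otsu's method: return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break  # upper class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Synthetic frames (1 row x 6 pixels): 200 = bright water/foam, 40 = dark dry area.
frames = [
    [[200, 200, 40, 40, 40, 40]],
    [[200, 200, 200, 200, 40, 40]],
    [[200, 200, 200, 40, 40, 40]],
]
transect = [(0, c) for c in range(6)]

timestack = build_timestack(frames, transect)
threshold = otsu_threshold([v for row in timestack for v in row])

# Number of transect pixels classified as wet at each time step.
runup_extent = [sum(1 for v in row if v > threshold) for row in timestack]
print(threshold, runup_extent)
```

Comparing the per-frame extent with the position of a predefined dry area along the transect would then indicate whether overtopping occurred at that time step.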

In this contribution, we present not only details of the implementation of automatic recognition systems using computer vision techniques, but also other aspects relevant to their operational implementation. We summarize key characteristics of the downloading, archiving and integration of different images and products in the automatic monitoring and surveillance systems and their inclusion in the operational intranet in Euskalmet/Tecnalia. Finally, we present some conclusions focusing on usability for different operational tasks such as analysis, surveillance, prediction and validation.

How to cite: Gaztelumendi, S., Liria, P., Diaz de Arcaya, A., Ruiz, R., Egaña, J., Moreno, J., Epelde, I., Picon, A., Echazarrra, J., and Aranda, J. A.: Videometry applied to impact weather characterization: coastal risk in Basque Country., EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-593, https://doi.org/10.5194/ems2022-593, 2022.

P10
|
EMS2022-317
|
Onsite presentation
|
Kevin Horan and Conor Lally

A key component of a weather service’s mission is to capture accurate observations of weather conditions in real time, which are subsequently fed into forecasting models for prediction, used in climate research, and form the basis of the official climate statistics. The accuracy of such measurements is ensured by following a set of robust procedures set out in the World Meteorological Organization guidelines. In addition to appropriate maintenance and calibration of equipment, procedures for quality control (QC) of observations are necessary in order for users to have maximum confidence in the data. Much of this QC tends to be done manually by expert climatologists who examine data and flag erroneous values. However, due to the current proliferation of climate data, more and more organisations are seeking to implement automated, real-time QC to prevent poor-quality data being used in operational products. One potential approach is to use anomaly detection algorithms to identify potentially erroneous values. Such methods form a well-established field in machine learning and are a potential tool for QC of environmental data.

This paper develops and evaluates models to perform automatic QC of Irish weather data (from the Irish meteorological agency Met Éireann) using appropriate machine learning techniques such as Convolutional Neural Networks (CNNs). The outcomes are evaluated in comparison to standard statistical tests which have traditionally been used for these purposes (such as linear regression spatial variability tests). The dataset in question consists of over 10 years of automated temperature observations taken at one-minute intervals using Platinum Resistance Thermometers (PRTs) located at sites across the country. This is complemented by a related dataset that has been manually quality-controlled using well-established methods, which can be used for comparison and verification.
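The abstract does not specify the form of the regression-based spatial test used as a baseline. As a minimal sketch of the general idea, assuming a single neighbouring station and a flagging threshold of three residual standard errors (both assumptions, as are the function names and the sample temperatures), a target observation can be compared with the value predicted from its neighbour:

```python
import math


def fit_line(x, y):
    """Least-squares fit y = intercept + slope * x; also returns the residual standard error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("neighbour values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2))
    return slope, intercept, std_err


def is_suspect(neighbour_value, target_value, slope, intercept, std_err, factor=3.0):
    """Flag the target observation if it departs too far from the regression estimate."""
    expected = intercept + slope * neighbour_value
    return abs(target_value - expected) > factor * std_err


# Hypothetical paired temperatures (deg C) from a neighbour and the target station.
neighbour_history = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
target_history = [9.1, 9.9, 11.2, 11.8, 13.1, 13.9]

slope, intercept, std_err = fit_line(neighbour_history, target_history)

print(is_suspect(12.5, 11.6, slope, intercept, std_err))  # close to the fitted line
print(is_suspect(12.5, 16.0, slope, intercept, std_err))  # far from the fitted line
```

A check of this kind only uses the relationship between stations at a single time; the learned anomaly detectors described in the abstract are evaluated against such tests.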

The paper aims to investigate whether anomaly detection algorithms can correctly identify erroneous values in the time-series to at least the same standard as that achieved through traditional manual approaches. Furthermore, it examines whether these anomalies, once discovered, can be classified into categories with interpretable physical meaning.

How to cite: Horan, K. and Lally, C.: Quality control of weather observations using machine learning, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-317, https://doi.org/10.5194/ems2022-317, 2022.

P11
|
EMS2022-522
|
Onsite presentation
|
Shalenys Bedoya-Valestt, Pablo Rozas-Larraondo, Cesar Azorin-Molina, Carlo Cafaro, Luis Gimeno, Lorenzo Minola, Jose A. Guijarro, Robert Dunn, Enric Aguilar, and Manola Brunet

Sea breezes can occur on any coast of the world, affecting the meteorological variables that are broadly used to detect sea breeze episodes. To date, there is no universal method to identify sea breezes that works all over the globe. Most existing methodologies develop their own selection methods based on thresholds related to the local characteristics of the study site: e.g. wind direction based on the coastline orientation, cloud cover, precipitation, humidity, land-sea air temperature difference, pressure amplitude, and insolation, among others. However, detecting past episodes from predefined criteria makes sea breeze identification dependent on these criteria. As a result, events classified at the same site can differ from one another, and extrapolation to other regions is difficult. The scarcity of high-resolution observed historical data over land and sea surfaces has limited the understanding of sea breezes in many coastal regions across the world. To address the need for a universal method applicable to any coastal region of the globe, here we explore deep learning techniques (e.g. deep convolutional neural networks). We train these models using time series of high-resolution gridded physical reanalysis data (e.g. ERA5-Land) and observed data, after manually identifying 5 years of sea breezes for random stations around the globe. Results from this study constitute the first global historical sea breeze database, spanning approximately the last 40 years. The ability of machine learning models to detect sea breezes allows the development of a simple and universal approach, which will improve the understanding of sea breezes at spatial scales that, as far as we know, have not been addressed before.

How to cite: Bedoya-Valestt, S., Rozas-Larraondo, P., Azorin-Molina, C., Cafaro, C., Gimeno, L., Minola, L., Guijarro, J. A., Dunn, R., Aguilar, E., and Brunet, M.: A deep learning approach for identifying coastal sea breezes globally, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-522, https://doi.org/10.5194/ems2022-522, 2022.
