4-9 September 2022, Bonn, Germany
EMS Annual Meeting Abstracts
Vol. 19, EMS2022-350, 2022, updated on 28 Jun 2022
https://doi.org/10.5194/ems2022-350
EMS Annual Meeting 2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Conditioned Forecasting Model based on Self-Supervised Learning

Sojung An1, Tae-Jin Oh1, Inchae Na1, Jiyeon Jang1, Wooyeon Park1, Sang-Wook Kim2, Ilseok Noh2, and Junghan Kim3
  • 1Korea Institute of Atmospheric Prediction Systems, Data Application Team, Korea, Republic of (sojungan@kiaps.org)
  • 2Korea Institute of Atmospheric Prediction Systems, Model System Team, Korea, Republic of
  • 3Korea Institute of Atmospheric Prediction Systems, Operational System Group, Korea, Republic of

To capture the spatio-temporal characteristics of the precipitation process in a machine learning context, many studies have applied convolutional and recurrent neural networks. Many state-of-the-art approaches focus on learning a single latent representation for quantitative precipitation forecasting (QPF). Describing reflectivity echoes with a single latent variable may be an overly restrictive assumption that impedes effective learning of precipitation features. We therefore propose a conditioned forecasting model based on self-supervised learning (SSL) that generalizes across diverse precipitation types by allowing multiple latent representations. Our method trains each latent variable according to a condition that is approximated with a generative adversarial network (GAN). Specifically, the model is pre-trained under the same condition to retain the consistency of the latent space while the generator features are trained. The feature matrix of the generator is clustered every 100 epochs using Principal Component Analysis (PCA) and k-means clustering. The dataset consists of Korean summer (JJA) precipitation at 4 km resolution from 2012 to 2021, converted from Constant Altitude Plan Position Indicator (CAPPI) radar reflectivity to rainfall intensity. Each sample is a sequence of 18 frames at 10-minute intervals. The dataset is split into a training set (2012 to 2020) and a test set (2021), containing 9,048 and 729 sequences, respectively. Results show that our method improves the generalization of precipitation features: it achieves comparable or better critical success index (CSI) scores than previous studies for predictions up to 2 hours ahead. Our SSL method can learn useful representations from unlabeled precipitation data and effectively predicts complicated echo patterns.
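For readers unfamiliar with the verification metric, the following is a minimal sketch of how a CSI score is commonly computed from a predicted and an observed rainfall field. The function name `critical_success_index`, the 1.0 mm/h threshold, and the toy values are assumptions chosen for illustration; the abstract does not state which thresholds were used.

```python
import numpy as np

def critical_success_index(pred, obs, threshold):
    """CSI = hits / (hits + misses + false alarms) for events >= threshold."""
    pred_event = np.asarray(pred) >= threshold
    obs_event = np.asarray(obs) >= threshold
    hits = np.sum(pred_event & obs_event)            # predicted and observed
    misses = np.sum(~pred_event & obs_event)         # observed but not predicted
    false_alarms = np.sum(pred_event & ~obs_event)   # predicted but not observed
    denom = hits + misses + false_alarms
    # CSI is undefined when no event is predicted or observed anywhere.
    return float(hits / denom) if denom > 0 else float("nan")

# Toy 2x3 rainfall fields in mm/h (made-up values, not from the study).
obs = np.array([[0.0, 1.2, 3.0], [0.0, 0.4, 5.1]])
pred = np.array([[0.0, 0.8, 2.5], [1.5, 0.0, 4.0]])
csi = critical_success_index(pred, obs, threshold=1.0)
```

In this toy case there are two hits, one miss, and one false alarm, so the score is 2 / (2 + 1 + 1) = 0.5. Correct negatives do not enter the score, which is why CSI is often preferred for comparatively rare events such as heavy rain.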
We also found that, when training GANs, clustering the generator features into more than sixteen condition types makes it much easier to overcome mode collapse, from which many GAN models suffer.
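The clustering step that produces the condition types can be sketched roughly as follows. This is an illustrative NumPy-only version, not the authors' implementation: the feature matrix is random stand-in data, the helper names `pca_reduce` and `kmeans` are hypothetical, and the choices of 8 principal components and 16 clusters are assumptions made only for the example.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per point."""
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (fancy indexing copies).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return assign(points, centroids), centroids

# Hypothetical stand-in for generator features: 256 samples x 64 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(256, 64))
reduced = pca_reduce(features, n_components=8)
conditions, centroids = kmeans(reduced, k=16)
```

Each entry of `conditions` is an integer label that could serve as the condition for the corresponding sample. In a training loop, a step like this would be rerun periodically (every 100 epochs in the abstract) so that the condition labels track the evolving generator features.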

How to cite: An, S., Oh, T.-J., Na, I., Jang, J., Park, W., Kim, S.-W., Noh, I., and Kim, J.: Conditioned Forecasting Model based on Self-Supervised Learning, EMS Annual Meeting 2022, Bonn, Germany, 5–9 Sep 2022, EMS2022-350, https://doi.org/10.5194/ems2022-350, 2022.
