EGU25-6937, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-6937
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Tuesday, 29 Apr, 08:30–10:15 (CEST), Display time Tuesday, 29 Apr, 08:30–12:30 | Hall X5, X5.14
Lead time-dependent postprocessing of 2-meter temperature forecast using a multivariate generative machine learning model 
Sameer Balaji Uttarwar1, Jieyu Chen2, Sebastian Lerch2, and Bruno Majone1
  • 1Department of Civil, Environmental and Mechanical Engineering, University of Trento, Italy (sameer.uttarwar@unitn.it)
  • 2Institute of Statistics, Karlsruhe Institute of Technology, Karlsruhe, Germany

Preserving the spatiotemporal dependence structure of postprocessed weather forecast variables is essential for reliable hydrological and socio-economic applications. In univariate postprocessing, however, statistical or advanced machine learning techniques are applied independently in each margin, so the multivariate dependence structure present in the raw ensemble forecasts is lost. To restore the spatial or temporal dependence structure of univariately postprocessed forecasts, copula-based methods are traditionally applied as an additional step that uses dependence information from the raw ensemble forecasts or from historical observations. Such a two-step framework, however, has difficulty incorporating exogenous variables when modeling the dependence structure. To overcome this limitation, a multivariate, non-parametric, data-driven distributional regression postprocessing technique based on a generative neural network is employed, which draws samples directly from the multivariate predictive distribution as its output [1]. This study focuses on preserving temporal dependence and compares the performance of the multivariate generative model with that of two-step approaches in postprocessing 2-meter temperature forecasts with a one-month lead time over the Trentino-South Tyrol region in the northeastern Italian Alps. The forecasts come from the fifth-generation seasonal forecasting system (SEAS5) of the European Centre for Medium-Range Weather Forecasts (ECMWF), which has a horizontal grid resolution of 0.125° × 0.125° and 25 ensemble members over a reforecast period from 1981 to 2016. The reference dataset is a high-resolution (250 m × 250 m) gridded observational dataset covering the region. The results are evaluated using multivariate proper scoring rules, namely the energy score and the variogram score, which assess the overall discrepancy between forecasts and observations and the representation of the dependence structure in the postprocessed forecasts.
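The reordering step of a two-step approach can be illustrated with ensemble copula coupling (ECC), one widely used copula-based method that borrows the rank order of the raw ensemble. The abstract does not state which two-step variants were used, so the sketch below is an illustration under that assumption; the function name `ecc_reorder` and the array layout (members in rows, lead times in columns) are hypothetical.

```python
import numpy as np


def ecc_reorder(raw_ens, calibrated_samples):
    """Impose the rank order of the raw ensemble on calibrated samples.

    Both inputs have shape (n_members, n_dims), where each column is one
    margin (for example, one lead time). Hypothetical helper for
    illustration only; not the authors' implementation.
    """
    raw_ens = np.asarray(raw_ens, dtype=float)
    # Sort the univariately postprocessed samples within each margin.
    calibrated = np.sort(np.asarray(calibrated_samples, dtype=float), axis=0)
    # Rank (0 = smallest) of each raw member within each margin.
    order = np.argsort(raw_ens, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Member i receives the calibrated value whose rank matches its raw rank.
    return np.take_along_axis(calibrated, ranks, axis=0)


raw = [[1.0, 30.0], [3.0, 10.0], [2.0, 20.0]]
calibrated = [[12.0, 5.0], [10.0, 7.0], [11.0, 6.0]]
reordered = ecc_reorder(raw, calibrated)
```

Each column of `reordered` contains the same calibrated values as before, so the margins are unchanged, while the ordering across members follows the raw ensemble. Replacing the raw ensemble with past observations as the rank template would give a Schaake-shuffle-style variant.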
The performance analysis shows that the multivariate generative postprocessing model outperforms the two-step approaches over the entire region.
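For reference, the two scoring rules named above can be estimated from an ensemble as in the minimal sketch below. It uses the standard sample-based forms of the energy score and the variogram score; the unit weights and the default order p = 0.5 in the variogram score are assumptions, since the abstract does not specify the settings used, and the function names are hypothetical.

```python
import numpy as np


def energy_score(ens, obs):
    """Sample energy score for an ensemble of shape (n_members, n_dims)
    and an observation vector of shape (n_dims,). Lower is better."""
    ens = np.asarray(ens, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Mean Euclidean distance between ensemble members and the observation.
    term_obs = np.linalg.norm(ens - obs, axis=1).mean()
    # Mean Euclidean distance between all pairs of ensemble members.
    pairwise = np.linalg.norm(ens[:, None, :] - ens[None, :, :], axis=2)
    return term_obs - 0.5 * pairwise.mean()


def variogram_score(ens, obs, p=0.5):
    """Sample variogram score of order p with unit weights (assumed).
    Lower is better."""
    ens = np.asarray(ens, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Observed pairwise differences between margins, raised to the power p.
    obs_vario = np.abs(obs[:, None] - obs[None, :]) ** p
    # Ensemble mean of the same pairwise differences.
    ens_vario = (np.abs(ens[:, :, None] - ens[:, None, :]) ** p).mean(axis=0)
    return ((obs_vario - ens_vario) ** 2).sum()


ens = [[0.0, 0.0], [2.0, 0.0]]
es = energy_score(ens, [1.0, 0.0])
vs = variogram_score(ens, [3.0, 0.0], p=1.0)
```

The energy score responds mainly to errors in the forecast location and spread, whereas the variogram score is more sensitive to misrepresented dependence between margins, which is why the two are usually reported together.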

 

References:

[1] Chen, J., Janke, T., Steinke, F. & Lerch, S. Generative Machine Learning Methods for Multivariate Ensemble Postprocessing. Ann. Appl. Stat. 18, 159–183 (2024).

How to cite: Uttarwar, S. B., Chen, J., Lerch, S., and Majone, B.: Lead time-dependent postprocessing of 2-meter temperature forecast using a multivariate generative machine learning model, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6937, https://doi.org/10.5194/egusphere-egu25-6937, 2025.