EMS Annual Meeting Abstracts
Vol. 22, EMS2025-47, 2025, updated on 30 Jun 2025
https://doi.org/10.5194/ems2025-47
EMS Annual Meeting 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Fully Data Driven “Multi-Model” Ensemble
Jian Tang1, Yuejian Zhu2, and Kan Dai1
  • 1National Meteorological Center, China Meteorological Administration, Beijing, China
  • 2Earth System Modeling and Prediction Center, China Meteorological Administration, Beijing, China (yuejian.zhu@hotmail.com)

In recent years, data-driven weather models have advanced rapidly, leveraging deep learning and vast reanalysis datasets to generate high-resolution forecasts with remarkable speed. Models such as Pangu-Weather, GraphCast and FourCastNet have demonstrated competitive or even superior accuracy compared to traditional numerical weather prediction (NWP) models. These AI-based models eliminate the need for explicit physical equations, relying instead on learned patterns from historical data. However, challenges remain, including limited physical interpretability, difficulty in handling extreme events, and the need for robust uncertainty quantification. Despite these challenges, the rapid progress of data-driven weather models suggests they will play an increasingly important role in future forecasting systems, potentially complementing or even transforming traditional NWP methods.

Pangu-Weather comprises four AI-driven models, each optimized for different forecast lead times (1-hour, 3-hour, 6-hour, and 24-hour). These models exhibit distinct error growth rates and capture different weather phenomena, ranging from rapidly evolving atmospheric features to large-scale global weather patterns. To leverage these characteristics, this study explores a stochastic combination approach for generating both initial perturbations and model perturbations:

  • Initial perturbations are created by randomly combining the differences between the 48-hour and 24-hour forecasts of the 1-hour, 3-hour, 6-hour and 24-hour models, forming 15 pairs (30 members).
  • Model perturbations are introduced by stochastically combining forecasts from the 3-hour, 6-hour and 24-hour models over the 0-120 hour forecast range.
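
The two perturbation steps listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the forecast differences are weighted or scaled, nor how model steps are drawn, so the Dirichlet weights, the `scale` factor, the independent draw of a step sequence per member, and all function names here are assumptions. The toy damping functions stand in for the 3-hour, 6-hour and 24-hour models and do not call Pangu-Weather.

```python
import numpy as np

STEP_HOURS = (3, 6, 24)  # lead times of the models used for model perturbations


def initial_perturbation_pairs(analysis, diffs, n_pairs=15, scale=1.0, rng=None):
    """Build n_pairs of +/- perturbed initial states (2 * n_pairs members).

    diffs has shape (n_models, ...) and holds, for each model, the difference
    between its 48-hour and 24-hour forecasts.
    """
    rng = np.random.default_rng() if rng is None else rng
    diffs = np.asarray(diffs, dtype=float)
    members = []
    for _ in range(n_pairs):
        # Assumption: random convex weights; the abstract gives no weighting rule.
        weights = rng.dirichlet(np.ones(diffs.shape[0]))
        perturbation = scale * np.tensordot(weights, diffs, axes=1)
        members.append(analysis + perturbation)  # positive member of the pair
        members.append(analysis - perturbation)  # negative member of the pair
    return np.stack(members)


def random_step_sequence(total_hours=120, steps=STEP_HOURS, rng=None):
    """Draw a random sequence of model steps whose lengths sum to total_hours."""
    rng = np.random.default_rng() if rng is None else rng
    smallest = min(steps)
    # Guarantees the loop below can always finish with the smallest step.
    if total_hours % smallest or any(s % smallest for s in steps):
        raise ValueError("total_hours and all steps must be multiples of the smallest step")
    remaining, sequence = total_hours, []
    while remaining > 0:
        allowed = [s for s in steps if s <= remaining]
        step = int(rng.choice(allowed))
        sequence.append(step)
        remaining -= step
    return sequence


def run_member(state, models, sequence):
    """Advance one member by applying the model that matches each step length."""
    for step in sequence:
        state = models[step](state)
    return state


rng = np.random.default_rng(0)
analysis = rng.standard_normal((4, 8))        # toy analysis field
diffs = 0.1 * rng.standard_normal((4, 4, 8))  # toy 48 h minus 24 h differences, one per model
members = initial_perturbation_pairs(analysis, diffs, rng=rng)

# Hypothetical stand-ins for the 3-, 6- and 24-hour models (simple damping).
toy_models = {h: (lambda x, h=h: x * (1.0 - 0.001 * h)) for h in STEP_HOURS}
sequences = [random_step_sequence(rng=rng) for _ in members]
forecasts = np.stack([run_member(m, toy_models, s) for m, s in zip(members, sequences)])
```

Because each perturbation is applied with both signs, the mean of the 30 initial states equals the unperturbed analysis, so in this sketch the ensemble is centred on the analysis by construction.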

This 30-member multi-model ensemble strategy improves the representation of forecast uncertainty for synoptic-scale phenomena in the extratropics by incorporating the strengths of each model while mitigating their individual weaknesses. As a result, it provides a more robust probabilistic forecasting framework at limited cost and improves forecast skill for extended-range atmospheric prediction. However, the data-driven model is biased, which could reduce forecast reliability, and representing tropical uncertainty remains a challenge because of its limited capability to capture small-scale error growth.

How to cite: Tang, J., Zhu, Y., and Dai, K.: Fully Data Driven “Multi-Model” Ensemble, EMS Annual Meeting 2025, Ljubljana, Slovenia, 7–12 Sep 2025, EMS2025-47, https://doi.org/10.5194/ems2025-47, 2025.
