- ¹Deutsches Zentrum für Luft- und Raumfahrt (DLR), Institut für Physik der Atmosphäre, Oberpfaffenhofen, Germany
- ²School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- ³Institute of Environmental Physics (IUP), University of Bremen, Bremen, Germany
Climate models typically operate at coarse spatial resolution (~100 km) because of computational constraints, yet many climate-change impact assessments require fine-scale information (<10 km). In this study, we systematically benchmark three state-of-the-art machine-learning approaches for statistical downscaling, using the storm-resolving ICON NextGEMS dataset as the reference. All methods take coarse-resolution climate fields as input and generate physically plausible high-resolution predictions. We compare: (1) UNet, a deterministic encoder–decoder architecture; (2) CorrDiff, which augments the UNet backbone with a diffusion model to produce probabilistic ensembles; and (3) CorrDiff++, which replaces the diffusion model with flow matching to improve sampling efficiency. We perform 10× downscaling (from 0.56° to 0.056°) over central Europe for six surface variables, including temperature, wind, and precipitation. The models are evaluated along three dimensions: deterministic accuracy (bias, correlation), probabilistic skill (ensemble reliability and sharpness), and physical realism (energy spectra, temporal coherence, representation of extremes). Our results highlight fundamental trade-offs between computational cost, physical consistency, and uncertainty quantification. These insights provide guidance on when the additional complexity of generative models is justified for climate science applications.
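To make the 10× resolution gap and the deterministic accuracy metrics (bias, correlation) concrete, the following is a minimal sketch on synthetic data. It assumes the coarse field is obtained by block-averaging the fine field, which the abstract does not state, and it uses a nearest-neighbour upsampling baseline in place of any of the benchmarked models. The function names `coarsen`, `bias`, and `pearson_correlation` are hypothetical and not taken from the study.

```python
import numpy as np


def coarsen(field, factor=10):
    """Block-average a 2D field by an integer factor along both axes.

    Assumption: block averaging stands in for the (unspecified) way the
    coarse-resolution inputs are produced.
    """
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("field dimensions must be divisible by factor")
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))


def bias(pred, ref):
    """Mean difference between prediction and reference."""
    return float(np.mean(pred - ref))


def pearson_correlation(pred, ref):
    """Pearson correlation over all grid points."""
    return float(np.corrcoef(pred.ravel(), ref.ravel())[0, 1])


# Synthetic stand-in for one high-resolution reference field (not ICON data).
rng = np.random.default_rng(0)
reference = rng.normal(size=(100, 100))

# 10x coarser field, analogous to going from 0.056 deg to 0.56 deg.
coarse = coarsen(reference, factor=10)

# Naive baseline: repeat each coarse cell 10x10 times (nearest neighbour).
baseline = np.repeat(np.repeat(coarse, 10, axis=0), 10, axis=1)

print("bias:", bias(baseline, reference))
print("correlation:", pearson_correlation(baseline, reference))
```

A baseline of this kind preserves the domain mean, so its bias is essentially zero, while its correlation with the reference is limited by the fine-scale variability it cannot represent. That gap is what a learned downscaling model is meant to close.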
How to cite: Debeire, K., Eyring, V., and Thuerey, N.: Benchmarking Deterministic and Generative Machine Learning Models for Statistical Climate Downscaling over Europe, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12407, https://doi.org/10.5194/egusphere-egu26-12407, 2026.