ESSI1.6 | Machine Learning and Digital Twins for Earth System Observation and Prediction
Orals |
Mon, 14:00
Mon, 10:45
Tue, 14:00
EDI
Co-organized by AS5
Convener: Patrick Ebel (ECS) | Co-conveners: Danaele Puechmaille, Christian Lessig, Rochelle Schneider (ECS), Ilaria Luise, Claudia Vitolo, Massimo Bonavita
Orals | Mon, 28 Apr, 14:00–17:55 (CEST) | Room -2.92
Posters on site | Attendance Mon, 28 Apr, 10:45–12:30 (CEST) | Display Mon, 28 Apr, 08:30–12:30 | Hall X4
Posters virtual | Attendance Tue, 29 Apr, 14:00–15:45 (CEST) | Display Tue, 29 Apr, 08:30–18:00 | vPoster spot 4

Orals: Mon, 28 Apr | Room -2.92

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations.
14:00–14:05
14:05–14:15
|
EGU25-15912
|
On-site presentation
Tim Reichelt, Juniper Tyree, Milan Kloewer, Peter Dueben, Bryan Lawrence, Dorit Hammerling, Alisson Baker, Sara Faghih-Naini, and Philip Stier

The rapid growth of weather and climate datasets is increasing the pressure on data centres and hindering scientific analysis and data distribution. For example, kilometre-scale weather and climate models can generate 20 gigabytes of data per second when run operationally, making it generally infeasible to store all output unless advanced compression is applied.

To address this challenge, novel lossy compression techniques have been proposed with compression factors beyond 100x, including, more recently, so-called neural compressors, which learn smaller representations of climate data. However, if applied without care, lossy compression can remove information that is valuable for downstream applications, which are often unknown in advance. It is therefore important to validate that the compression process does not alter scientific conclusions drawn from the data. Whether a compression error is tolerable is rarely well defined and is often easier for domain experts to assess.

Here, we address this challenge by presenting a benchmark suite for lossy compression of climate data (atmosphere, ocean, and land). We define datasets that can be used to train neural compressors, together with corresponding evaluation methods. Compressors must pass a set of tests for each dataset while compressing to the smallest possible file size at a reasonable (de)compression speed. To ensure evaluation on a diverse set of inputs, the benchmark covers climate variables following various statistical distributions at medium to very high resolution in time (hourly to yearly) and space (~1 km to 150 km). Evaluation tests are for single and multi-variable compression of gridded data with stable or changing statistics, random data access or large archives, in medium to very large datasets.
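As a rough illustration of the two quantities such a benchmark has to trade off (compression factor and reconstruction error), the following minimal Python sketch compresses a synthetic field under an absolute error bound. It is not one of the benchmark's compressors: uniform quantisation followed by zlib is a stand-in chosen for brevity, and the field, the grid size, the error bound and all function names are assumptions made for this example.

```python
import zlib
import numpy as np

def compress_with_error_bound(field, abs_error):
    """Quantise to steps of 2 * abs_error, then pack the integers losslessly."""
    step = 2.0 * abs_error
    codes = np.round(field / step).astype(np.int32)
    return zlib.compress(codes.tobytes()), field.shape

def decompress(blob, shape, abs_error):
    """Invert the lossless packing and map integer codes back to values."""
    codes = np.frombuffer(zlib.decompress(blob), dtype=np.int32).reshape(shape)
    return codes * (2.0 * abs_error)

def evaluate(field, abs_error):
    """Report the compression factor and the largest pointwise error."""
    blob, shape = compress_with_error_bound(field, abs_error)
    restored = decompress(blob, shape, abs_error)
    return {
        "compression_factor": field.nbytes / len(blob),
        "max_abs_error": float(np.max(np.abs(restored - field))),
    }

# Hypothetical smooth, temperature-like field (kelvin) on a 180 x 360 grid
rng = np.random.default_rng(0)
lat = np.linspace(-np.pi / 2, np.pi / 2, 180)[:, None]
lon = np.linspace(0, 2 * np.pi, 360)[None, :]
field = 288.0 + 30.0 * np.cos(lat) * np.sin(lon) + rng.normal(0, 0.1, (180, 360))

report = evaluate(field, abs_error=0.05)
print(report)
```

A test in the spirit of the benchmark would then check that `max_abs_error` stays within the tolerance a downstream application can accept, and rank compressors that pass by `compression_factor`.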

To indicate what compression levels current state-of-the-art lossy compressors can achieve, we also evaluate a set of baseline compressors (SZ3, ZFP, Real Information) on our benchmark tasks. The benchmark is a quality check for new compressors and a step towards the standardization of climate data compression, aiming to make compressors with high compression factors safe to use and widely supported.

How to cite: Reichelt, T., Tyree, J., Kloewer, M., Dueben, P., Lawrence, B., Hammerling, D., Baker, A., Faghih-Naini, S., and Stier, P.: ClimateBenchPress: A Benchmark for Compression of Climate Data, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-15912, https://doi.org/10.5194/egusphere-egu25-15912, 2025.

14:15–14:25
|
EGU25-742
|
ECS
|
On-site presentation
Thiago Rocha

Urban heat islands (UHI) exacerbate health and environmental challenges, disproportionately affecting vulnerable populations. This study identifies high-risk areas for UHI effects in the Americas, including their metropolitan regions, using a suitability analysis model. It highlights the interplay between urban expansion, social vulnerability, and climate stress, emphasizing the urgency of addressing these issues in rapidly urbanizing contexts.

High-resolution satellite imagery and geospatial data were used to build the model. Key criteria included population density (dasymetric layers from WorldPop), relative wealth index, land surface temperature (LST) from MODIS, land cover from MODIS, PM2.5 and NO2 concentrations (Sentinel-5P), and road network layers derived for the analysis. Each criterion was reclassified, transformed to a common scale, and weighted equally to ensure consistency and comparability.

The suitability index was generated using raster algebra (weighted sum), producing a continuous map where higher values indicate greater susceptibility to heat stress and lower socioeconomic status. The analysis revealed spatial patterns that highlight areas with high potential impacts due to UHI characteristics.
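The weighted-sum step can be sketched in a few lines of Python. This is a minimal illustration, not the study's workflow: min-max scaling as the "common scale", the inversion of the wealth layer (so that lower wealth raises the index), the layer names and the toy 2 x 2 rasters are all assumptions made for this example.

```python
import numpy as np

def min_max_scale(layer):
    """Rescale a raster to the range [0, 1], ignoring NaN cells."""
    lo, hi = np.nanmin(layer), np.nanmax(layer)
    if hi == lo:
        return np.zeros_like(layer, dtype=float)
    return (layer - lo) / (hi - lo)

def suitability_index(layers, weights=None, invert=()):
    """Weighted sum of rescaled rasters; equal weights unless given."""
    names = list(layers)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    first = np.asarray(layers[names[0]], dtype=float)
    total = np.zeros_like(first)
    for name in names:
        scaled = min_max_scale(np.asarray(layers[name], dtype=float))
        if name in invert:
            # Low values of this layer should increase the index
            scaled = 1.0 - scaled
        total += weights[name] * scaled
    return total

# Hypothetical co-registered rasters on a tiny 2 x 2 grid
layers = {
    "land_surface_temperature": [[30.0, 40.0], [35.0, 45.0]],
    "population_density": [[100.0, 900.0], [500.0, 1000.0]],
    "relative_wealth": [[0.8, 0.2], [0.5, 0.1]],
}
index = suitability_index(layers, invert=("relative_wealth",))
print(index)
```

In this toy grid the bottom-right cell is the hottest, densest and least wealthy, so it receives the highest index value.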

The suitability index serves as a tool for identifying priority areas for targeted interventions and climate mitigation actions. This integrative approach highlights the need for sustainable urban development policies that reduce socio-environmental disparities and promote resilience in vulnerable communities.

How to cite: Rocha, T.: Identifying Urban Heat Island Risk Areas with Vulnerable Populations: A Suitability Analysis Approach to support Health Interventions in the Americas, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-742, https://doi.org/10.5194/egusphere-egu25-742, 2025.

14:25–14:35
|
EGU25-11883
|
ECS
|
On-site presentation
Rachel Furner, Rilwan Adewoyin, Mario Santa Cruz, Sara Hahner, Sarah Keeley, Kristian Mogensen, and Lorenzo Zampieri

Machine learning (ML) techniques have emerged as a powerful tool for predicting weather and climate systems, particularly the short-term evolution of the atmosphere. Here, we look at the potential for ML to predict the evolution of the three-dimensional ocean.

We present a data-driven global ocean model, developed within the Destination Earth project, to form the ocean component of a fully data-driven Earth system model. Following the skill shown by the AIFS (Lang et al., 2024), we use a graph-based encoder-decoder design with a transformer backbone. Our model is trained on the ECMWF ORAS6 reanalysis dataset (Zuo et al., 2024). Work focuses on short-term predictions, up to a 2-week forecast period. The model predicts temperature, salinity, and zonal and meridional current components throughout the full ocean depth, along with sea-surface height and sea ice.

In this presentation we will discuss the design choices of our network architecture, including comparisons between networks trained to predict future fields, and those trained to predict increments to fields. We will show results from our data-driven model and put these into the context of other similar models. 

Simon Lang, Mihai Alexe, Matthew Chantry, Jesper Dramsch, Florian Pinault, Baudouin Raoult, Mariana C. A. Clare, Christian Lessig, Michael Maier-Gerber, Linus Magnusson, Zied Ben Bouallègue, Ana Prieto Nemesio, Peter D. Dueben, Andrew Brown, Florian Pappenberger, and Florence Rabier (2024). AIFS – ECMWF’s data-driven forecasting system. arXiv preprint https://arxiv.org/abs/2406.01465.  

Hao Zuo, Magdalena Alonso-Balmaseda, Eric de Boisseson, Philip Browne, Marcin Chrust, Sarah Keeley, Kristian Mogensen, Charles Pelletier, Patricia de Rosnay, Toshinari Takakura (2024). ECMWF’s next ensemble reanalysis system for ocean and sea ice: ORAS6. ECMWF newsletter. https://doi.org/10.21957/hzd5y821lk  

How to cite: Furner, R., Adewoyin, R., Santa Cruz, M., Hahner, S., Keeley, S., Mogensen, K., and Zampieri, L.: Developing a data-driven global ocean model at ECMWF, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-11883, https://doi.org/10.5194/egusphere-egu25-11883, 2025.

14:35–14:45
|
EGU25-17080
|
ECS
|
On-site presentation
Matthias Karlbauer, Florian M. Hellwig, Thomas Jagdhuber, and Martin V. Butz

With the increasing availability of, and demand for, remote sensing data from Earth observation satellites, the accuracy of weather prediction models can be improved substantially. Satellite products such as Land-Surface Temperature (LST), however, suffer from missing data, caused by clouds that cover the ground, by missing spatial coverage of the mission, or by sensor outages. Such spatial data gaps in LST products impose strict limitations on further processing, e.g., with numerical weather prediction models that assume spatially continuous, gapless input data. We therefore propose a gap-filling algorithm based on a masked autoencoder that receives only a small fraction of a 32x32 LST snapshot and learns to reconstruct the missing patches. We use the spatial domain defined by the Land Atmosphere Feedback Initiative (LAFI) over central Europe and operate on geostationary LST data from the Copernicus Global Land Service in June 2023 at 5 km resolution. Our approach indicates considerable potential for filling spatial gaps in LST products. However, we emphasize one critical aspect: the LST estimates below clouds cannot be expected to be realistic and would require a sophisticated atmospheric correction. To mitigate this limitation, we aim in future to incorporate microwave data, which penetrate clouds and could therefore help to estimate LST below clouds. In its current formulation, our algorithm can be used to fill gaps in LST products as if there were no clouds. We will show the potential and limitations of the autoencoder-based gap-filling algorithm for several showcases across Europe.
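The masking step of a masked autoencoder can be sketched as follows in Python. This is only an illustration of the general idea: the patch size of 4 pixels, the 25 % keep fraction, the synthetic snapshot and the function names are assumptions made for this example and are not taken from the abstract. The encoder-decoder network itself is omitted.

```python
import numpy as np

def random_patch_mask(image, patch=4, keep_fraction=0.25, rng=None):
    """Hide all but a random subset of square patches (hidden pixels -> NaN)."""
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape
    grid_h, grid_w = height // patch, width // patch
    n_patches = grid_h * grid_w
    n_keep = max(1, int(round(keep_fraction * n_patches)))
    keep = np.zeros(n_patches, dtype=bool)
    keep[rng.choice(n_patches, size=n_keep, replace=False)] = True
    # Expand the patch-level mask to pixel resolution
    pixel_mask = (
        keep.reshape(grid_h, grid_w)
        .repeat(patch, axis=0)
        .repeat(patch, axis=1)
    )
    masked = np.where(pixel_mask, image, np.nan)
    return masked, pixel_mask

def masked_mse(prediction, target, pixel_mask):
    """Reconstruction error evaluated only on the hidden pixels."""
    hidden = ~pixel_mask
    return float(np.mean((prediction[hidden] - target[hidden]) ** 2))

# Hypothetical 32 x 32 LST snapshot in kelvin
rng = np.random.default_rng(1)
snapshot = 290.0 + rng.normal(0.0, 3.0, (32, 32))
masked, visible = random_patch_mask(snapshot, rng=rng)
print(int(visible.sum()), "of", visible.size, "pixels visible")
```

During training, the network would see `masked` and be scored with `masked_mse` against `snapshot`; at inference time, the mask would instead come from the real gaps in the product.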

How to cite: Karlbauer, M., Hellwig, F. M., Jagdhuber, T., and Butz, M. V.: Spatial Gap Filling in a Geostationary Land-Surface Temperature Product with a Masked Autoencoder, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-17080, https://doi.org/10.5194/egusphere-egu25-17080, 2025.

14:45–14:55
|
EGU25-11988
|
On-site presentation
Marcin Chrust, Alban Farchi, Massimo Bonavita, Marc Bocquet, and Patrick Laloyaux

Systematic model errors significantly limit the predictability horizon and practical utility of current state-of-the-art forecasting systems. Even though accounting for these systematic model errors is increasingly viewed as a fundamental challenge in numerical weather prediction, estimation and correction of the predictable component of the model error have received relatively little attention. Modern implementations of weak-constraint 4D-Var are an exception and a promising avenue within the variational data assimilation framework, showing encouraging results. Weak-constraint 4D-Var can be viewed as an online hybrid data assimilation and machine learning approach that gradually learns about model errors from partial and imperfect observations, thereby improving the state estimation. We propose a natural extension of this approach by applying deep learning techniques to further develop the concept of online model error estimation and correction.

In this talk, we will present recent progress in developing a hybrid model for the ECMWF Integrated Forecasting System (IFS). This system augments the state-of-the-art physics-based model with a statistical model implemented via a neural network, providing flow-dependent model error corrections. While the statistical model can be pre-trained offline, we demonstrate that by extending the 4D-Var control vector to include the parameters of the neural network, i.e. the model of model error, we can further improve its predictive capability. We will discuss how applying the flow-dependent model error corrections in medium-range forecasts affects forecast quality.

How to cite: Chrust, M., Farchi, A., Bonavita, M., Bocquet, M., and Laloyaux, P.: Development of an offline and online hybrid model for the Integrated Forecasting System, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-11988, https://doi.org/10.5194/egusphere-egu25-11988, 2025.

14:55–15:05
|
EGU25-17875
|
ECS
|
On-site presentation
Dohyung Kim and Kelsey Doerksen

The Children’s Climate Risk Index (CCRI) was first released in 2021, providing a comprehensive, global view of children’s exposure and vulnerability to the impacts of climate change. The CCRI is a composite index that aims to rank countries where children are exposed to climate and environmental hazards. The CCRI 2.0 builds on the previous index by integrating two pillars: Pillar 1 focuses on climate hazards and Pillar 2 on inherent vulnerabilities related to WASH, health, education and other relevant dimensions.

We highlight our contributions to CCRI 2.0, using a cluster methodology for quantifying children’s exposure to climate risks including riverine and coastal flooding, tropical storms, heatwaves, and drought. Using unsupervised learning, we allow for a data-driven approach to provide an interpretation of the ranking of children’s exposure to climate risks on a global scale between countries, as well as at the sub-national and local levels. It complements the previous method of constructing the synthesized index, which involved calculating the simple average of multiple indicators. We further discuss our techniques in tackling the challenges of multisource data processing, analysis, and visualization of geospatial data for user insight.

How to cite: Kim, D. and Doerksen, K.: Quantifying Children’s Exposure to Climate Risks using Unsupervised Learning with Multi-Source Geospatial Datasets, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-17875, https://doi.org/10.5194/egusphere-egu25-17875, 2025.

15:05–15:15
|
EGU25-4499
|
ECS
|
On-site presentation
Lorenzo Zampieri, Harrison Cook, Rachel Furner, Sara Hahner, Florian Pinault, Baudouin Raoult, Nina Raoult, Mario Santa Cruz, and Matthew Chantry

Machine learning models have emerged as powerful tools for simulating Earth system processes. Following their successful application in capturing atmospheric evolution for medium-range weather forecasts, attention has increasingly shifted towards other components of the Earth system, such as the marine and land environments. This interest is further driven by the potential to enhance forecasting capabilities beyond the medium range. Machine learning frameworks offer remarkable flexibility in integrating these model components to achieve a coherent Earth system representation. At one end of the spectrum, model components can be trained jointly within a unified framework optimised using a shared loss function. At the other end, components may be developed independently and coupled by exchanging physically relevant information at multiple interfaces, mirroring the traditional coupling strategies employed in numerical models. In this presentation, we will examine the advantages and challenges of these approaches, with a particular emphasis on coupling the atmospheric, land, and marine components within the deterministic AIFS model, the machine learning-based forecast system developed at ECMWF. Furthermore, we will compare the coupling strategies of data-driven models with those of traditional numerical models, highlighting their strengths and limitations.

How to cite: Zampieri, L., Cook, H., Furner, R., Hahner, S., Pinault, F., Raoult, B., Raoult, N., Santa Cruz, M., and Chantry, M.: Coupling approaches for data-driven Earth system models, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-4499, https://doi.org/10.5194/egusphere-egu25-4499, 2025.

15:15–15:25
|
EGU25-5909
|
ECS
|
On-site presentation
Karan Purohit, Mitali Sinha, Aniruddha Panda, Subhasis Banerjee, and Ravi S Nanjundiah

In recent years, medium-range AI weather forecasting models have improved significantly, now offering forecasting accuracy comparable to classical numerical weather prediction (NWP) models, while also being faster and (once trained) less computationally demanding.

Due to inherent assumptions and limitations, all weather prediction models exhibit some degree of persistent systematic errors, also called biases, in their forecast output, with certain models performing better than others for specific variables and regions.

To address these persistent biases, we introduce a machine learning-based multi-model super-ensemble (MMSE), which collectively reduces model biases by combining the complementary strengths of each model. The MMSE assigns optimized weights to each model's forecast based on its historical performance to leverage each model’s strengths under different conditions (both spatial and temporal) rather than equally weighting models as in a simple ensemble mean.

In this work, we developed two regional MMSE models tailored to specific regions, seasons, and variables of interest. One model targets 2-meter air temperature and 10-meter wind components in Germany’s winter season, while the other targets Indian summer monsoon rainfall.

We trained the MMSE using an Extreme Gradient Boosting framework (XGBoost) to capture spatiotemporal features more effectively. The training data consisted of past forecasts from multiple AI models (FourCastNet, Pangu-Weather, GraphCast) and relevant climatology and topology data. ERA5 reanalysis served as the ground truth. The details of MMSE development will be presented.
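The difference between learned weights and a simple ensemble mean can be illustrated with a minimal Python sketch. Note the substitution: the abstract uses XGBoost, whereas this example fits a plain linear least-squares combination with NumPy so that it stays self-contained. The three synthetic "models", their noise levels and biases, and all names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
truth = rng.normal(10.0, 5.0, n)  # synthetic stand-in for reanalysis values

# Three hypothetical forecast models with different noise levels and biases
forecasts = np.column_stack([
    truth + rng.normal(0.0, 0.5, n),
    truth + 1.0 + rng.normal(0.0, 1.0, n),
    truth + rng.normal(0.0, 2.0, n),
])

train, test = slice(0, n // 2), slice(n // 2, n)

def fit_weights(x, y):
    """Least-squares weights for each model plus an intercept (bias) term."""
    design = np.column_stack([x, np.ones(len(x))])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

def predict(x, coeffs):
    """Apply the learned weights and intercept to new forecasts."""
    return x @ coeffs[:-1] + coeffs[-1]

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

coeffs = fit_weights(forecasts[train], truth[train])
rmse_weighted = rmse(predict(forecasts[test], coeffs), truth[test])
rmse_equal = rmse(forecasts[test].mean(axis=1), truth[test])
print("learned weights:", coeffs[:-1], "intercept:", coeffs[-1])
print("RMSE weighted:", rmse_weighted, "RMSE equal-weight mean:", rmse_equal)
```

On held-out data the learned combination favours the least noisy model and removes the constant bias, which the equal-weight mean cannot do. A tree-based learner such as XGBoost can additionally let the weighting vary with location, season and lead time.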

Our MMSE developed for 2-meter temperature over Germany showed approximately a one-day improvement in forecast gain time compared to the best-performing individual model. In other words, the MMSE’s 11th-day forecast matched the accuracy of the 10th-day forecast from the best-performing model, effectively adding an extra day of reliable lead time. These findings suggest that the proposed MMSE offers a promising, computationally efficient alternative to traditional ensembles for real-time weather forecasting, with potential applications in domains requiring high-precision predictions. To make these results interpretable and to identify the relative strengths of the participating models, we will also present an analysis of SHAP values for various variables and regions.

How to cite: Purohit, K., Sinha, M., Panda, A., Banerjee, S., and S Nanjundiah, R.: Enhancing the Skill of Medium Range Forecasts with a Machine Learning Based Multi-Model Super-Ensemble (MMSE), EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5909, https://doi.org/10.5194/egusphere-egu25-5909, 2025.

15:25–15:35
|
EGU25-8169
|
ECS
|
Highlight
|
On-site presentation
Amal John, Thomas Rackow, Nikolay Koldunov, Sebastian Beyer, Antonio Sanchez Benitez, Helge Gößling, Marylou Athanase, and Thomas Jung

Artificial Intelligence-based Numerical Weather Prediction (AI-NWP) models have recently emerged as powerful tools for weather forecasting, offering computational efficiency and high accuracy. This study explores the extreme weather events simulated by the Artificial Intelligence Forecasting System (AIFS), initialised with conditions derived from kilometer-scale storyline experiments using the IFS-FESOM model, in which the atmospheric circulation is constrained to observations. We present two case studies: the 2023 South Asian humid heatwave and the 2024 Storm Boris. These two events are reproduced in the present climate and also simulated as if they were to unfold in pre-industrial and +2K future climates, effectively creating AI-driven storylines. The methodology we employ offers a complementary framework in which AI-driven ensembles provide a scalable and rapid way to assess the potential uncertainty and variability associated with such events, enabling us to explore a broader range of plausible outcomes at very low computational cost. By combining the strengths of physics-based modelling with the efficiency and flexibility of AI-driven simulations, this dual approach offers a pathway to operationalise ensemble-based extreme weather storylines.

How to cite: John, A., Rackow, T., Koldunov, N., Beyer, S., Sanchez Benitez, A., Gößling, H., Athanase, M., and Jung, T.: Exploring AI-Driven Event-based Storylines, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-8169, https://doi.org/10.5194/egusphere-egu25-8169, 2025.

15:35–15:45
|
EGU25-15669
|
ECS
|
On-site presentation
Milton Gomez, Tom Beucler, Alexis Berne, and Louis Poulain--Auzéau

Numerical Weather Prediction (NWP) models, which integrate physical equations forward in time, are the traditional tools for simulating atmospheric processes and forecasting weather in modern meteorology. With recent advancements in deep learning, Neural Weather Models (NeWMs) have emerged as competent medium-range NWP emulators with reported performances that compare favorably to state-of-the-art NWP models. However, they are commonly trained on reanalysis with limited spatial resolution (e.g., 0.25° horizontal grid spacing) and thus smooth out key features associated with a number of weather phenomena. For example, tropical cyclones—among the most impactful weather events due to their devastating effects on human activities—are challenging to forecast, as extrema like wind gusts, which serve as proxies for tropical cyclone intensity, are smoothed in deterministic forecasts at 0.25° resolution. To address this, we use our best global observational estimate of wind gusts and minimum sea level pressure to train models that post-process NeWM outputs and enable accurate and reliable forecasts of TC intensity. We present a tracking-independent post-processing algorithm and show that even naïve, linear models extract useful information from NeWM model outputs beyond what is present in the initial conditions used to roll out NeWM predictions. We explore how the NeWM’s spatial context may further improve the forecast through masking and convolutional architectures. Our post-processing framework thus presents a step towards democratization of tropical cyclone intensity forecasting, given the reduction in computational requirements for producing global weather forecasts with NeWMs compared to traditional NWP approaches and the algorithmic simplicity of the tracking-independent approach.

How to cite: Gomez, M., Beucler, T., Berne, A., and Poulain--Auzéau, L.: Post-Processing Neural Weather Model Outputs for Tropical Cyclone Intensity Forecasts, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-15669, https://doi.org/10.5194/egusphere-egu25-15669, 2025.

Coffee break
16:15–16:25
|
EGU25-18447
|
ECS
|
On-site presentation
Graham Clyne, Guillaume Couairon, Juliette Mignot, Guillaume Gastineau, Anastase Charantonis, and Claire Monteleoni

Sampling from climate models to generate ensembles of predictions is computationally expensive (Hawkins et al., 2015). Climate model ensembles are used to understand probabilities of climatic events and identify internal variability in climate models. In the short term, model uncertainty and inter-annual variability dominate uncertainty in climate predictions (Smith et al., 2019). A typical approach to address these uncertainties is to use large ensembles of non-learned, physical numerical global circulation models (GCMs) (Eade et al., 2014). These ensembles allow for statistical analysis of distributions and determination of internal variability in the climate model.

Our approach demonstrates that we can efficiently learn to emulate a GCM. We use ensembles generated by the IPSL submission to the Decadal Climate Prediction Project (DCPP). The dataset ranges from 1960 to 2016 and provides 10-member, 10-year forecast ensembles for each year. On this dataset, we train a modified version of ArchesWeatherGen, a Swin Transformer based on PanguWeather that can be used in a generative way using flow matching (Couairon et al., 2024). The model was modified to predict additional climatic variables (e.g. air temperature, specific humidity, ocean potential temperature at depth, sea surface temperature, sea level pressure) at a monthly temporal resolution. Once trained, the model rapidly generates ensemble members probabilistically, and these can be rolled out auto-regressively. We show that the generated members are physically reliable via evaluation methods that assess physical processes derived from the variables represented in the machine learning model, for example by evaluating the model on El Niño/La Niña events. This model demonstrates that machine learning can enhance climate models by expanding ensemble sizes to improve our understanding of climatic processes. We aim to output physically realizable month-to-month trajectories to estimate future climate and its uncertainties across various domains, including land, ocean, and atmospheric processes.

Couairon, G., Singh, R., Charantonis, A., Lessig, C., & Monteleoni, C. (2024). ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting. arXiv preprint arXiv:2412.12971.

Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., Hermanson, L., & Robinson, N. (2014). Do Seasonal-to-Decadal Climate Predictions Underestimate the Predictability of the Real World? Geophysical Research Letters, 41(15), 5620–5628. https://doi.org/10.1002/2014GL061146.

Hawkins, E., Smith, R. S., Gregory, J. M., & Stainforth, D. A. (2016). Irreducible Uncertainty in Near-Term Climate Projections. Climate Dynamics, 46(11), 3807–3819. https://doi.org/10.1007/s00382-015-2806-8.

Smith, D. M., Eade, R., Scaife, A. A., Caron, L.-P., Danabasoglu, G., DelSole, T. M., Delworth, T., et al. (2019). Robust Skill of Decadal Climate Predictions. npj Climate and Atmospheric Science, 2(1), 1–10. https://doi.org/10.1038/s41612-019-0071-y.

How to cite: Clyne, G., Couairon, G., Mignot, J., Gastineau, G., Charantonis, A., and Monteleoni, C.: ArchesClimate: Ensemble Generation for Decadal Prediction using Flow Matching, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-18447, https://doi.org/10.5194/egusphere-egu25-18447, 2025.

16:25–16:35
|
EGU25-19942
|
On-site presentation
Maximilian Witte, Johannes Meuer, and Kadow Christopher

High-resolution machine learning faces the challenge of balancing local computation with large physical context windows. GPU memory limitations and the slow training process when distributing the model across multiple GPUs further complicate this task. 

We present a transformer model for high-resolution climate-related tasks that uses neural operators within a multi-grid architecture. This approach allows resolution independence, large physical context windows, and the handling of discontinuities such as coastlines.

The model is spatially flexible, supporting both regional and global training schemes. It is also independent of the number of input variables, allowing training to be scaled to large numbers of input variables.

We demonstrate the ability of the model to scale, both spatially and in terms of variables. The model forms the foundation of an approach to learn from the rich and diverse climate data available, enabling high-resolution downscaling, infilling, and predictions.

How to cite: Witte, M., Meuer, J., and Christopher, K.: A Spatial Multi-Grid Neural Operator-Transformer Model for High-Resolution Climate Modeling, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-19942, https://doi.org/10.5194/egusphere-egu25-19942, 2025.

16:35–16:45
|
EGU25-9956
|
ECS
|
On-site presentation
Eliot Walt, Wessel Bruinsma, Maurice Schmeits, Efstratios Gavves, and Dim Coumou

Sub-seasonal to seasonal (S2S) timescales range from two weeks to three months and are crucial for making informed climate change-related decisions, including renewable energy resource allocation, extreme-event risk mitigation, and the development of effective early warning systems. Unfortunately, traditional physics-based forecasting systems achieve poor skill at these lead times. Recently, deep learning (DL) has shown promising results in weather forecasting on timescales up to 10 days, reaching performance competitive with that of physical models. However, these DL approaches currently struggle on S2S timescales.

Following previous studies on neural solvers for partial differential equations and weather forecasting, we propose a fine-tuning framework aimed at improving the S2S prediction skill of foundation weather models. Our approach has two core components. First, we implicitly condition the latent space embeddings to retain the predictable signals at a given lead time using an additional regression head. Second, we design a novel frequency-domain decoder and loss function to ensure spectral consistency. These steps should ensure that the model focuses on the most predictable frequencies. We apply this methodology to the recently published Aurora foundation model and propose Xaurora, standing for “extended Aurora”. Our fine-tuning approach represents an important milestone in data-driven S2S forecasting, addressing key challenges in the field while remaining broadly applicable with minimal assumptions on the underlying model’s architecture. 
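One simple way to express a spectral consistency penalty is to compare the amplitude spectra of predicted and target fields. The Python sketch below is a generic illustration only: it is not the Xaurora loss or decoder, and the zonal Fourier transform, the wavenumber cut-off, the synthetic fields and the function name are assumptions made for this example.

```python
import numpy as np

def spectral_loss(pred, target, max_wavenumber=None):
    """Mean squared difference between zonal amplitude spectra.

    Fields have shape (..., n_lon); the FFT is taken along the last axis.
    If max_wavenumber is given, only wavenumbers 0..max_wavenumber count.
    """
    pred_amp = np.abs(np.fft.rfft(pred, axis=-1))
    target_amp = np.abs(np.fft.rfft(target, axis=-1))
    if max_wavenumber is not None:
        pred_amp = pred_amp[..., : max_wavenumber + 1]
        target_amp = target_amp[..., : max_wavenumber + 1]
    return float(np.mean((pred_amp - target_amp) ** 2))

# Hypothetical fields on 4 latitude rows and 128 longitudes
lon = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
large_scale = np.sin(3.0 * lon)
small_scale = 0.5 * np.sin(20.0 * lon)
target = np.tile(large_scale + small_scale, (4, 1))
smooth_pred = np.tile(large_scale, (4, 1))

full_loss = spectral_loss(smooth_pred, target)
low_pass_loss = spectral_loss(smooth_pred, target, max_wavenumber=10)
print(full_loss, low_pass_loss)
```

Here the smooth prediction is penalised over the full spectrum because it lacks the wavenumber-20 component, while the loss restricted to low wavenumbers is essentially zero. Restricting or reweighting the spectrum in this way is one possible route to focusing a model on the scales that remain predictable at long lead times.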

The relevance of our framework is evaluated through ablation studies, comparing our spectral consistency fine-tuning to the original Aurora model. Furthermore, we provide standard deterministic and probabilistic skill scores on S2S timescales, as well as relevant teleconnection indexes. We present preliminary outputs of this analysis. 

How to cite: Walt, E., Bruinsma, W., Schmeits, M., Gavves, E., and Coumou, D.: Xaurora: Advancing subseasonal-to-seasonal forecasting by fine-tuning foundation weather models with spectral consistency, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9956, https://doi.org/10.5194/egusphere-egu25-9956, 2025.

16:45–16:55
|
EGU25-15811
|
On-site presentation
Jenni Kontkanen, Mario Acosta, Pierre-Antoine Bretonnière, Miguel Castrillo, Paolo Davini, Francisco Doblas-Reyes, Barbara Früh, Jost von Hardenberg, Thomas Jung, Heikki Järvinen, Daniel Klocke, Nikolay Koldunov, Pekka Manninen, Sebastian Milinski, Jarmo Mäkelä, Devaraju Naraynappa, Suraj Polade, Irina Sandu, Outi Sievi-Korte, and Stephan Thober

The Climate Change Adaptation Digital Twin (Climate DT) is part of the Destination Earth (DestinE) initiative, which develops Digital Twins of Earth to increase resilience against environmental changes. More specifically, Climate DT provides capabilities supporting climate change adaptation at regional and national levels at multi-decadal time scales. We present here an overview of Climate DT, highlighting the added value for users and discussing the transition of the system towards operations.

The development of Climate DT started in Phase 1 of DestinE, during which the first prototype of the new climate information system was developed. A key innovation of Phase 1 was the introduction of a generic state vector (GSV), which is evolved by the Earth system models (ESMs) and streamed to applications from climate adaptation impact sectors. This has created a basis for a pioneering climate information system that enables (i) provision of global climate information at an unprecedented granularity, (ii) scaling the system across a number of applications that have access to all the data they need, and (iii) a user-centric approach with new ways of co-design and opportunities for enhancing interactivity. In Phase 2, which started in May 2024, our focus is on operationalizing Climate DT to deliver high-quality climate and impact-sector information regularly while incorporating new interactive features.

The operational model of the Climate DT is built around three storm- and eddy-resolving ESMs: ICON, IFS-NEMO and IFS-FESOM. The operational framework utilizes a DevOps-like cycle with three set-ups: d-suite for development, e-suite for testing the operational set-up and o-suite for operating the system. The o-suite simulations will provide data covering both past (1990–2020) and future (2020–2050) periods on a 5 km global grid. Additionally, capabilities for special simulations are being developed, including storyline simulations for future periods of extremes as well as what-if scenario simulations enabling a new level of interactivity.

The added value of Climate DT to users is demonstrated through four impact sector applications. These applications operate on the streamed GSV as part of the operational framework, and they are improved in co-design with key users. The impact sector applications cover societally relevant climate change adaptation domains, including wind energy management, disaster risk management (with regard to wildfires and floods), as well as agriculture and water management. Climate DT output, including high-resolution climate simulations, storyline simulations, user-relevant indicators and impact assessments, is made available to users via the DestinE Service Portal.

How to cite: Kontkanen, J., Acosta, M., Bretonnière, P.-A., Castrillo, M., Davini, P., Doblas-Reyes, F., Früh, B., von Hardenberg, J., Jung, T., Järvinen, H., Klocke, D., Koldunov, N., Manninen, P., Milinski, S., Mäkelä, J., Naraynappa, D., Polade, S., Sandu, I., Sievi-Korte, O., and Thober, S.: Climate Adaptation Digital Twin – building an operational climate information system to support decision-making, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-15811, https://doi.org/10.5194/egusphere-egu25-15811, 2025.

16:55–17:05
|
EGU25-5110
|
On-site presentation
Roger Randriamampianina and the DestinE Extremes Digital Twin Team

Our presentation describes the development and operationalisation of the Destination Earth (DestinE) Extremes Digital Twin (DT), including the On-Demand component, a system designed to improve the prediction and management of extreme weather events in Europe. The system leverages high-resolution weather models using information from Extreme Detection (EDF) and Triggering (DTF) Frameworks, as well as the ECMWF ensemble, incorporating impact-specific models for hydrology, air quality, renewable energy, and more. A key component is a configuration lookup table prioritising end-user needs and available resources. The system incorporates various masking techniques (ACCORD model configurations, geographical, capacity, event type) to refine forecasts. The presentation describes the system's architecture, data sources, and workflow, emphasising the integration of multiple models and data sources, and the use of cutting-edge technologies such as GPUs and machine learning for enhanced forecasting and efficient resource utilisation. Pilot regions are used for testing and operationalisation, with a phased approach planned for broader deployment. The project addresses challenges in forecasting accuracy, communication of uncertainty, and the integration of forecasts into decision-making processes across various sectors.

How to cite: Randriamampianina, R. and the DestinE Extremes Digital Twin Team: Digital twin for weather-induced extremes, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5110, https://doi.org/10.5194/egusphere-egu25-5110, 2025.

17:05–17:15
|
EGU25-10817
|
ECS
|
On-site presentation
Sina Montazeri, Miruna Stoicescu, Oriol Hinojo Comellas, Danaële Puechmaille, and Michael Schick

Destination Earth (DestinE) is the European Commission’s initiative to gradually develop Digital Twins (DTs) of the Earth with unprecedented accuracy and resolution. DestinE will initially provide DTs for adapting to climate change, forecasting extreme events and interactive use of high-resolution climate data. Insights from these models support scientists and policymakers in studying and planning for future weather- and climate-induced events.

Stakeholders implementing what-if scenarios and/or ready-to-use applications on DestinE require optimal storage of, and seamless access to, a large volume of heterogeneous data, often available from different data origins. EUMETSAT has implemented the DestinE Data Lake (DEDL) to address these challenges. The DEDL offers the Harmonised Data Access (HDA) service that enables access to diverse data from the DEDL data portfolio via a unified STAC API. Furthermore, it offers, for power users, DEDL edge services on request, which are a dynamic suite of distributed big data processing components that operate close to DestinE’s massive data repositories. The edge services offered are: STACK (DEDL-managed software applications such as JupyterHub, DASK and Open Data Cube), ISLET (project-managed compute and storage services such as configurable virtual machines and S3 object storage) and HOOK (scheduling and running pre-defined or user-defined high-level workflows, such as setting up a data processing pipeline).

To efficiently exploit the wealth of data available on DestinE, DEDL edge services will extend their abilities to accommodate the necessary infrastructure and software to enable Artificial Intelligence/Machine Learning (AI/ML) activities. DEDL will offer an ML Operations (MLOps) service tailored to Earth Observation (EO) data, which allows users to engage in various steps of AI/ML such as data preprocessing, model training and evaluation, experiment tracking, model deployment, model inference and monitoring. The modularized DEDL MLOps architecture will allow the users to use components as required without the need to be bound to pre-defined workflows and pipelines. The users, furthermore, can develop their AI/ML algorithms according to CI/CD best practices and have multiple environments for development, staging and production. 

A specific focus of DEDL will be to define and work with highly flexible data pipelines. The framework will allow DestinE data portfolio datasets to be converted to AI-ready formats, which can readily be used as inputs for various AI/ML models. The framework will have the capability to combine and harmonise data from various sources and formats and will provide typical EO-based pre-processing steps such as data collocation, re-projection, and re-gridding, among other operations.

This presentation will highlight the AI/ML and MLOps capabilities of the DEDL, demonstrating how they empower users to efficiently analyse data and derive valuable insights. By seamlessly integrating with DestinE’s data ecosystem, these advancements enable users to focus on innovation and address critical challenges such as climate adaptation and extreme event forecasting, rather than on managing complex workflows or infrastructure. 

How to cite: Montazeri, S., Stoicescu, M., Hinojo Comellas, O., Puechmaille, D., and Schick, M.: MLOps on DestinE Data Lake – Towards Reproducible AI on Edge Services, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-10817, https://doi.org/10.5194/egusphere-egu25-10817, 2025.

17:15–17:25
|
EGU25-12349
|
On-site presentation
Patryk Grzybowski, Marcin Ziółkowski, Aubin Lambare, Christoph Reimer, and Michael Schick

Destination Earth (DestinE) is a flagship initiative led by the European Commission, implemented by EUMETSAT, ESA and ECMWF. It aims to create highly detailed Digital Twins (DTs) of the Earth, enabling precise simulations for a variety of uses. Currently, the initiative focuses on two primary Digital Twins: the Weather Extremes Digital Twin (ExtremeDT) and the Climate Change Adaptation Digital Twin (ClimateDT). Over the coming years, the scope of DTs is set to expand, necessitating improved access to data and streamlined methods for working with it. This is where the Destination Earth Data Lake (DEDL) plays a pivotal role, offering comprehensive data discovery, access, and processing services tailored to the needs of DestinE users.

The DEDL operates on two key levels: ‘Data Discovery and Access’ and ‘Edge Services’. DEDL Discovery and Data Access services are provided by the Harmonized Data Access (HDA) tool, which offers a single, federated entry point to the services and data, including resources from existing datasets and complementary sources such as in-situ and socio-economic data. Notably, it also provides access to the unique datasets generated by DestinE’s DTs. The services rely on the SpatioTemporal Asset Catalogs (STAC) standard, which means:

  • The search in the dataset is done according to the STAC protocol;
  • The Federated Catalog search proxy component converts STAC queries into queries adapted to the underlying catalog and returns the results to the user in STAC format.
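As a rough illustration of what a STAC-style query looks like, the sketch below builds the JSON body of a generic STAC API item search (`POST /search`) using the `collections`, `bbox`, `datetime` and `limit` fields defined by the STAC API specification. It makes no network call. The collection identifier, bounding box and date range are placeholders chosen for illustration, not actual HDA values, and the helper name `build_stac_search` is hypothetical.

```python
import json


def build_stac_search(collection, bbox, start, end, limit=10):
    """Build the JSON body of a generic STAC API item-search request.

    bbox is [west, south, east, north] in degrees. For simplicity this
    sketch rejects boxes that cross the antimeridian.
    """
    west, south, east, north = bbox
    if not (west <= east and south <= north):
        raise ValueError("bbox must be [west, south, east, north]")
    return {
        "collections": [collection],
        "bbox": [west, south, east, north],
        # STAC expresses a closed time interval as "start/end" (RFC 3339).
        "datetime": f"{start}/{end}",
        "limit": limit,
    }


# Placeholder values: the collection ID and the area are assumptions.
body = build_stac_search(
    "EXAMPLE.COLLECTION.ID",
    [10.0, 45.0, 12.0, 47.0],
    "2024-06-01T00:00:00Z",
    "2024-06-30T23:59:59Z",
)
print(json.dumps(body, indent=2))
```

A client would send this body to the search endpoint of whichever STAC API it targets; a federated proxy of the kind described above would then translate it for the underlying catalogue.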

The cloud computing service is powered by the ISLET infrastructure, a distributed Infrastructure as a Service (IaaS) built on OpenStack. It allows users to manage virtual machines and S3 storage, and to run advanced computations via a graphical user interface or command-line interface. A standout feature of ISLET is its proximity to data sources, operating near High-Performance Computing (HPC) facilities. This is achieved through data bridges, enabling efficient processing of large datasets, including those from Digital Twins, in conjunction with HPC systems.

The STACK environment supports application development using JupyterHub and DASK, with the Python and R languages. Users can create DASK clusters on selected infrastructure (sites) to process data directly where it resides, removing the need for extensive local setup and optimization.

Hook Services is a set of pre-defined workflows that users can run as ready-to-use processors, such as Sentinel-2: MAJA Atmospheric Correction and Sentinel-1: Terrain-corrected backscatter. It also provides workflow functions to generate on-demand higher-level products, such as temporal composites.

DEDL is a transformative initiative that revolutionizes how Earth Observation data is managed and utilized. By integrating innovative infrastructure (ISLET), data services (HDA), reliable processors (Hook Services), and user-friendly development tools (STACK), DEDL enables unprecedented levels of data harmonization, federation, and processing. Moreover, the DEDL plays a crucial role in empowering DestinE users by providing them with seamless access to vast datasets and advanced computational tools. It simplifies the process of data exploration, integration, and analysis, enabling researchers, policymakers, and developers to focus on innovation and decision-making rather than technical barriers. This cutting-edge system enhances climate research capabilities and supports sustainable development efforts on a scale previously unattainable.

How to cite: Grzybowski, P., Ziółkowski, M., Lambare, A., Reimer, C., and Schick, M.: How Destination Earth Data Lake support Destination Earth users, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-12349, https://doi.org/10.5194/egusphere-egu25-12349, 2025.

17:25–17:35
|
EGU25-19238
|
On-site presentation
Charlotte Delmas, Aubin Lambaré, and Arnaud Le Carvennec

The Danube Delta, the second-largest delta in Europe and a critical economic waterway, represents a dynamic yet fragile ecosystem requiring effective preservation strategies. Monitoring water reservoirs is crucial for both ecological sustainability and socio-economic management. The increasing availability of diverse datasets from multiple sources offers new opportunities to enhance real-time observation and forecasting efforts. 

Implemented by EUMETSAT, the Destination Earth Data Lake (DEDL) provides seamless access to these datasets and integrates high-performance computing for complex scientific modeling. Its edge services provide efficient, scalable data processing, empowering researchers to analyze environmental phenomena with speed and precision. 

Leveraging DEDL services makes it possible to consolidate key hydrological datasets that offer important features for monitoring the ecosystem’s state of health:

  • Daily live in situ data: Real-time measurements of water level, temperature, and discharge from DanubeHIS ground stations along the river and its delta are retrieved via the DEDL Harmonized Data Access (HDA).
  • Outputs from existing scientific algorithms: The integration and evolution of the Surfwater algorithm within the DEDL environment allows leveraging Earth Observation data (Landsat 8/9) to detect water bodies in the area. This makes it possible to generate time series of surface areas, volumes, and fill rates of water bodies within the region. 
  • Hourly radar data: Rainfall rates are computed using OPERA radar observations on the European Weather Cloud instances. 
  • Precipitation forecasts: Predictive data from ECMWF (Destination Earth Digital Twin Outputs), accessed via HDA, are leveraged to provide valuable forecasting insights. 

The outputs of these algorithms and analyses are provided live through a dashboard. By enabling cross-referencing of diverse data streams, the dashboard allows stakeholders to obtain a complete view of the Danube Delta’s environmental conditions, supporting informed decision-making for ecosystem preservation.

Leveraging advanced geoscience tools, this integrated approach highlights the transformative power of modern data platforms in tackling global environmental challenges. 

How to cite: Delmas, C., Lambaré, A., and Le Carvennec, A.: Destination Earth Data Lake user story on Danube Delta water reservoir , EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-19238, https://doi.org/10.5194/egusphere-egu25-19238, 2025.

17:35–17:45
|
EGU25-20187
|
On-site presentation
Joana Mendes, Edward Pope, Zorica Jones, Andrew Cottrell, Michael Eastman, Joshua Wiggs, Hannah Findley, Angela Heard, Paul Hallett, Emilie Vanvyve, Remy Vandaele, Hywell Williams, Milto Miltiadou, Finley Gibson, Kirstine Dale, Anna Angus-Smyth, Simon Gardner, and Sam Tailby

Digital twins are an exciting and rapidly developing research area, with the potential to provide a step change in the way we understand our evolving environment and its impact on sensitive systems.

The TWInning Capability for the Natural Environment (TWINE) programme is being co-delivered by the Met Office and the Natural Environment Research Council (NERC) to explore the potential of this technology for transforming environmental science across priority areas including climate change adaptation and mitigation, biodiversity and ecosystems, and natural hazards and their mitigation.

Through the TWINE programme, NERC and the Met Office have funded six digital twin pilot projects across a range of applications, including harmful algal blooms, flooding and coastal overtopping, optimising data collected by ocean gliders and aircraft, and multi-objective land-use decisions.

We will introduce the TWINE programme, giving a brief overview of the projects which are advancing our understanding of how we can harness the potential of digital twinning technology. These include cross-sector challenges such as: risk to natural resources, Net Zero targets, and addressing the science-to-policy lag.

How to cite: Mendes, J., Pope, E., Jones, Z., Cottrell, A., Eastman, M., Wiggs, J., Findley, H., Heard, A., Hallett, P., Vanvyve, E., Vandaele, R., Williams, H., Miltiadou, M., Gibson, F., Dale, K., Angus-Smyth, A., Gardner, S., and Tailby, S.: TWINE: TWInning capability for the Natural Environment, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20187, https://doi.org/10.5194/egusphere-egu25-20187, 2025.

17:45–17:55
|
EGU25-2622
|
On-site presentation
 Integrated Digital Exploratory Analysis System (IDEAS) – An Open-Source Software Framework for Digital Twins
(withdrawn)
Thomas Huang

Posters on site: Mon, 28 Apr, 10:45–12:30 | Hall X4

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Mon, 28 Apr, 08:30–12:30
X4.44
|
EGU25-13817
Minchao Wu, Torbern Tagesson, Zhanzhang Cai, and Zheng Duan

Tropical deciduous ecosystems play a critical role in terrestrial ecological processes and the global carbon cycle, influencing seasonal climates through phenology-induced biophysical and biogeochemical feedbacks. Phenological processes for tropical deciduous ecosystems are complex with multiple intertwining climatic and physiological factors that co-shape the underlying dynamics. Here, we present a deep learning framework based on Temporal Fusion Transformer for predicting tropical deciduous phenology globally in real-time with high accuracy. The framework integrates long-term AVHRR-derived vegetation greenness data, high-resolution climate data from ERA-Land, and land surface features including physical and chemical properties to account for terrestrial spatial heterogeneities that affect phenological processes. Our preliminary results demonstrate the ability of the developed framework to accurately predict historical phenological dynamics across 35 growing seasons in the pan-tropical regions. Key phenological metrics, including the start, peak, and end of the growing season, are identified with high accuracy. We believe the framework provides a powerful tool for real-time predictions and reconstructions of phenological states for tropical deciduous ecosystems, especially in regions where human activities like deforestation and agriculture heavily influence the estimates of tropical carbon cycle potential. With insight into the potential phenological states, this framework may help inform sustainable land management practices in pan-tropical regions.

How to cite: Wu, M., Tagesson, T., Cai, Z., and Duan, Z.: Real-time Prediction of Global Tropical Deciduous Ecosystem Phenology with Deep Learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-13817, https://doi.org/10.5194/egusphere-egu25-13817, 2025.

X4.45
|
EGU25-19021
|
ECS
|
Erik Pavel, Michael Langguth, Martin G. Schultz, Christian Lessig, Stefanie Hollborn, Jan Keller, Roland Potthast, Britta Seegebrecht, Sabrina Wahl, Juergen Gall, Anas Allaham, Mohamad Hakam Shams Eddin, and Ilaria Luise

Data-driven weather prediction models based on deep learning have been on the rise for several years and have outperformed traditional physics-based numerical models in various benchmark forecasting scores. However, a significant challenge remains: accurately predicting extreme events on a local scale, such as thunderstorms and wind gusts. Previous models struggle in this area, as they were primarily developed for medium-range forecasting and operate at relatively coarse spatio-temporal resolutions. Yet the capability of weather models to predict extreme events at a local level is essential for preventing severe consequences for communities and ecosystems, along with the financial and material losses they entail. Recently, task-agnostic foundation models, trained on extensive and diverse datasets using self-supervised methods, have demonstrated remarkable skill and robustness, especially in their ability to generalize to rare extreme events.

The RAINA project aims to develop a foundation model for the atmosphere, with an emphasis on delivering reliable, high-resolution forecasts of extreme wind and precipitation events. In partnership with the EU Horizon-funded WeatherGenerator project, which aims to create advanced digital twins for Destination Earth, RAINA will extend the pioneering AtmoRep model (Lessig et al., 2023) by employing a multi-modal learning approach.
The foundation model seeks to develop a comprehensive, statistically robust, and multi-scale understanding of atmospheric dynamics by incorporating a wide range of meteorological datasets from both models and observations. Innovative deep learning methods, including diffusion models and test-time adaptation, will be investigated to facilitate short-range forecasts of temperature, wind, and precipitation at kilometer-scale resolution over Germany.

In a first demonstrator, short-range forecasts are generated using the AtmoRep model and subsequently refined with the CorrDiff downscaling approach (Mardani et al., 2024) that combines a generative diffusion model with a residual approach. This two-step strategy delivers high-resolution forecasts with a maximum lead time of six hours while disentangling uncertainties inherent in the forecasting and downscaling processes, a separation that can enhance training quality when properly applied. By using ERA5 and COSMO REA2 reanalysis data, the approach enhances the precision of high-resolution forecasts over Germany. 
Initial results from the first demonstrator will be presented in a poster, along with the overall timeline and key milestones of the RAINA project.

How to cite: Pavel, E., Langguth, M., Schultz, M. G., Lessig, C., Hollborn, S., Keller, J., Potthast, R., Seegebrecht, B., Wahl, S., Gall, J., Allaham, A., Shams Eddin, M. H., and Luise, I.: RAINA - High-resolution nowcasting of precipitation and wind extremes with a foundation model for the atmosphere, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-19021, https://doi.org/10.5194/egusphere-egu25-19021, 2025.

X4.46
|
EGU25-16981
|
ECS
|
David Landry, Anastase Charantonis, and Claire Monteleoni

Weather forecast downscaling, the problem of recovering accurate local predictions given a lower resolution forecast, is commonly used in operational NWP pipelines. Its purpose is to recover some of the sub-grid processes that could not be represented by the underlying numerical model due to its limited resolution. This misrepresentation causes statistical mismatches between the observation data gathered from stations and the nearest grid point in the numerical simulation.

Using a downscaling model typically requires making a compromise between spatial consistency and statistical calibration. Traditionally, these models are trained to optimize a standard verification metric. Consequently, they suffer from the double penalty issue and fail to correctly model spatial correlation structures by becoming overly smooth. This is detrimental to downstream modeling tasks such as power grid management, which require a good assessment of spatially-correlated phenomena.

Recently, the finer details of the atmospheric state have successfully been recovered using generative models such as denoising diffusion [2-4]. We propose a similar strategy for in situ downscaling by introducing a flow matching [1] model for that task. A cross-attention transformer [5] backbone allows us to build an internal representation for the gridded numerical forecast as well as the in situ downscaled forecast. 

Our model avoids the numerical instability and mode collapse issues related to Generative Adversarial Networks. It produces well-calibrated forecasts that better represent the spatial correlations between the stations when compared to non-generative alternatives. Our model makes no assumptions about the underlying forecast, and thus can be thought of in two ways. It can be considered a hybrid NWP/AI model, where we first run a numerical simulation and then downscale it. It can also be considered a supplementary forecasting product in a full machine learning pipeline.

Using our flow matching weather forecast downscaling model, we run experiments on the EUPPBench post-processing dataset to predict surface temperature and wind speed. Particular care is given to evaluating the model, where we assess both the marginal performance (via the CRPS, reliability histogram, and spread-error ratio) and the joint performance (via the Energy Score, local Variogram Score and forecast spatial frequency content). The accurate representation of extreme events is evaluated using Brier scores. Further experiments discuss the pitfalls of fitting the Energy Score directly without a generative model.
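To make two of the scores named above concrete, the following sketch computes the standard ensemble estimators of the CRPS (a marginal, per-variable score) and the Energy Score (its multivariate generalisation over, for example, a vector of stations). It is an illustrative implementation of the textbook formulas, not the authors' evaluation code, and the sample numbers are made up.

```python
import math


def crps_ensemble(members, obs):
    """Ensemble CRPS estimate: E|X - y| - 0.5 * E|X - X'| for scalar forecasts."""
    m = len(members)
    accuracy = sum(abs(x - obs) for x in members) / m
    spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return accuracy - 0.5 * spread


def energy_score(members, obs):
    """Ensemble Energy Score: same form as the CRPS, with Euclidean norms.

    members is a list of vectors (one per ensemble member), obs is a vector
    of the same length, e.g. one value per station.
    """
    m = len(members)
    accuracy = sum(math.dist(x, obs) for x in members) / m
    spread = sum(math.dist(a, b) for a in members for b in members) / (m * m)
    return accuracy - 0.5 * spread


# Made-up three-member ensemble for one station and for two stations.
print(crps_ensemble([1.0, 2.0, 3.0], 2.0))
print(energy_score([[1.0, 0.5], [2.0, 1.5], [3.0, 2.5]], [2.0, 1.0]))
```

For one-dimensional vectors the Energy Score reduces to the CRPS. It also depends on the joint behaviour of the members across stations rather than on each station separately, which is why the abstract groups it with the joint metrics.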

 

[1] Lipman, Y. et al. (2023) ‘Flow Matching for Generative Modeling’. arXiv. Available at: https://doi.org/10.48550/arXiv.2210.02747.

[2] Couairon, G. et al. (2024) ‘ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting’. arXiv. Available at: https://doi.org/10.48550/arXiv.2412.12971.

[3] Price, I. et al. (2023) ‘GenCast: Diffusion-based ensemble forecasting for medium-range weather’. arXiv. Available at: https://doi.org/10.48550/arXiv.2312.15796.

[4] Lang, S. and Chantry, M. (2024) ‘Enter the ensembles’, AIFS Blog, 21 June. Available at: https://www.ecmwf.int/en/about/media-centre/aifs-blog/2024/enter-ensembles (Accessed: 15 January 2025).

[5] Vaswani, A. et al. (2017) ‘Attention is All you Need’, in Advances in Neural Information Processing Systems. Curran Associates, Inc. Available at: https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html (Accessed: 17 May 2022).

How to cite: Landry, D., Charantonis, A., and Monteleoni, C.: Flow matching for in situ, spatially consistent weather forecast downscaling, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-16981, https://doi.org/10.5194/egusphere-egu25-16981, 2025.

X4.47
|
EGU25-6965
|
ECS
Patrick Ebel, Linus Magnusson, and Rochelle Schneider

Total precipitation is a key variable of the weather state, accumulated over a given period. Beyond their direct relevance, high-quality precipitation data are of importance for driving downstream applications in hydrology, e.g. river streamflow and runoff forecasting. However, common measurements of precipitation are either precise but sparse (as for in-situ recordings) or global but uncertain (as for spaceborne observations). Though reanalysis products such as ECMWF’s ERA5 provide a best estimate of the state of the atmosphere, the quality of their total precipitation reconstruction is imperfect. Following reports that ERA5 is prone to overestimating the occurrence of drizzle at the cost of underestimating extreme precipitation, prior work explored data-driven models for local post-processing to address the latter. However, the local models employed in preceding work do not easily extend to a global post-processing setup and an exclusive emphasis on outliers limits the ability to represent the full distribution of precipitation intensity, which limits their relevance.

 

In this work, we propose a novel approach for precipitation post-processing which models the entire globe in a single forward pass and models dryness, light rain and heavy rain alike. The post-processor is based on a graph neural network architecture, trained on decades of gauge-calibrated multi-source weighted estimates of precipitation. We demonstrate that our model learns to bias-correct ERA5 total precipitation information and consistently improves upon the baseline while maintaining its global applicability. Further experiments will detail the nature of its improvements and may explore its benefits for downstream applications.

How to cite: Ebel, P., Magnusson, L., and Schneider, R.: Global post-processing of ERA5 precipitation product via graph-based neural networks , EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6965, https://doi.org/10.5194/egusphere-egu25-6965, 2025.

X4.48
|
EGU25-1299
Thibault Xavier, Dawa Derksen, Vincent Martin, and Pierre-Marie Brunet

The digital twin is a useful tool for scientists and decision makers to understand the present state of a system (what now), explore future trajectories (what next), or investigate the future impacts of current risk mitigation actions (what if). Working at the local scale allows detailed physics to be implemented in a way that better captures the complexity of the study site (city, watershed, etc.) and complements the global scale. The availability of very high-precision spatial products (optical, 3D, thermal, etc.) enables this high-precision local analysis anywhere on the Earth. This growing interest is leading a number of actors to build digital twins at the local scale. However, building this type of representation requires a dedicated effort from the user, usually a scientist, which prevents them from focusing on the scientific added value they could bring with their thematic expertise.
The Digital Twin Factory (DTF, 2024-2026) project, coordinated by the French National Centre of Space Studies (CNES), aims to provide users with a framework capable of building, deploying and operating a digital twin at the scale of the site. It is designed as a Digital Twin as a Service API (PaaS) to abstract the underlying infrastructure, with the possibility of accessing both HPC resources and the usual Cloud providers. The DTF also provides users with methodological building blocks to access (catalogue harvester), manipulate (ingester, data processing pipeline), visualize and analyze (plot, dashboarding) the data. In this way, the instantiators of the digital twin can focus on their thematic expertise and deploy their physical solvers with access to multi-source data.
While high performance computing resources can be made available to run these physical models, parametric studies or climate trajectories may entail high cost and long simulation times. A partially or fully data-based surrogate model is an approach that can overcome this barrier and provide results in a reactive manner. Part of the DTF's work is therefore aimed at providing users with methodological building blocks for surrogate modelling, based on the expertise of the scientific community.
This contribution presents the multi-layered architecture of the DTF project, its different components and the services offered to users. We illustrate this work with the construction of the digital twin of Nokoué Lake in Benin, which integrates flood forecasting, pollution control, salinity management, long-term risk evolution, risk governance, and adaptation measures. Satellite data are used as input for a hydrodynamic code, on which first developments of surrogate models are presented.

How to cite: Xavier, T., Derksen, D., Martin, V., and Brunet, P.-M.: Building a framework for the design and deployment of digital twins: the Digital Twin Factory project, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-1299, https://doi.org/10.5194/egusphere-egu25-1299, 2025.

X4.50
|
EGU25-12042
Implementing FAIR Agrobiodiversity Workflows for the Destination Earth Data Lake
(withdrawn)
Claus Weiland, Daniel Bauer, Desalegn Chala, Dag Endresen, Jonas Grieb, Marcella Orwick Rydmark, and Gabriela Zuquim
X4.51
|
EGU25-15162
|
ECS
Younghun Kim and Giha Lee

Accurate rainfall prediction is essential not only for water resource management but also for forecasting and mitigating the impacts of climate change-driven weather events such as floods and droughts. Due to the high spatiotemporal variability of complex meteorological phenomena like rainfall, effective prediction necessitates high-quality data collection, model application, and uncertainty analysis. Unlike existing studies that focus primarily on developing deep learning models to improve rainfall prediction accuracy, this study evaluates the uncertainty of rainfall predictions using pre-existing deep learning models, U-Net and ConvLSTM, with artificially generated elliptical rainfall data. Artificial rainfall data were designed with four temporal patterns: constant, gradually increasing, gradually decreasing, and time-varying. These patterns were applied in horizontal, vertical, and diagonal movements to evaluate the models' ability to handle spatiotemporal complexity. The results indicate that both deep learning models exhibited spatial smoothing issues in rainfall predictions over time. However, the U-Net model demonstrated superior spatiotemporal performance compared to ConvLSTM. While this study focuses solely on deep learning models for rainfall prediction, future research will consider factors such as data complexity and loss functions to conduct a comprehensive evaluation of prediction uncertainty. This work is expected to contribute to the development of methodologies for rainfall modeling using deep learning approaches.
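One way such artificial data could be generated is sketched below: a Gaussian-shaped elliptical rain cell is moved across a grid while its peak intensity follows one of the four temporal patterns. Only the pattern names and movement directions come from the abstract. The grid size, cell shape, speeds, intensity values and all function names are assumptions made for illustration and do not reproduce the study's actual data design.

```python
import math


def elliptical_rain(nx, ny, cx, cy, a, b, peak):
    """Return an ny-by-nx grid holding one elliptical, Gaussian-shaped rain cell."""
    return [
        [peak * math.exp(-(((x - cx) / a) ** 2 + ((y - cy) / b) ** 2))
         for x in range(nx)]
        for y in range(ny)
    ]


def peak_intensity(pattern, t, n_steps, base=10.0):
    """Peak rain rate (assumed mm/h) at step t for a named temporal pattern."""
    frac = t / max(n_steps - 1, 1)
    if pattern == "constant":
        return base
    if pattern == "increasing":
        return base * (0.5 + 0.5 * frac)
    if pattern == "decreasing":
        return base * (1.0 - 0.5 * frac)
    if pattern == "time-varying":
        return base * (0.75 + 0.25 * math.sin(2.0 * math.pi * frac))
    raise ValueError(f"unknown pattern: {pattern}")


def make_sequence(pattern, direction, n_steps=8, nx=32, ny=32):
    """Build a list of frames with the cell moving two grid points per step."""
    dx, dy = {"horizontal": (1, 0), "vertical": (0, 1), "diagonal": (1, 1)}[direction]
    frames = []
    for t in range(n_steps):
        cx, cy = 8 + 2 * dx * t, 8 + 2 * dy * t
        peak = peak_intensity(pattern, t, n_steps)
        frames.append(elliptical_rain(nx, ny, cx, cy, a=5.0, b=3.0, peak=peak))
    return frames


frames = make_sequence("increasing", "diagonal")
print(len(frames), len(frames[0]), len(frames[0][0]))
```

Because the true field is known exactly at every step, any smoothing or displacement in a model's prediction can be measured directly against it.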

 

Funding: This research was supported by the Disaster-Safety Platform Technology Development Program of the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (No. 2022M3D7A1090338).

How to cite: Kim, Y. and Lee, G.: Uncertainty Evaluation of Deep Learning Models Using an Artificial Rainfall, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-15162, https://doi.org/10.5194/egusphere-egu25-15162, 2025.

X4.52
|
EGU25-8155
|
ECS
|
Kexin Zhu and Weixin Xu

Developing region-specific radar quantitative precipitation estimation (QPE) products for South China (SC) is crucial because of the region's unique climate and complex terrain. Deep learning (DL), and graph neural networks (GNNs) in particular, has emerged as a promising avenue for radar QPE. Many studies have tested DL models for radar QPE, but virtually none have evaluated their performance across different precipitation intensities, types, or organizations. Moreover, limited attention has been given to whether DL-based methods can mitigate radar QPE errors caused by orographic influences in complex terrain, such as that in SC.

This study investigates the advantages of DL methods for QPE tasks in South China, using nearly three years of hourly gauge data as labels and ground-based radar reflectivity as inputs. First, multi-layer perceptrons (MLPs), convolutional neural networks (CNNs), and GNNs with similar architectures are constructed and compared with traditional Z-R relationships that account for precipitation type. The DL methods outperform the traditional Z-R relationships, and the GNN performs best. More importantly, this study conducts a systematic evaluation of the proposed GNN. For extreme precipitation (>30 mm/h), the GNN achieves the smallest MAE, highlighting its potential for estimating hazardous events. It also demonstrates stable performance for stratiform and organized precipitation, with minimal bias and standard deviation. However, the GNN is less effective for isolated precipitation, for which CNNs are a better choice because they estimate scattered rainfall accurately. Finally, the Z-R relationship shows systematic spatial biases, overestimating precipitation in coastal plains and underestimating it in inland high-altitude regions. DL methods alleviate these terrain-induced biases by incorporating spatial information. Overall, this study highlights the advantages of DL methods across different precipitation scenarios and demonstrates their ability to mitigate systematic biases from complex terrain.
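For readers unfamiliar with the baseline, a Z-R relationship is a power law Z = a R^b linking radar reflectivity factor Z (mm^6 m^-3) to rain rate R (mm/h). The sketch below inverts it and computes the MAE above an intensity threshold. The coefficients a = 200 and b = 1.6 are the classic Marshall-Palmer values, used here only as an assumed default; the abstract does not state which type-dependent coefficients the authors used, and the function names are hypothetical.

```python
import math

def rain_rate_from_dbz(dbz, a=200.0, b=1.6):
    """Invert Z = a * R**b, where dbz = 10*log10(Z); returns R in mm/h."""
    z_linear = 10.0 ** (dbz / 10.0)
    return (z_linear / a) ** (1.0 / b)

def mae(pred, obs):
    """Mean absolute error between paired estimates and gauge observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def mae_above(pred, obs, threshold):
    """MAE restricted to samples whose gauge value exceeds threshold.

    Returns None when no observation exceeds the threshold.
    """
    pairs = [(p, o) for p, o in zip(pred, obs) if o > threshold]
    if not pairs:
        return None
    return sum(abs(p - o) for p, o in pairs) / len(pairs)
```

Calling `mae_above(pred, obs, 30.0)` mirrors the extreme-precipitation evaluation (>30 mm/h) mentioned above, and the same function can be reused for other intensity classes.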

How to cite: Zhu, K. and Xu, W.: Deep Learning for Radar Quantitative Precipitation Estimation over Complex Terrain in Southern China, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-8155, https://doi.org/10.5194/egusphere-egu25-8155, 2025.

X4.53
|
EGU25-14658
HongKe Cai, YaQin Mao, XuanHao Zhu, YunFei Fu, and RenJun Zhou

Based on the TRMM dataset, this paper compares the applicability of the improved MCE (minimum circumscribed ellipse), MBR (minimum bounding rectangle), and DIA (direct indexing area) methods for rain cell fitting. These three methods can reflect the geometric characteristics of clouds and apply geometric parameters to estimate the real dimensions of rain cells. The MCE method shows a major advantage in identifying the circumference of rain cells. In most samples, the circumference identified by MCE is smaller than that identified by DIA and MBR, and closer to that of the observed rain cells. The area of rain cells identified by MBR is relatively robust. For rain cells composed of many pixels (N > 20), the overall performance of MBR is better than that of MCE, but MBR contributes fewer of the best identification results, those with the shortest circumference and the smallest area, than MCE does. The DIA method is best suited to small rain cells with a circumference of less than 100 km and an area of less than 120 km², but its overall performance is mediocre. The MCE method tends to achieve the highest success at any angle, whereas DIA and MBR yield fewer “best identification” results and more of the worst ones in the along-track and cross-track directions. Through this comprehensive comparison, we conclude that MCE obtains the best fitting results, with the shortest circumference and the smallest area, owing to its high filling effect for rain cells of all sizes.
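To make the rectangle-fitting idea concrete, the sketch below approximates a minimum-area bounding rectangle for a set of rain-pixel coordinates by sweeping candidate orientations, then reports area, circumference, and a filling ratio. This is a generic brute-force approximation written for illustration, not the authors' improved MBR method; the function names and the 1-degree angle step are assumptions.

```python
import math

def bounding_rectangle(points, angle_step_deg=1.0):
    """Approximate the minimum-area bounding rectangle by an orientation sweep.

    points: iterable of (x, y) pairs, e.g. pixel centres in km.
    Returns (area, circumference, angle_deg) of the best rectangle found.
    """
    points = list(points)
    best = None
    steps = int(round(90.0 / angle_step_deg))
    for k in range(steps):
        angle = k * angle_step_deg
        c, s = math.cos(math.radians(angle)), math.sin(math.radians(angle))
        # Project the points onto axes rotated by `angle`.
        xs = [x * c + y * s for x, y in points]
        ys = [-x * s + y * c for x, y in points]
        width = max(xs) - min(xs)
        height = max(ys) - min(ys)
        candidate = (width * height, 2.0 * (width + height), angle)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best

def filling_ratio(n_pixels, pixel_area, fitted_area):
    """Fraction of the fitted shape that is covered by rain pixels."""
    return n_pixels * pixel_area / fitted_area
```

A filling ratio close to 1 means the fitted shape hugs the rain cell tightly, which is the sense in which a shorter circumference and smaller area indicate a better fit.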

How to cite: Cai, H., Mao, Y., Zhu, X., Fu, Y., and Zhou, R.: Comparison of the Minimum Bounding Rectangle and Minimum Circumscribed Ellipse of Rain Cells from TRMM, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-14658, https://doi.org/10.5194/egusphere-egu25-14658, 2025.

X4.54
|
EGU25-14783
|
ECS
Zhenze Liu

We propose a simple yet effective framework for real-time surface ozone forecasting using deep learning. The framework highlights three key modules: independent channel encoders, frequency information extraction, and fine-tuning, all of which consistently enhance model performance. This unified model is designed to autonomously capture different spatial and temporal patterns of ozone concentrations, with an average RMSE of 8 ppb for day 1 forecasting. The performance of day 4 forecasting is slightly lower. We find that chemistry becomes less important than meteorology over time, indicating their different roles in short-term and long-term forecasting. Most high ozone episodes can be simulated, though capturing extremely high ozone values remains a challenge. The model is trained and tested on observations from China.

How to cite: Liu, Z.: A Unified Model of Forecasting Ozone by Deep Learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-14783, https://doi.org/10.5194/egusphere-egu25-14783, 2025.

Posters virtual: Tue, 29 Apr, 14:00–15:45 | vPoster spot 4

The posters scheduled for virtual presentation are visible in Gather.Town. Attendees are asked to meet the authors during the scheduled attendance time for live video chats. If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access Gather.Town appears just before the time block starts. Onsite attendees can also visit the virtual poster sessions at the vPoster spots (equal to PICO spots).
Display time: Tue, 29 Apr, 08:30–18:00
Chairpersons: Filippo Accomando, Andrea Vitale

EGU25-20616 | ECS | Posters virtual | VPS19

LUCIE: A Lightweight Uncoupled ClImate Emulator with long-term stability and physical consistency for O(1000)-member ensembles 

Haiwen Guan, Troy Arcomano, Ashesh Chattopadhyay, and Romit Maulik
Tue, 29 Apr, 14:00–15:45 (CEST) | vPoster spot 4 | vP4.13

We present LUCIE, a data-driven atmospheric emulator that remains stable during autoregressive inference for a thousand years with minimal climatological drift. LUCIE was trained on 9.5 years of coarse-resolution ERA5 data, incorporating 5 prognostic variables, 2 forcing variables, and one diagnostic variable (6-hourly total precipitation), all on a single A100 GPU over a two-hour period. LUCIE autoregressively predicts the prognostic variables and outputs the diagnostic variable, similar to AllenAI’s ACE climate emulator. Unlike all the other state-of-the-art AI weather models, LUCIE is neither unstable nor prone to hallucinations that result in unphysical drift of the emulated climate. The low computational requirements of LUCIE allow for rapid experimentation, including the development of novel loss functions to reduce spectral bias and improve the tails of the distributions. Furthermore, LUCIE does not impose true sea-surface temperature (SST) from a coupled numerical model to enforce the annual cycle in temperature. We demonstrate the long-term climatology obtained from LUCIE as well as subseasonal-to-seasonal prediction skill for the prognostic variables. LUCIE is capable of 6000 years of simulation per day on a single GPU, allowing for O(100)-ensemble members for quantifying model uncertainty for climate and ensemble weather prediction.
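Autoregressive inference means each predicted state is fed back as the input to the next step. The sketch below shows that loop with a toy relaxation step standing in for a trained emulator, plus the arithmetic behind the quoted throughput. The `toy_step` function and the 6-hour model time step are assumptions for illustration; the abstract only states that precipitation is 6-hourly.

```python
def rollout(step_fn, state, forcings):
    """Apply step_fn autoregressively: every output becomes the next input."""
    trajectory = [state]
    for forcing in forcings:
        state = step_fn(state, forcing)
        trajectory.append(state)
    return trajectory

def toy_step(state, forcing, relax=0.1):
    """Hypothetical stand-in for an emulator: relax each value towards the forcing."""
    return [x + relax * (forcing - x) for x in state]

def steps_per_second(sim_years_per_day, step_hours=6.0):
    """Model steps per wall-clock second implied by a simulated-years-per-day rate."""
    steps_per_sim_year = 365.0 * 24.0 / step_hours
    return sim_years_per_day * steps_per_sim_year / 86400.0
```

Under the assumed 6-hour step, 6000 simulated years per day corresponds to roughly 100 model steps per second, which gives a sense of how lightweight each step must be.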

How to cite: Guan, H., Arcomano, T., Chattopadhyay, A., and Maulik, R.: LUCIE: A Lightweight Uncoupled ClImate Emulator with long-term stability and physical consistency for O(1000)-member ensembles, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20616, https://doi.org/10.5194/egusphere-egu25-20616, 2025.

EGU25-14821 | ECS | Posters virtual | VPS19

Simulation of Monthly Global Sea Surface Temperature Data using Ensemble GAN Model 

Deepayan Chakraborty and Adway Mitra
Tue, 29 Apr, 14:00–15:45 (CEST) | vP4.14

Synthetic data has become an indispensable tool in climate science, offering extensive spatio-temporal coverage to address data limitations in both current and future scenarios. Such synthetic data, derived from climate simulation models, must exhibit statistical consistency with observational datasets to ensure their utility. Among global climate simulation initiatives, the Coupled Model Intercomparison Project Phase 6 (CMIP6) represents the latest and most comprehensive suite of General Circulation Models (GCMs). However, the substantial High Performance Computing (HPC) resources required for these physics-based models limit their accessibility to a broader research community. In response, generative machine learning models have emerged as a promising alternative for simulating climate data with reduced computational demands.

This study introduces an ensemble model based on the Pix2Pix conditional Generative Adversarial Network (cGAN) to generate high-resolution spatio-temporal maps of monthly global Sea Surface Temperature (SST) with significantly lower computational cost and time. The proposed model comprises two components: the GAN, which produces simulated SST climatology data, and the Predictor, which is trained on the variability of the data and forecasts the SST anomaly for the subsequent month using the output data from the previous month. Both components share the same architecture, but their training processes differ. The Predictor can be fine-tuned with observed data for some epochs to adapt to its domain.

The ensemble model was calibrated with monthly SST observations from the COBE dataset as input and output. Empirical Orthogonal Function (EOF) analysis shows the model’s ability to simulate the variability of the observed data. The model’s performance was evaluated using the temporal Pearson correlation coefficient and mean squared error (MSE). Results demonstrate that the ensemble cGAN model generates maps with statistical characteristics closely matching those of CMIP6 simulations and observations, achieving a mean temporal correlation coefficient around 0.5 and an MSE around 1.13 for both cases.
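The two evaluation metrics named above can be sketched as follows: a Pearson correlation computed along the time axis at each grid cell and then averaged, and a mean squared error. The data layout (one time series per grid cell) and the function names are assumptions for this example; the abstract does not describe the authors' implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series; NaN if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

def mse(x, y):
    """Mean squared error between two equal-length series."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def mean_temporal_correlation(simulated, observed):
    """Average the per-cell temporal correlation, skipping undefined (NaN) cells.

    simulated, observed: lists of time series, one series per grid cell.
    """
    values = [pearson(s, o) for s, o in zip(simulated, observed)]
    values = [v for v in values if not math.isnan(v)]
    return sum(values) / len(values)
```

Skipping constant cells avoids a division by zero; for SST those would typically be masked land points, though how such cells were handled in the study is not stated.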

How to cite: Chakraborty, D. and Mitra, A.: Simulation of Monthly Global Sea Surface Temperature Data using Ensemble GAN Model, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-14821, https://doi.org/10.5194/egusphere-egu25-14821, 2025.

EGU25-7258 | ECS | Posters virtual | VPS19

Leveraging Pretrained Deep Learning Models to Extract Similarities for the Analog Ensemble Method Applied to Convection Satellite Imagery 

Badreddine Alaoui, Chakib Bounoun, and Driss Bari
Tue, 29 Apr, 14:00–15:45 (CEST) | vP4.15

Severe convection, including thunderstorms and related phenomena like flash flooding, hail, and strong winds, can have significant socioeconomic impacts. Nowcasting, which provides real-time, short-term predictions, is vital for issuing timely warnings to mitigate these impacts. Satellite imagery is essential for monitoring convection and offering accurate predictions of storm evolution, thereby enhancing early warning systems. Ensemble forecasting, which generates multiple potential scenarios, helps better quantify uncertainties in nowcasting. However, most ensemble forecasting methods are computationally intensive and typically do not incorporate satellite images directly. The Analog Ensemble (AnEn) method, a lower-cost ensemble approach, identifies similar past weather events based on forecast data. For a given time and location, the AnEn method identifies analogs from past model predictions that are similar to current forecast conditions; the observations associated with those analogs are then used as ensemble members. Despite its advantages, AnEn struggles with locality and is sensitive to the choice of similarity metrics. This study presents an improved AnEn system that replaces forecast archives with satellite images to identify analogs of convective conditions. The system uses pretrained deep learning models (VGG16, Xception, and Inception-ResNet) to assess image similarity. The training dataset consists of daily convection satellite images from EUMETSAT for the period 2020-2023, and the domain covers 40°N to 20°S and -20°W to 4°E. The year 2024 is used for testing, with ERA5 reanalysis of total precipitation as the verification ground truth. A current convection satellite image is encoded and compared with all encoded images from the training period using different metrics. The images most similar to the current one are then selected, and their associated ERA5 total precipitation reanalysis fields are taken as the members of our ensemble. 
Preliminary results indicate an average maximum precipitation anomaly of 15 mm between the analog ensemble mean and the current reanalysis, showing that the proposed system offers promising improvements in short-term forecasting.
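The analog selection step described above can be sketched as a nearest-neighbour search over image embeddings. In this example the embeddings are plain lists of numbers standing in for features from a pretrained network, cosine similarity is assumed as the metric (the abstract mentions "different metrics" without naming them), and all function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def select_analogs(query, archive, k):
    """Return indices of the k archive embeddings most similar to the query."""
    ranked = sorted(
        range(len(archive)),
        key=lambda i: cosine_similarity(query, archive[i]),
        reverse=True,
    )
    return ranked[:k]

def analog_ensemble(query, archive, precip_fields, k=3):
    """Build ensemble members from the precipitation fields of the k best analogs.

    precip_fields[i] is the (flattened) precipitation field paired with archive[i].
    Returns (members, ensemble_mean).
    """
    indices = select_analogs(query, archive, k)
    members = [precip_fields[i] for i in indices]
    ensemble_mean = [sum(values) / len(members) for values in zip(*members)]
    return members, ensemble_mean
```

Swapping `cosine_similarity` for another distance is a one-line change, which is convenient given the sensitivity to similarity metrics noted above.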

Key words: Convection; Ensemble Forecasting; Deep Learning; VGG; Xception; ResNet; Analog Ensemble; Morocco; Nowcasting; EUMETSAT; ERA5; Satellite Images; Remote Sensing

How to cite: Alaoui, B., Bounoun, C., and Bari, D.: Leveraging Pretrained Deep Learning Models to Extract Similarities for the Analog Ensemble Method Applied to Convection Satellite Imagery, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7258, https://doi.org/10.5194/egusphere-egu25-7258, 2025.

EGU25-14839 | ECS | Posters virtual | VPS19

Leveraging MAUNet for Bias Correction of TRMM Precipitation Estimates 

Sumanta Chandra Mishra Sharma and Adway Mitra
Tue, 29 Apr, 14:00–15:45 (CEST) | vP4.16

Deep neural networks have revolutionized various fields due to their remarkable adaptability, enabling them to address related tasks through retraining and transfer learning. These capabilities make them invaluable tools for diverse applications, including climate and hydrological modeling. In an earlier work (Mishra Sharma et al., 2024), we introduced a novel neural network architecture, the Max-Average U-Net (MAUNet), which leverages Max-Average Pooling to downscale gridded precipitation data to higher spatial resolutions. The model demonstrated significant improvements in resolving finer-scale precipitation features, making it well-suited for climate data applications.
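The abstract does not define Max-Average Pooling. Purely as an illustration of how such an operator might combine the two standard pooling types, the sketch below assumes it takes the mean of the maximum and the average within each non-overlapping window. That definition, the window size, and the function name are assumptions and may differ from the operator used in MAUNet.

```python
def max_average_pool(grid, size=2):
    """Pool a 2D grid with non-overlapping size-by-size windows.

    Each output value is the mean of the window maximum and the window
    average (an assumed definition, used here for illustration only).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    pooled = []
    for j in range(0, n_rows - size + 1, size):
        row = []
        for i in range(0, n_cols - size + 1, size):
            window = [grid[j + dj][i + di]
                      for dj in range(size) for di in range(size)]
            window_max = max(window)
            window_avg = sum(window) / len(window)
            row.append(0.5 * (window_max + window_avg))
        pooled.append(row)
    return pooled
```

Compared with pure max pooling, a blend like this retains some information about the background rainfall in a window rather than only its peak value.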

In this study, we utilized the MAUNet architecture to tackle the critical task of bias correction in satellite-based precipitation estimates. Bias correction is essential for improving the reliability of precipitation data derived from satellite missions, which often exhibit systematic discrepancies compared to ground-based measurements. Specifically, we focused on correcting biases in precipitation estimates from the Tropical Rainfall Measuring Mission (TRMM) by calibrating them against high-resolution, ground-based gridded datasets from the India Meteorological Department (IMD).

Our experimental results reveal that MAUNet effectively reduces biases in TRMM precipitation estimates, achieving significantly improved agreement with ground truth data. This success is attributed to the model’s robust feature extraction and reconstruction capabilities, which enable it to learn and correct systematic errors in satellite data. The findings also highlight the potential of advanced neural network architectures in addressing bias correction challenges.

This work underscores the utility of deep learning architectures in precipitation modeling, contributing to broader goals of improving the spatial distribution of precipitation estimates. By bridging the gap between satellite observations and ground truth, the MAUNet model offers a comprehensive solution for enhancing precipitation datasets, with significant implications for climate research, hydrological studies, and policy planning.

How to cite: Mishra Sharma, S. C. and Mitra, A.: Leveraging MAUNet for Bias Correction of TRMM Precipitation Estimates, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-14839, https://doi.org/10.5194/egusphere-egu25-14839, 2025.

EGU25-3702 | ECS | Posters virtual | VPS19

What if Data story telling was the corner stone for environmental digital twins? 

Faten EL outa and Guillaume Dechambenoit
Tue, 29 Apr, 14:00–15:45 (CEST) | vP4.25

Environmental digital twins face significant interdisciplinary challenges in their development and operation, particularly in managing complex environmental data and facilitating effective communication among diverse stakeholders. While these virtual representations of environmental systems offer powerful capabilities for monitoring and decision-making, they often struggle to bridge the communication gap between technical experts, decision-makers, and end-users.

Data storytelling is the practice of narrating messages derived from data to address specific needs and visually communicating these messages to an audience in an ordered manner that is easily understandable. Interestingly, digital twins share a similar objective: both aim to simplify and communicate complex data through intuitive and meaningful narratives.

Building on this shared characteristic, we propose an approach that adapts data storytelling techniques to the creation of digital twins. This abstract focuses on how data storytelling can enhance the creation and communication of digital twin data through visual formats tailored to specific audiences, addressing their unique needs to support monitoring, decision-making, and actionable insights.

This innovative integration of data storytelling and environmental digital twins establishes a comprehensive approach to address three key challenges:

  • Documenting and structuring the development process, from data to communication, to incorporate stakeholder needs and communication requirements from the outset.
  • Facilitating collaboration among interdisciplinary teams through shared narrative frameworks.
  • Ensuring environmental insights are effectively translated into actionable knowledge.

We present a methodology that leverages data storytelling techniques to enhance the accessibility and impact of environmental digital twins, ultimately improving their effectiveness in environmental monitoring, decision-making, and stakeholder engagement.

How to cite: EL outa, F. and Dechambenoit, G.: What if Data story telling was the corner stone for environmental digital twins?, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3702, https://doi.org/10.5194/egusphere-egu25-3702, 2025.