ITS2.7/AS5.2

EDI
Machine Learning for Climate Science

Recent developments in machine learning (ML) are transforming Earth observation data analysis and modelling of the Earth system and its constituent processes. While statistical models have been used for a long time, state-of-the-art machine and deep learning algorithms allow encoding non-linear, spatio-temporal relationships robustly without sacrificing interpretability. These advances have the potential to accelerate climate science by improving our understanding of the underlying processes, reducing and better quantifying uncertainty, and even making predictions directly from observations across different spatio-temporal scales.

This session aims to provide a venue to present the latest progress in the use of ML applied to all aspects of climate science including, but not limited to:
- Causal discovery and inference
- Learning (causal) process and feature representations in observations
- Hybrid models (physically informed ML)
- Novel detection and attribution approaches
- Probabilistic modelling and uncertainty quantification
- Explainable AI applications to climate science

Please consider submitting abstracts focussed on ML for model improvement, particularly for near-term (including seasonal) forecasting to the companion “ML for Earth System modelling” session.

Co-organized by CL5.3/ESSI1/NP4
Convener: Duncan Watson-Parris (ECS) | Co-conveners: Katarzyna (Kasia) Tokarska (ECS), Gustau Camps-Valls, Marlene Kretschmer (ECS), Rochelle Schneider
Presentations | Mon, 23 May, 08:30–11:50 (CEST), 13:20–14:50 (CEST), 15:10–16:40 (CEST) | Room N1

Presentations: Mon, 23 May | Room N1

Chairpersons: Duncan Watson-Parris, Katarzyna (Kasia) Tokarska
08:30–08:35
08:35–08:40 | EGU22-1065 | ECS | Highlight | On-site presentation
Skilful US Soy-yield forecasts at pre-sowing lead-times
Sem Vijverberg, Dim Coumou, and Raed Hamed

Soy harvest failure events can severely impact farmers and insurance companies and raise global prices. Reliable seasonal forecasts of poor harvests would allow stakeholders to prepare and take appropriate early action. However, especially for farmers, the reliability and lead-time of current prediction systems are insufficient to justify within-season adaptation measures. Recent innovations have increased our ability to generate reliable statistical seasonal forecasts. Here, we combine these innovations to predict the 1-3 poor soy harvest years in the eastern US. We first use a clustering algorithm to spatially aggregate crop-producing regions within the eastern US that are particularly sensitive to hot-dry weather conditions. Next, we use observational climate variables (sea surface temperature (SST) and soil moisture) to extract precursor timeseries at multiple lags. This allows the machine learning model to learn the low-frequency evolution, which carries important information for predictability. A selection based on causal inference allows for physically interpretable precursors. We show that the robustly selected predictors are associated with the evolution of the horseshoe Pacific SST pattern, in line with previous research. We use the state of the horseshoe Pacific to identify years with enhanced predictability. We achieve very high forecast skill for poor harvest events, even 3 months prior to sowing, using a strict one-step-ahead train-test splitting. Over the last 25 years, 90% of the predicted events in February were correct. When operational, this forecast would enable farmers (and insurance/trading companies) to make informed decisions on adaptation measures, e.g., selecting more drought-resistant cultivars, investing in insurance, or changing planting management.
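
The strict one-step-ahead train-test splitting mentioned above can be illustrated with a minimal sketch. It assumes yearly samples and an expanding training window in which each test year is forecast only from earlier years; the function name and the `min_train` parameter are hypothetical and are not taken from the authors' code.

```python
def one_step_ahead_splits(years, min_train=10):
    """Yield (train_years, test_year) pairs for an expanding window.

    Each test year is forecast using only years that precede it, so no
    future information leaks into training. `min_train` (an assumed
    parameter) sets how many years are required before the first forecast.
    """
    years = sorted(years)
    for i in range(min_train, len(years)):
        yield years[:i], years[i]


# Illustrative 30-year record; the years are made up for the example.
splits = list(one_step_ahead_splits(range(1990, 2020), min_train=25))
for train, test in splits:
    print(f"train {train[0]}-{train[-1]} -> test {test}")
```

In a scheme like this the model is refitted for every test year, which is more expensive than a single split but mimics how an operational forecast would be issued.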

How to cite: Vijverberg, S., Coumou, D., and Hamed, R.: Skilful US Soy-yield forecasts at pre-sowing lead-times, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1065, https://doi.org/10.5194/egusphere-egu22-1065, 2022.

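08:40–08:45 | EGU22-11111 | ECS | On-site presentation
Machine learning based estimation of regional Net Ecosystem Exchange (NEE) constrained by atmospheric inversions and ecosystem observations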
Samuel Upton, Ana Bastos, Fabian Gans, Basil Kraft, Wouter Peters, Jacob Nelson, Sophia Walther, Martin Jung, and Markus Reichstein

Accurate estimates and predictions of the global carbon fluxes are critical for our understanding of the global carbon cycle and climate change. Reducing the uncertainty of the terrestrial carbon sink and closing the budget imbalance between sources and sinks would improve our ability to accurately project future climate change. Net Ecosystem Exchange (NEE), the net flux of biogenic carbon from the land surface to the atmosphere, is only directly measured at a sparse set of globally distributed eddy-covariance measurement sites. To estimate the terrestrial carbon flux at the regional and global scale, a global gridded estimate of NEE must be accurately upscaled from a model trained at the ecosystem level. In this study, the Fluxcom system* is used to train a site-level model on remotely-sensed and meteorological variables derived from site measurements, MODIS and ECMWF ERA5 atmospheric reanalysis data. The non-representative distribution of these site-level data along with missing disturbance histories impart known biases to current upscaling efforts. Observations of atmospheric carbon may provide important additional information, improving the accuracy of the upscaled flux estimate. 

This study adds an atmospheric observational operator to the model training process, connecting the ecosystem-level flux model to top-down observations of atmospheric carbon through an additional term in the objective function. The target data are regionally integrated fluxes from an ensemble of atmospheric inversions corrected for fossil-fuel emissions and lateral fluxes. Calculating the regionally integrated flux estimate at each training step is computationally infeasible. Our hypothesis is that the regional flux can be modeled with a limited set of points and that this sparse model preserves sufficient information about the phenomena to act as a constraint for the underlying ecosystem-level model, improving regional and global upscaled products. Experimental results show improvements in the machine learning based regional estimates of NEE while preserving features such as the seasonal variability in the estimated flux.

*Jung, Martin, Christopher Schwalm, Mirco Migliavacca, Sophia Walther, Gustau Camps-Valls, Sujan Koirala, Peter Anthoni, et al. 2020. “Scaling Carbon Fluxes from Eddy Covariance Sites to Globe: Synthesis and Evaluation of the FLUXCOM Approach.” Biogeosciences 17 (5): 1343–65.

How to cite: Upton, S., Bastos, A., Gans, F., Kraft, B., Peters, W., Nelson, J., Walther, S., Jung, M., and Reichstein, M.: Machine learning based estimation of regional Net Ecosystem Exchange (NEE) constrained by atmospheric inversions and ecosystem observations, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-11111, https://doi.org/10.5194/egusphere-egu22-11111, 2022.

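08:45–08:50 | EGU22-12822 | ECS | Virtual presentation
Assessing model dependency in CMIP5 and CMIP6 based on their spatial dependency structure with probabilistic network models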
Catharina Elisabeth Graafland and Jose Manuel Gutiérrez Gutiérrez

Probabilistic network models (PNMs) are well-established data-driven modeling and machine learning prediction techniques used in many disciplines, including climate analysis. These techniques can efficiently learn the underlying (spatial) dependency structure and a consistent probabilistic model from data (e.g. gridded reanalysis or GCM outputs for particular variables; near-surface temperature in this work), thus constituting a truly probabilistic backbone of the system underlying the data. The complex structure of the dataset is encoded using both pairwise and conditional dependencies and can be explored and characterized using network and probabilistic metrics. When applied to climate data, Bayesian networks have been shown to faithfully reveal the various long-range teleconnections relevant in the dataset, in particular those emerging in El Niño periods (Graafland, 2020).

In this work we apply probabilistic Gaussian networks to extract and characterize the most essential spatial dependencies of the simulations generated by the different GCMs contributing to CMIP5 and CMIP6 (Eyring, 2016). In particular, we analyze the problem of model interdependency (Boé, 2018), which poses practical problems for the application of these multi-model simulations (it is often not clear what exactly makes one model different from or similar to another model). We show that probabilistic Gaussian networks provide a promising tool to characterize the spatial structure of GCMs using simple metrics which can be used to analyze how and where differences in dependency structures are manifested. The probabilistic distance measure allows us to chart CMIP5 and CMIP6 models on their closeness to reanalysis datasets that rely on observations. The measure also identifies significant atmospheric model changes that CMIP5 GCMs underwent in their transition to CMIP6.

 

References:

Boé, J. Interdependency in Multimodel Climate Projections: Component Replication and Result Similarity. Geophys. Res. Lett. 45, 2771–2779, DOI: 10.1002/2017GL076829 (2018).

Eyring, V. et al. Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev. 9, 1937–1958, DOI: 10.5194/gmd-9-1937-2016 (2016).

Graafland, C.E., Gutiérrez, J.M., López, J.M. et al. The probabilistic backbone of data-driven complex networks: an example in climate. Sci. Rep. 10, 11484, DOI: 10.1038/s41598-020-67970-y (2020).



Acknowledgement

 

The authors would like to acknowledge project ATLAS (PID2019-111481RB-I00) funded by MCIN/AEI (doi:10.13039/501100011033). We also acknowledge support from Universidad de Cantabria and Consejería de Universidades, Igualdad, Cultura y Deporte del Gobierno de Cantabria via the “instrumentación y ciencia de datos para sondear la naturaleza del universo” project for funding this work. L.G. acknowledges support from the Spanish Agencia Estatal de Investigación through the Unidad de Excelencia María de Maeztu with reference MDM-2017-0765.



How to cite: Graafland, C. E. and Gutiérrez, J. M. G.: Assessing model dependency in CMIP5 and CMIP6 based on their spatial dependency structure with probabilistic network models, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-12822, https://doi.org/10.5194/egusphere-egu22-12822, 2022.

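08:50–08:55 | EGU22-1835 | ECS | On-site presentation
Using Deep Learning for a High-Precision Analysis of Atmospheric Rivers in a High-Resolution Large Ensemble Climate Dataset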
Timothy Higgins, Aneesh Subramanian, Andre Graubner, Lukas Kapp-Schwoerer, Karthik Kashinath, Sol Kim, Peter Watson, Will Chapman, and Luca Delle Monache

Atmospheric rivers (ARs) are elongated corridors of water vapor in the lower troposphere that cause extreme precipitation over many coastal regions around the globe. They play a vital role in the water cycle in the western US, fueling most extreme west coast precipitation and sometimes accounting for more than 50% of total annual west coast precipitation (Gershunov et al. 2017). Severe ARs are associated with extreme flooding and damage, while weak ARs are typically more beneficial to society as they bring much-needed drought relief.

Precipitation is particularly difficult to predict in traditional climate models. Predicting water vapor is more reliable (Lavers et al. 2016), which makes integrated vapor transport (IVT) and ARs a favorable basis for understanding changing patterns in precipitation (Johnson et al. 2009). A variety of algorithms are used to track ARs because definitions of ARs are relatively diverse (Shields et al. 2018). The Atmospheric River Tracking Intercomparison Project (ARTMIP) organizes and provides information on all of the widely accepted algorithms. Nearly all of the algorithms included in ARTMIP rely on absolute and relative numerical thresholds, which can often be computationally expensive and have a large memory footprint. This can be particularly problematic for large climate datasets. The vast majority of algorithms also rely heavily on wind velocity at multiple vertical levels to track ARs; this variable is especially difficult to store for climate model output and is typically not output at the temporal resolution at which ARs occur.
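
For reference, IVT is commonly defined as the magnitude of the pressure-integrated horizontal moisture flux, which shows why threshold-based trackers need specific humidity and wind on multiple vertical levels. The integration limits (surface pressure $p_s$ up to an upper level $p_t$, often taken near 300 hPa) are a common convention and an assumption here, not something stated in the abstract:

```latex
\mathrm{IVT} = \frac{1}{g}
\sqrt{\left(\int_{p_t}^{p_s} q\,u\,\mathrm{d}p\right)^{2}
    + \left(\int_{p_t}^{p_s} q\,v\,\mathrm{d}p\right)^{2}}
```

Here $q$ is specific humidity, $u$ and $v$ are the zonal and meridional wind components, and $g$ is the gravitational acceleration.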

A recent alternative is to track ARs with machine learning. A variety of neural networks are commonly applied to identifying objects in cityscapes via semantic segmentation. The first of these to be applied to detecting ARs was DeepLabv3+ (Prabhat et al. 2020). DeepLabv3+ is a state-of-the-art model that demonstrates one of the highest performances of any present-day neural network when tasked with identifying objects in cityscapes (Wu et al. 2019). We employ a light-weight convolutional neural network adapted from CGNet (Kapp-Schwoerer et al. 2020) to efficiently track these severe events without using wind velocity at all vertical levels as a predictor variable. When applied to cityscapes, CGNet's greatest advantage is its performance relative to its memory footprint (Wu et al. 2019). It has two orders of magnitude fewer parameters than DeepLabv3+ and is computationally less expensive. This can be especially useful when identifying ARs in large datasets. Convolutional neural networks have not previously been used to track ARs in a regional domain. This will also be the first study to demonstrate the performance of this neural network on a regional domain by providing an objective analysis of its consistency with eight different ARTMIP algorithms.

How to cite: Higgins, T., Subramanian, A., Graubner, A., Kapp-Schwoerer, L., Kashinath, K., Kim, S., Watson, P., Chapman, W., and Delle Monache, L.: Using Deep Learning for a High-Precision Analysis of Atmospheric Rivers in a High-Resolution Large Ensemble Climate Dataset, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1835, https://doi.org/10.5194/egusphere-egu22-1835, 2022.

08:55–09:00 | EGU22-2012 | ECS | On-site presentation
Gap filling in air temperature series by matrix completion methods
Benoît Loucheur, Pierre-Antoine Absil, and Michel Journée

Quality control of meteorological data is an important part of atmospheric analysis and prediction, as missing or erroneous data can have a negative impact on the accuracy of these environmental products.

In Belgium, the Royal Meteorological Institute (RMI) is the national meteorological service that provides weather and climate services based on observations and scientific research. RMI has collected and archived meteorological observations in Belgium since the 19th century. Currently, air temperature is monitored in Belgium at about 30 synoptic automatic weather stations (AWS) as well as at 110 manual climatological stations. At the latter stations, a volunteer observer records the daily extreme air temperatures every morning at 8 o'clock. All observations are routinely checked for errors, inconsistencies and missing values by the RMI staff. Misleading data are corrected and gaps are filled with estimates. These quality control tasks require a lot of human intervention. With the forthcoming deployment of low-cost weather stations and the subsequent increase in the volume of data to verify, the process of data quality control and completion should become as automated as possible.

In this work, the quality control process is fully automated by using mathematical tools. We present low-rank matrix completion (LRMC) methods that we used to fill in missing data in daily minimum and maximum temperature series. We used a machine learning technique called Monte Carlo cross-validation to train our algorithms and then test them in a real case.
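
To make the idea concrete, the sketch below fills gaps by iterating a rank-truncated SVD and scores it with Monte Carlo cross-validation (repeated random hold-outs of observed entries). This is one simple LRMC variant chosen for illustration, not the authors' algorithms (which include graph-regularised methods); the function names, the rank, the hold-out fraction and the synthetic data are all assumptions.

```python
import numpy as np


def complete_low_rank(X, mask, rank=2, n_iter=200):
    """Fill missing entries by iterating a rank-truncated SVD.

    X    : 2-D array (days x stations); values where mask is False are ignored
    mask : boolean array, True where an observation is available
    """
    filled = np.where(mask, X, X[mask].mean())  # start from the observed mean
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(mask, X, low_rank)  # keep observations, update gaps
    return filled


def monte_carlo_cv_rmse(X, mask, rank, n_splits=5, holdout=0.1, seed=0):
    """Average RMSE over repeated random hold-outs of observed entries."""
    rng = np.random.default_rng(seed)
    obs = np.argwhere(mask)
    scores = []
    for _ in range(n_splits):
        n_held = max(1, int(holdout * len(obs)))
        held = obs[rng.choice(len(obs), size=n_held, replace=False)]
        train_mask = mask.copy()
        train_mask[held[:, 0], held[:, 1]] = False  # hide the held-out entries
        est = complete_low_rank(X, train_mask, rank)
        err = est[held[:, 0], held[:, 1]] - X[held[:, 0], held[:, 1]]
        scores.append(np.sqrt(np.mean(err ** 2)))
    return float(np.mean(scores))


# Synthetic rank-2 "temperature" matrix: 60 days x 12 stations, ~20% missing.
rng = np.random.default_rng(1)
truth = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 12))
mask = rng.random(truth.shape) > 0.2
cv_rmse = monte_carlo_cv_rmse(truth, mask, rank=2)
print(f"Monte Carlo CV RMSE: {cv_rmse:.3f}")
```

In practice the rank would itself be chosen by comparing cross-validated errors across candidate values.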

Among the matrix completion methods, some are regularised by graphs. In our case, it is then possible to represent the spatial and temporal components via graphs. By manipulating the construction of these graphs, we hope to improve the completion results. We were then able to compare our methods with state-of-the-art approaches, such as the inverse distance weighting (IDW) method.

All our experiments were performed with a dataset provided by the RMI, including daily minimum and maximum temperature measurements from 100 stations over the period 2005-2019.

How to cite: Loucheur, B., Absil, P.-A., and Journée, M.: Gap filling in air temperature series by matrix completion methods, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2012, https://doi.org/10.5194/egusphere-egu22-2012, 2022.

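09:00–09:05 | EGU22-3482 | ECS | Virtual presentation
Public perception assessment on climate change and natural disaster influence using social media big-data: A case study of USA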
SungKu Heo, Pouya Ifaei, Mohammad Moosazadeh, and ChangKyoo Yoo

Climate change is a global crisis that influences humanity and the development of society. The threats of climate change have become increasingly recognized by the public and by governments across the globe in their environmental, social, and economic dimensions, because the consequences of climate change appear not only as rising global temperature but also as intensified natural hazards such as floods, droughts, wildfires, and hurricanes. For global sustainable development, it is crucial to respond to and mitigate climate change through government policy and decision-making; however, public engagement in actions addressing this critical environmental crisis still needs to be strongly promoted. Analyzing the relationship between public awareness of climate change and natural disasters is an essential aspect of climate change mitigation and policymaking. In this study, based on the abundance of text messages on social media, especially Twitter, public understanding and discussion of climate change were identified and analyzed by treating people as sensors that receive information about climate change from their surrounding environment. Twitter content analysis and field data impact analysis were conducted; text mining algorithms were applied to the Twitter big data to measure similarity, based on a cosine similarity score (CSS), between the climate change corpus and the natural events corpora. Then, the factors of natural disaster influence were predicted using a multiple linear regression model and the climate change tweets dataset. This research shows that the public is more inclined to link natural events with climate change in their tweets when serious natural disasters happen.
The developed regression model indicated that natural events caused by climate change influenced people's social media activity, as reflected in Twitter messages showing awareness of climate change. The results indicate that people who experience natural events, including intense disasters, link these events to climate change more readily than people who rarely experience natural events.
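
As a rough illustration of the cosine similarity score between two corpora, the sketch below compares raw term-count vectors. The whitespace tokenisation, the function name and the toy corpora are assumptions for the example; the study's actual text-mining pipeline is not described in the abstract.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw term-count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


# Toy corpora, invented for illustration only.
climate = "climate change warming emissions drought wildfire"
wildfire = "wildfire smoke evacuation drought heat"
sports = "football match goal stadium"

print(cosine_similarity(climate, wildfire))  # shares "drought" and "wildfire"
print(cosine_similarity(climate, sports))    # shares no terms
```

A score near 1 means the two corpora use the same vocabulary in similar proportions, while 0 means no shared terms.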

Acknowledgment

This research was supported by the project (NRF-2021R1A2C2007838) through the National Research Foundation of Korea (NRF) and the Korea Ministry of Environment (MOE) as Graduate school specialized in Climate Change.

How to cite: Heo, S., Ifaei, P., Moosazadeh, M., and Yoo, C.: Public perception assessment on climate change and natural disaster influence using social media big-data: A case study of USA, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-3482, https://doi.org/10.5194/egusphere-egu22-3482, 2022.

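09:05–09:10 | EGU22-6543 | ECS | Virtual presentation
High-resolution hybrid spatiotemporal modeling of daily relative humidity across Germany for epidemiological research: a Random Forest approach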
Nikolaos Nikolaou, Laurens Bouwer, Mahyar Valizadeh, Marco Dallavalle, Kathrin Wolf, Massimo Stafoggia, Annette Peters, and Alexandra Schneider

Introduction: Relative humidity (RH) is a meteorological variable of great importance as it affects other climatic variables and plays a role in plant and animal life as well as in human comfort and well-being. However, the commonly used weather station observations are insufficient to represent the great spatiotemporal RH variability, leading to exposure misclassification and difficulties in assessing local RH health effects. There is also a lack of high-resolution RH spatial datasets and no readily available methods for modeling humidity across space and time. To tackle these issues, we aimed to improve the spatiotemporal coverage of RH data in Germany, using remote sensing and machine learning (ML) modeling.

Methods: In this study, we estimated Germany-wide daily mean RH at 1 km² resolution over the period 2000-2020. We used several predictors from multiple sources, including DWD RH observations, air temperature (Ta) predictions as well as satellite-derived DEM, NDVI and the True Color band composition (bands 1, 4 and 3: red, green and blue). Our main predictor for estimating the daily mean RH was the daily mean Ta. We had already mapped daily mean Ta at 1 km² across Germany through a regression-based hybrid approach of two linear mixed models using land surface temperature. Additionally, a very important predictor was the date, capturing the day-to-day variation of the relationship between the response and explanatory variables. All these variables were included in a Random Forest (RF) model, applied for each year separately. We assessed the model’s accuracy via 10-fold cross-validation (CV), first internally, using station observations that were not used for the model training, and then externally in the Augsburg metropolitan area using the REKLIM monitoring network over the period 2015-2019.
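
The 10-fold cross-validation step can be sketched as follows. To stay dependency-free, a one-predictor least-squares line stands in for the Random Forest, so this is not the authors' model; the synthetic RH-versus-Ta relationship, the noise level and all function names are assumptions made purely for illustration.

```python
import math
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle sample indices and split them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (stand-in for the RF model)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def cross_validate(x, y, k=10):
    """Return (CV-R2, CV-RMSE) from pooled out-of-fold predictions."""
    pred = [0.0] * len(x)
    for fold in k_fold_indices(len(x), k):
        held = set(fold)
        train = [i for i in range(len(x)) if i not in held]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            pred[i] = a + b * x[i]  # predict only samples unseen in training
    my = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / len(y))


# Synthetic data: RH (%) decreasing with air temperature (deg C) plus noise.
rng = random.Random(42)
ta = [rng.uniform(-5, 30) for _ in range(200)]
rh = [90 - 0.8 * t + rng.gauss(0, 5) for t in ta]
cv_r2, cv_rmse = cross_validate(ta, rh)
print(f"CV-R2 = {cv_r2:.2f}, CV-RMSE = {cv_rmse:.2f} %")
```

For station data, folds would normally be formed by station rather than by individual sample, so that no station contributes to both training and validation.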

Results: Regarding the internal validation, the 21-year overall mean CV-R² was 0.76 and the CV-RMSE was 6.084%. For the model’s external performance, on the same day, we found CV-R²=0.75 and CV-RMSE=7.051%, and for the 7-day average, CV-R²=0.81 and CV-RMSE=5.420%. Germany is characterized by high relative humidity values, having a 20-year average RH of 78.4%. Although the annual country-wide averages were quite stable, ranging from 81.2% for 2001 to 75.3% for 2020, the spatial variability exceeded 15% annually on average. Generally, winter was the most humid period, and December in particular was the most humid month. Extended urban cores (e.g., from Stuttgart to Frankfurt) or individual cities such as Munich were less humid than the surrounding rural areas. There are also specific spatial patterns for RH distribution, including mountains, rivers and coastlines. For instance, the Alps and the North Sea coast are areas with elevated RH.

Conclusion: Our results indicate that the applied hybrid RF model is suitable for estimating nationwide RH at high spatiotemporal resolution, achieving a strong performance with low errors. Our method contributes to an improved spatial estimation of RH and the output product will help us understand better the spatiotemporal patterns of RH in Germany. We also plan to apply other ML techniques and compare the findings. Finally, our dataset will be used for epidemiological analyses, but could also be used for other research questions.

How to cite: Nikolaou, N., Bouwer, L., Valizadeh, M., Dallavalle, M., Wolf, K., Stafoggia, M., Peters, A., and Schneider, A.: High-resolution hybrid spatiotemporal modeling of daily relative humidity across Germany for epidemiological research: a Random Forest approach, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-6543, https://doi.org/10.5194/egusphere-egu22-6543, 2022.

09:10–09:15 | EGU22-7011 | On-site presentation
Unlocking the potential of ML for Earth and Environment researchers
Tobias Weigel, Frauke Albrecht, Caroline Arnold, Danu Caus, Harsh Grover, and Andrey Vlasenko

This presentation reports on support provided under the aegis of Helmholtz AI for a wide range of machine learning based solutions for research questions related to Earth and Environmental sciences. We will give insight into typical problem statements from Earth observation and Earth system modeling that are good candidates for experimentation with ML methods and report on our accumulated experience tackling such challenges with individual support projects. We address these projects in an agile, iterative manner, and during the definition phase we direct special attention towards assembling practically meaningful demonstrators within a couple of months. A recent focus of our work lies on tackling software engineering concerns for building ML-ESM hybrids.

Our implementation workflow covers stages from data exploration to model tuning. A project may often start with evaluating available data and deciding on basic feasibility, apparent limitations such as biases or a lack of labels, and splitting into training and test data. Setting up a data processing workflow to subselect and compile training data is often the next step, followed by setting up a model architecture. We have had good experience with automatic tooling to tune hyperparameters and test and optimize network architectures. In typical implementation projects, these stages may repeat many times to improve results and cover aspects such as errors due to confusing samples, incorporating domain model knowledge, testing alternative architectures and ML approaches, and dealing with memory limitations and performance optimization.

Over the past two years, we have supported Helmholtz-based researchers from many subdisciplines in making the best use of ML methods along these steps. Example projects include wind speed regression on GNSS-R data, emulation of atmospheric chemistry modeling, Earth System model parameterizations with ML, marine litter detection, and rogue wave prediction. The poster presentation will highlight selected best practices across these projects. We are happy to share our experience as it may prove useful to applications in wider Earth System modeling. If you are interested in discussing your challenge with us, please feel free to chat with us.

How to cite: Weigel, T., Albrecht, F., Arnold, C., Caus, D., Grover, H., and Vlasenko, A.: Unlocking the potential of ML for Earth and Environment researchers, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-7011, https://doi.org/10.5194/egusphere-egu22-7011, 2022.

09:15–09:20 | EGU22-7034 | ECS | On-site presentation
Developing a new emergent constraint through network analysis
Lucile Ricard, Athanasios Nenes, Jakob Runge, and Fabrizio Falasca

Climate sensitivity expresses how average global temperature responds to an increase in greenhouse gas concentration. It is a key metric to assess climate change and to formulate policy decisions, but its estimation from Earth System Models (ESMs) yields a wide range: between 2.5 and 4.0 K based on the sixth assessment report (AR6) of the Intergovernmental Panel on Climate Change (IPCC). To narrow down this spread, a number of observable metrics, called “emergent constraints”, have been proposed, but they are often based on relatively few parameters from a simulation, thought to express the “essence” of the climate simulation and its relationship with climate sensitivity. Many of the constraints to date, however, are model-dependent and therefore questionable in terms of their robustness.

We postulate that methods based on “holistic” consideration of the simulations and observations may provide more robust constraints; we also focus on Sea Surface Temperature (SST) ensembles as SST is a major driver of climate variability. To extract the essential patterns of SST variability, we use a knowledge discovery and network inference method, δ-Maps (Fountalis et al., 2016; Falasca et al., 2019), expanded to include a causal discovery algorithm (PCMCI) that relies on conditional independence testing, to capture the essential dynamics of the climate simulation on a functional graph and explore the true causal effects of the underlying dynamical system (Runge et al., 2019). The resulting networks are then quantitatively compared using network “metrics” that capture different aspects, including the regions of uniform behavior, how they alternate over time and the strength of association. These metrics are then compared between simulations and observations and used as emergent constraints, an approach called Causal Model Evaluation (CME).

We apply δ-Maps and CME to CMIP6 model SST outputs and demonstrate how the networks and related metrics can be used to assess the historical performance of CMIP models and climate sensitivity. We start by comparing the CMIP6 simulations against CMIP5 models, using the reanalysis dataset HadISST (Met Office Hadley Centre) as a proxy for observations. Each field is reduced to a network, and we then assess how similar each network is to the one derived from reanalysis SST. The CMIP6 historical networks are then compared against CMIP6 projected networks, built from the Shared Socio-Economic Pathway ssp245 (“Middle of the road”) scenario. Comparing past and future SST networks helps us to evaluate the extent to which climate warming is reflected in changes of the dynamical system underlying our networks. A large distance between the network built over the past period and the network built over a future scenario could be tightly related to a large temperature response to an increase in greenhouse gas emissions, which is how we define climate sensitivity. We finally give a new estimate of climate sensitivity with a weighting scheme approach, derived from a combination of the performance metrics.

How to cite: Ricard, L., Nenes, A., Runge, J., and Falasca, F.: Developing a new emergent constraint through network analysis, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-7034, https://doi.org/10.5194/egusphere-egu22-7034, 2022.

09:20–09:25 | EGU22-8499 | ECS | Highlight | On-site presentation
Matryoshka Neural Operators: Learning Fast PDE Solvers for Multiscale Physics
Björn Lütjens, Catherine H. Crawford, Campbell Watson, Chris Hill, and Dava Newman

Running a high-resolution global climate model can take multiple days on the world's largest supercomputers. Due to the long runtimes that are caused by solving the underlying partial differential equations (PDEs), climate researchers struggle to generate ensemble runs that are necessary for uncertainty quantification or exploring climate policy decisions.

 

Physics-informed neural networks (PINNs) promise a solution: they can solve single instances of PDEs up to three orders of magnitude faster than traditional finite difference numerical solvers. However, most approaches in physics-informed machine learning learn the solution of PDEs over the full spatio-temporal domain, which requires infeasible amounts of training data, does not exploit knowledge of the underlying large-scale physics, and reduces model trust. Our philosophy is to limit learning to the hard-to-model parts. Hence, we propose a novel method called matryoshka neural operator that leverages an old scheme called super-parametrizations developed in geophysical fluid dynamics. Using this scheme, our proposed physics-informed architecture exploits knowledge of approximate large-scale dynamics and only learns the influence of small-scale dynamics onto large-scale dynamics, also called subgrid parametrizations.

 

Some work in geophysical fluid dynamics is conceptually similar, but fully relies on neural networks, which can only operate on fixed grids (Gentine et al., 2018). We are the first to learn grid-independent subgrid parametrizations by leveraging neural operators that learn the dynamics in a grid-independent latent space. Neural operators can be seen as an extension of neural networks to infinite dimensions: they encode infinite-dimensional inputs into finite-dimensional representations, such as Eigen or Fourier modes, and learn the nonlinear temporal dynamics in the encoded state.

 

We demonstrate the neural operators for learning non-local subgrid parametrizations over the full large-scale domain of the two-scale Lorenz96 equation. We show that the proposed learning-based PDE solver is grid-independent, has quasilinear instead of quadratic complexity in comparison to a fully-resolving numerical solver, is more accurate than current neural network or polynomial-based parametrizations, and offers interpretability through Fourier modes.
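
For readers unfamiliar with the test bed, the sketch below writes out the tendencies of the two-scale Lorenz96 system in its usual textbook form and isolates the coupling term that a subgrid parametrization has to approximate. It is not the authors' solver or neural operator; the parameter defaults (F, h, b, c) and the grid sizes are commonly used values assumed here for illustration.

```python
import numpy as np


def lorenz96_two_scale(X, Y, F=10.0, h=1.0, b=10.0, c=10.0):
    """Tendencies of the two-scale Lorenz96 system.

    X : shape (K,)    large-scale variables on a periodic ring
    Y : shape (K, J)  small-scale variables, J per large-scale variable
    Returns (dX/dt, dY/dt, subgrid), where `subgrid` is the coupling term
    that a learned parametrization would have to estimate from X alone.
    """
    K, J = Y.shape
    subgrid = -(h * c / b) * Y.sum(axis=1)
    # dX_k/dt = -X_{k-1} (X_{k-2} - X_{k+1}) - X_k + F + subgrid_k
    dX = -np.roll(X, 1) * (np.roll(X, 2) - np.roll(X, -1)) - X + F + subgrid

    y = Y.reshape(-1)  # small scales form one periodic ring of length K*J
    # dY_j/dt = -c b Y_{j+1} (Y_{j+2} - Y_{j-1}) - c Y_j + (h c / b) X_k
    dy = (-c * b * np.roll(y, -1) * (np.roll(y, -2) - np.roll(y, 1))
          - c * y + (h * c / b) * np.repeat(X, J))
    return dX, dy.reshape(K, J), subgrid


# Check at a simple state: uniform X equal to F and no small-scale activity.
K, J = 8, 32
X0 = np.full(K, 10.0)
Y0 = np.zeros((K, J))
dX, dY, subgrid = lorenz96_two_scale(X0, Y0)
print(dX[:3], dY[0, :3])
```

A fully resolving solver integrates both X and Y, whereas a parametrized model integrates X only and replaces `subgrid` with a learned function of the large-scale state.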

 

Gentine, P., Pritchard, M., Rasp, S., Reinaudi, G., and Yacalis, G. (2018). Could machine learning break the convection parameterization deadlock? Geophysical Research Letters, 45, 5742– 5751. https://doi.org/10.1029/2018GL078202

How to cite: Lütjens, B., Crawford, C. H., Watson, C., Hill, C., and Newman, D.: Matryoshka Neural Operators: Learning Fast PDE Solvers for Multiscale Physics, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8499, https://doi.org/10.5194/egusphere-egu22-8499, 2022.

09:25–09:30
|
EGU22-8867
|
Presentation form not yet defined
Artificial intelligence reconstructs global temperature from scarce local data
(withdrawn)
Martin Wegmann and Fernando Jaume Santero
09:30–09:35
|
EGU22-9461
|
Presentation form not yet defined
Balasubramanya Nadiga

Reduced-order dynamical models play a central role in developing our understanding of the predictability of climate. In this context, the Linear Inverse Modeling (LIM) approach, which is closely related to Dynamic Mode Decomposition (DMD), has proven valuable in giving insights into the dynamical behavior of the full system by helping capture a few essential interactions between its dynamical components. While nonlinear extensions of the LIM approach have been attempted, none have gained widespread acceptance. We demonstrate that Reservoir Computing (RC), a form of machine learning suited for learning in the context of chaotic dynamics, by exploiting the phenomenon of generalized synchronization, provides an alternative nonlinear approach that comprehensively outperforms the LIM approach. Additionally, we highlight the potential of the RC approach to capture the structure of the climatological attractor and to continue the evolution of the system on the attractor in a realistic fashion long after the ensemble average has stopped tracking the reference trajectory. Finally, other dynamical-systems-based methods and probabilistic deep learning methods are considered, and a broader perspective on the use of data-driven methods in understanding climate predictability is offered.

How to cite: Nadiga, B.: Data Driven Approaches for Climate Predictability, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-9461, https://doi.org/10.5194/egusphere-egu22-9461, 2022.

09:35–09:40
|
EGU22-4534
|
ECS
|
Presentation form not yet defined
Joana Roussillon, Jean Littaye, Ronan Fablet, Lucas Drumetz, Thomas Gorgues, and Elodie Martinez

Phytoplankton plays a key role in the carbon cycle and fuels marine food webs. Its seasonal and interannual variations are relatively well known at the global scale thanks to satellite ocean color observations that have been continuously acquired since 1997. However, the time series of satellite-derived chlorophyll-a concentration (Chl-a, a proxy of phytoplankton biomass) are still too short to investigate the low-frequency variability of phytoplankton biomass. Machine learning models such as support vector regression (SVR) or multi-layer perceptrons (MLP) have recently proven to be an alternative to mechanistic approaches for reconstructing past Chl-a signals (including periods before the satellite era) from physical predictors, but they remain unsatisfactory. In particular, the relationships between phytoplankton and its physical surrounding environment are not homogeneous in space, and training such models over the entire globe does not allow them to capture these regional specificities. Moreover, although the global ocean is commonly partitioned into biogeochemical provinces within which phytoplankton growth is supposed to be governed by similar processes, their time-evolving nature makes it difficult to impose a priori spatial constraints that restrict the learning phase to specific areas. Here, we propose to overcome this limitation by introducing spatial multi-modalities into a convolutional neural network (CNN). The latter can learn, with no particular supervision, several spatially weighted modes of variability. Each of them is associated with a CNN submodel trained in parallel, standing for a mode-specific response of phytoplankton biomass to the physical forcing. Beyond improving reconstruction performance, we will show that the learned spatial modes appear physically consistent and may help to gain new insights into the physical-biogeochemical processes controlling phytoplankton distribution at the global scale.

How to cite: Roussillon, J., Littaye, J., Fablet, R., Drumetz, L., Gorgues, T., and Martinez, E.: Spatial multi-modality as a way to improve both performance and interpretability of deep learning models to reconstruct phytoplankton time-series in the global ocean, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-4534, https://doi.org/10.5194/egusphere-egu22-4534, 2022.

09:40–09:50
|
EGU22-722
|
ECS
|
solicited
|
Highlight
|
Presentation form not yet defined
Tom Beucler, Fernando Iglesias-Suarez, Veronika Eyring, Michael Pritchard, Jakob Runge, and Pierre Gentine

Data-driven algorithms, in particular neural networks, can emulate the effects of unresolved processes in coarse-resolution Earth system models (ESMs) if trained on high-resolution simulation or observational data. However, they can (1) make large generalization errors when evaluated in conditions they were not trained on; and (2) trigger instabilities when coupled back to ESMs.

First, we propose to physically rescale the inputs and outputs of neural networks to help them generalize to unseen climates. Applied to the offline parameterization of subgrid-scale thermodynamics (convection and radiation) in three distinct climate models, we show that rescaled or "climate-invariant" neural networks make accurate predictions in test climates that are 8 K warmer than their training climates. Second, we propose to eliminate spurious causal relations between inputs and outputs by using a recently developed causal discovery framework (PCMCI). For each output, we run PCMCI on the input time series to identify the reduced set of inputs that have the strongest causal relationship with the output. Preliminary results show that we can reach similar levels of accuracy by training one neural network per output with the reduced set of inputs; stability implications when coupled back to the ESM are explored.

Overall, our results suggest that explicitly incorporating physical knowledge into data-driven models of Earth system processes may improve their ability to generalize across climate regimes, while quantifying causal associations to select the optimal set of inputs may improve their consistency and stability.

How to cite: Beucler, T., Iglesias-Suarez, F., Eyring, V., Pritchard, M., Runge, J., and Gentine, P.: Climate-Invariant, Causally Consistent Neural Networks as Robust Emulators of Subgrid Processes across Climates, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-722, https://doi.org/10.5194/egusphere-egu22-722, 2022.

09:50–10:00
Coffee break
Chairpersons: Katarzyna (Kasia) Tokarska, Duncan Watson-Parris
10:20–10:25
10:25–10:30
|
EGU22-12720
|
Highlight
|
Virtual presentation
Hannah Christensen and Antoine Delaunay

The Madden–Julian Oscillation (MJO) is the dominant source of sub-seasonal variability in the tropics. It consists of an eastward-moving region of enhanced convection coupled to changes in zonal winds. It is not possible to predict the precise evolution of the MJO, so subseasonal forecasts are generally probabilistic. Ideally, the spread of the forecast probability distribution would vary from day to day depending on the instantaneous predictability of the MJO. Operational subseasonal forecasting models do not have this property. We present a deep convolutional neural network that produces skilful state-dependent probabilistic MJO forecasts. This statistical model accounts for intrinsic chaotic uncertainty by predicting the standard deviation about the mean, and for model uncertainty by using a Monte-Carlo dropout approach. Interpretation of the mean forecasts from the neural network highlights known MJO mechanisms, providing confidence in the model, while interpretation of the predicted uncertainty indicates new physical mechanisms governing MJO predictability.
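The two uncertainty sources described above can be illustrated with a minimal sketch. It assumes a Gaussian negative log-likelihood as the training loss and a law-of-total-variance combination of Monte-Carlo dropout passes; neither choice is stated in the abstract, and the function names are hypothetical.

```python
import math


def gaussian_nll(y_true, mean, std):
    """Negative log-likelihood of y_true under a normal N(mean, std**2)."""
    var = std ** 2
    return 0.5 * math.log(2.0 * math.pi * var) + 0.5 * (y_true - mean) ** 2 / var


def combine_mc_dropout(means, stds):
    """Merge T stochastic forward passes into one predictive mean and std.

    means, stds: predicted mean and standard deviation from each pass.
    Total variance = average predicted variance (intrinsic uncertainty)
                   + variance of the predicted means (model uncertainty).
    """
    n_passes = len(means)
    mean = sum(means) / n_passes
    intrinsic = sum(s ** 2 for s in stds) / n_passes
    model = sum((m - mean) ** 2 for m in means) / n_passes
    return mean, math.sqrt(intrinsic + model)
```

Minimising `gaussian_nll` rewards a network for widening `std` on days when its mean forecast is likely to be wrong, which is what makes the predicted spread state-dependent.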

How to cite: Christensen, H. and Delaunay, A.: Interpretable Deep Learning for Probabilistic MJO Prediction, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-12720, https://doi.org/10.5194/egusphere-egu22-12720, 2022.

10:30–10:35
|
EGU22-10888
|
ECS
|
On-site presentation
Na-Yeon Shin, Yoo-Geun Ham, Jeong-Hwan Kim, Minsu Cho, and Jong-Seong Kug

Many deep learning technologies have been applied to the Earth sciences, including weather forecasting, climate prediction, parameterization, resolution improvement, etc. Nonetheless, the difficulty in interpreting deep learning results still prevents their application to studies on climate dynamics. Here, we applied a convolutional neural network to understand El Niño–Southern Oscillation (ENSO) dynamics from long-term climate model simulations. The deep learning algorithm successfully predicted ENSO events with a high correlation skill of 0.82 for a 9-month lead. For interpreting deep learning results beyond the prediction skill, we first developed a “contribution map,” which estimates how much each grid point and variable contributes to a final output variable. Furthermore, we introduced a “sensitivity,” which estimates how strongly the output variable changes in response to a small perturbation of the input variables, measured as the difference in the output variable. The contribution map clearly shows the most important precursors for El Niño and La Niña developments. In addition, the sensitivity clearly reveals nonlinear relations between the precursors and the ENSO index, which helps us understand the respective role of each precursor. Our results suggest that the contribution map and sensitivity would be beneficial for understanding other climate phenomena.
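A perturbation-based sensitivity of the kind described above can be sketched generically as a central finite difference. This is an illustrative stand-in, not the authors' definition; `model`, `index` and `delta` are hypothetical names, and `model` is assumed to map a list of inputs to a single number.

```python
def sensitivity(model, inputs, index, delta=1e-3):
    """Change in model output per unit change in inputs[index].

    Uses a central difference: the input is nudged up and down by `delta`
    and the two outputs are compared.
    """
    up = list(inputs)
    down = list(inputs)
    up[index] += delta
    down[index] -= delta
    return (model(up) - model(down)) / (2.0 * delta)
```

Evaluating this at different base states shows nonlinearity directly: for a linear model the value is the same everywhere, whereas for a nonlinear one it depends on the state at which the perturbation is applied.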

How to cite: Shin, N.-Y., Ham, Y.-G., Kim, J.-H., Cho, M., and Kug, J.-S.: How to utilize deep learning to understand climate dynamics? : An ENSO example., EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10888, https://doi.org/10.5194/egusphere-egu22-10888, 2022.

10:35–10:40
|
EGU22-12858
|
ECS
|
On-site presentation
Mohit Anand, Gustau Camps-Valls, and Jakob Zscheischler

Forests form one of the major components of the carbon cycle and take up large amounts of carbon dioxide from the atmosphere, thereby slowing down the rate of climate change. Carbon uptake by forests is a highly complex process strongly controlled by meteorological forcing, for two main reasons. First, forests have a large storage capacity acting as a buffer to short-duration changes in meteorological drivers. The response can thus be very complex and extend over a long time. Second, the responses are often triggered by combinations of multiple compounding drivers including precipitation, temperature and solar radiation. Effects may compound between variables and across time. Therefore, a large amount of data is required to identify the complex drivers of adverse forest response to climate forcing. Recent advances in machine learning offer a suite of promising tools to analyse large amounts of data and address the challenge of identifying complex drivers of impacts. Here we analyse the potential of machine learning to identify the compounding drivers of reduced carbon uptake/forest mortality. To this end, we generate 200,000 years of gross and net carbon uptake from the physically-based forest model FORMIND simulating a beech forest in Germany. The climate data is generated through a weather generator (AWEGEN-1D) from bias-corrected ERA5 reanalysis data. Machine learning models such as random forests, support vector machines and deep neural networks are trained to estimate gross primary production. Deep learning models involving convolutional layers are found to perform better than the classical machine learning models. Initial results show that at least three years of weather data are required to predict annual carbon uptake with high accuracy, highlighting the complex lagged effects that characterize forests. We assess the performance of the different models and discuss their interpretability regarding the identification of impact drivers.



How to cite: Anand, M., Camps-Valls, G., and Zscheischler, J.: Identifying drivers of extreme reductions in carbon uptake of forests with interpretable machine learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-12858, https://doi.org/10.5194/egusphere-egu22-12858, 2022.

10:40–10:45
|
EGU22-2248
|
ECS
|
On-site presentation
Shijie Jiang, Yi Zheng, and Jakob Zscheischler

Understanding the mechanisms causing river flooding and their trends is important to interpret past flood changes and make better predictions of future flood conditions. However, there is still a lack of quantitative assessment of trends in flooding mechanisms based on observations. Recent years have witnessed the increasing prevalence of machine learning in hydrological modeling, and its predictive power has been demonstrated in numerous studies. Machine learning makes hydrological predictions by recognizing generalizable relationships between inputs and outputs, which, if properly interpreted, may provide further scientific insights into hydrological processes. In this study, we propose a new method using interpretable machine learning to identify flooding mechanisms based on the predictive relationship between precipitation and temperature on the one hand and flow peaks on the other. Applying this method to more than a thousand catchments in Europe reveals three primary input-output patterns within flow predictions, which can be associated with three catchment-wide flooding mechanisms: extreme precipitation, soil moisture excess, and snowmelt. The results indicate that approximately one-third of the studied catchments are controlled by a combination of the above mechanisms, while the others are mostly dominated by a single mechanism. Although no significant shifts from one dominant mechanism to another are observed for the catchments over the past seven decades overall, some catchments with single mechanisms have become dominated by mixed mechanisms and vice versa. In particular, snowmelt-induced floods have decreased significantly in general, whereas rainfall has become more dominant in causing floods, and their effects on flooding seasonality and magnitude are crucial. Overall, this study provides a new perspective for understanding climatic extremes and demonstrates the prospect of artificial intelligence (AI)-assisted scientific discovery in the future.

How to cite: Jiang, S., Zheng, Y., and Zscheischler, J.: Exploring flooding mechanisms and their trends in Europe through explainable AI, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2248, https://doi.org/10.5194/egusphere-egu22-2248, 2022.

10:45–10:50
|
EGU22-2391
|
ECS
|
On-site presentation
Kai Jeggle, David Neubauer, Gustau Camps-Valls, Hanin Binder, Michael Sprenger, and Ulrike Lohmann

Cirrus cloud microphysics and their interactions with aerosols remain one of the largest uncertainties in global climate models and climate change projections. The uncertainty originates from their high spatio-temporal variability and their non-linear dependence on meteorological drivers like temperature, updraft velocities, and aerosol environment. We combine ten years of CALIPSO/CloudSat satellite observations of cirrus clouds with ERA5 and MERRA-2 reanalysis data of meteorological and aerosol variables to create a spatial data cube. Lagrangian back trajectories are calculated for each cirrus cloud observation to add a temporal dimension to the data cube. We then train a gradient boosted tree machine learning (ML) model to predict vertically resolved cirrus cloud microphysical properties (i.e. observed ice crystal number concentration and ice water content). The explainable machine learning method of SHAP values is applied to assess the impact of individual cirrus drivers as well as combinations of drivers on cirrus cloud microphysical properties in varying meteorological conditions. In addition, we analyze how the impact of the drivers differs regionally, vertically, and temporally.

We find that the tree-based ML model is able to create a good mapping between cirrus drivers and microphysical properties (R² ~0.75), and the SHAP value analysis provides detailed insights into how different drivers impact the prediction of the microphysical cirrus cloud properties. These findings can be used to improve global climate model parameterizations of cirrus cloud formation in future work. Our approach is a good example of exploring unsolved scientific questions using explainable machine learning and feeding back insights to the domain science.

How to cite: Jeggle, K., Neubauer, D., Camps-Valls, G., Binder, H., Sprenger, M., and Lohmann, U.: Exploring cirrus cloud microphysical properties using explainable machine learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2391, https://doi.org/10.5194/egusphere-egu22-2391, 2022.

10:50–10:55
|
EGU22-4431
|
ECS
|
Virtual presentation
Deepayan Chakraborty, Adway Mitra, Bhupendranath Goswami, and Pv Rajesh

Indian Summer Monsoon Rainfall (ISMR) is a complex phenomenon that depends on several climatic phenomena in different parts of the world through teleconnections. Each season is characterized by extended wet and dry spells (which may cause floods or droughts) that contribute to intra-seasonal variability. Tropical and extra-tropical drivers jointly influence the intra-seasonal variability. Although the El Niño–Southern Oscillation (ENSO) is known to be a driver of ISMR, researchers have also found relations with the Indian Ocean Dipole (IOD), the North Atlantic Oscillation (NAO), and the Atlantic Multi-decadal Oscillation (AMO). In this work, we use ideas from causality theory and explainable machine learning to quantify the influence of different climatic phenomena on the intra-seasonal variation of ISMR.

To identify such causal relations, we applied two statistically sound causal inference approaches: the PCMCI+ algorithm (conditional-independence based) and the Granger causality test (regression based). For the Granger causality test, we examined linear and non-linear regression separately. In the case of PCMCI+, conditional independence tests were used between pairs of variables at different "lag periods". It is worth pointing out that, until now, “causality” has not been properly quantified in the climate science community, and only linear correlations are used as a basis to identify relationships like ENSO-ISMR and AMO-ISMR. We performed experiments on mean monthly rainfall anomaly data (during the monsoon months of June-September over India) along with six probable drivers (ENSO, AMO, North Atlantic Oscillation, Pacific Decadal Oscillation, Atlantic Nino, and Indian Ocean Dipole) for the months of May to September during the period 1861-2016. While the two approaches produced some contradictions, they also produced a common conclusion that ENSO and AMO are equally important and independent drivers of ISMR.
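The regression-based test mentioned above can be sketched in its simplest linear form: a driver is said to Granger-cause a target if adding the driver's past to a regression on the target's own past reduces the residual error significantly. The sketch below uses a single lag and plain least squares; the lag choice, function names and F-statistic form are assumptions of this illustration, not details from the abstract.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * t for row, t in zip(design, target)) for i in range(p)]
    beta = solve_linear(xtx, xty)
    return sum((t - sum(bi * v for bi, v in zip(beta, row))) ** 2
               for row, t in zip(design, target))


def granger_f(cause, effect):
    """F statistic for 'cause at lag 1 improves the prediction of effect'."""
    target = effect[1:]
    steps = range(1, len(effect))
    restricted = [[1.0, effect[t - 1]] for t in steps]           # own past only
    full = [[1.0, effect[t - 1], cause[t - 1]] for t in steps]   # plus driver's past
    rss_r, rss_u = rss(restricted, target), rss(full, target)
    dof = len(target) - 3  # samples minus parameters of the full model
    return (rss_r - rss_u) / (rss_u / dof)
```

A large value of `granger_f(driver, rainfall)` relative to an F(1, dof) distribution suggests predictive influence at that lag; unlike PCMCI+, this pairwise form does not condition on the other drivers.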

Additionally, we have studied the contribution of the drivers to annual extremes of ISMR (years of deficient and excess rainfall) using Shapley values, a concept from game theory, to quantify the contributions of different predictors in a model. In this work, we train an XGBoost model to predict the ISMR anomaly from any values of the predictor variables. The experiment is carried out using two approaches. One approach involves analyzing the contribution of each driver, for each of the ISMR months of any year, to the mean seasonal rainfall anomaly of that year. The other approach focuses on the contribution of the seasonal mean value of each driver to the same anomaly. In both approaches, we analyze the distribution of each driver’s Shapley values for excess and deficient monsoon years for contrast. We find that while ENSO is indeed the dominant driving factor for a majority of these years, AMO is another major factor which frequently contributes to such deficiencies, while Atlantic Nino and Indian Ocean Dipole too sometimes contribute. On the other hand, Indian Ocean Dipole seems to be a major contributor for several years of excess rainfall. As future work, we plan to carry out a robustness analysis of these results, and also examine the drivers of regional extremes.
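The game-theoretic idea behind Shapley values can be made concrete with a small exact computation. The sketch below enumerates all coalitions of predictors, which is only feasible for a handful of drivers; it is an illustration of the definition, not the procedure used with the XGBoost model, and the `value` function is a hypothetical stand-in for "model prediction when only these drivers are known".

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_players):
    """Exact Shapley values for a set function value(frozenset) -> float.

    Each player's value is its marginal contribution value(S + i) - value(S),
    averaged over all coalitions S with the standard Shapley weights.
    """
    players = range(n_players)
    phi = [0.0] * n_players
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_players - size - 1)
                      / factorial(n_players))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

The values always sum to the difference between the prediction with all drivers known and the prediction with none known, so they split one year's predicted anomaly among the drivers.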

How to cite: Chakraborty, D., Mitra, A., Goswami, B., and Rajesh, P.: Identification of Global Drivers of Indian Summer Monsoon using Causal Inference and Interpretable AI, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-4431, https://doi.org/10.5194/egusphere-egu22-4431, 2022.

10:55–11:00
|
EGU22-5464
|
ECS
|
On-site presentation
Lily-belle Sweet and Jakob Zscheischler

Extreme weather events, such as droughts, floods or heatwaves, severely impact agricultural yield. However, crop yield failure may also be caused by the temporal or multivariate compounding of more moderate weather events. An example of such an occurrence is the phenomenon of 'false spring', where the combined effects of a warm interval in late winter followed by a period of freezing temperatures can result in severe damage to vegetation. Alternatively, multiple weather events may impact crops simultaneously, as with compound hot and dry weather conditions.

Machine learning techniques are able to learn highly complex and nonlinear relationships between predictors. Such methods have previously been used to explore the influence of monthly- or seasonally-aggregated weather data as well as predefined extreme event indicators on crop yield. However, as crop yield may be impacted by climatic variables at different temporal scales, interpretable machine learning methods that can extract relevant meteorological features from higher-resolution time series data are desirable.

In this study we test the ability of adaptations of random forest models to identify compound meteorological drivers of crop failure from simulated data. In particular, adaptations of random forest models capable of ingesting daily multivariate time series data and spatial information are used. First, we train models to extract useful features from daily climatic data and predict crop yield failure probabilities. Second, we use permutation feature importances and sequential feature selection to investigate weather events and time periods identified by the models as most relevant for crop yield failure prediction. Finally, we explore the interactions learned by the models between these selected meteorological drivers, and compare the outcomes for several global crop models. Ultimately, our goal is to present a robust and highly interpretable machine learning method that can identify critical weather conditions from datasets with high temporal and spatial resolution, and is therefore able to identify drivers of crop failure using relatively few years of data.
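Permutation feature importance, used in the second step above, can be sketched in a few lines: shuffle one feature, re-score the model, and record how much the error grows. The sketch assumes a generic `model` callable and mean squared error as the score; the study's actual models and scoring metric are not specified in the abstract.

```python
import random


def permutation_importance(model, rows, targets, n_features, n_repeats=10, seed=0):
    """Mean increase in mean squared error when one feature column is shuffled.

    model: callable mapping one row (list of feature values) to a prediction.
    rows: list of rows; targets: list of true values, one per row.
    """
    rng = random.Random(seed)

    def mse(data):
        return sum((model(r) - t) ** 2 for r, t in zip(data, targets)) / len(targets)

    baseline = mse(rows)
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances
```

A feature the model ignores gets an importance of exactly zero, while strongly correlated features can share or mask each other's importance, which is one reason to pair this measure with sequential feature selection.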

How to cite: Sweet, L. and Zscheischler, J.: Using interpretable machine learning to identify compound meteorological drivers of crop yield failure, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5464, https://doi.org/10.5194/egusphere-egu22-5464, 2022.

11:00–11:05
|
EGU22-6958
|
ECS
|
Virtual presentation
Andreas Gerhardus and Jakob Runge

Understanding the cause-and-effect relationships that govern natural phenomena is central to scientific inquiry. While controlled experiments are the gold standard for inferring causal relationships, there are many scenarios in which they are not possible. This is, for example, the case for most aspects of Earth's complex climate system. Causal relationships then have to be learned from statistical dependencies in observational data, a task that is commonly referred to as (observational) causal discovery.

When applied to time series data for learning causal relationships in dynamical systems, methods for causal discovery face additional statistical challenges. This is because, as licensed by an assumption of stationarity, samples are taken in a sliding-window fashion and are hence autocorrelated rather than iid. Moreover, strong autocorrelations often occlude other relevant causal links. The recent PCMCI algorithm (Runge et al., 2019) and its variants PCMCI+ (Runge, 2020) and LPCMCI (Gerhardus and Runge, 2020) address and to some extent alleviate these issues.

In this contribution we present the Ensemble-PCMCI method, an adaptation of PCMCI (and its variants PCMCI+ and LPCMCI) to cases in which the data comprise several time series, i.e., measurements of several instances of the same underlying dynamical system. Samples can then be taken from these different time series instead of in a sliding-window fashion, thus avoiding the issue of autocorrelation and also allowing the stationarity assumption to be relaxed. In particular, this opens the possibility of analyzing temporal changes in the underlying causal mechanisms. A potential domain of application is ensemble forecasts.
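The difference between the two sampling schemes can be illustrated with a small sketch. It only shows how the sample pairs are formed and how a simple dependence measure is computed on them; it is not the Ensemble-PCMCI algorithm, and all function names are hypothetical.

```python
import math


def sliding_window_samples(series, lag):
    """Pairs (x[t - lag], x[t]) from ONE time series.

    Pooling over t assumes the dependence is the same at every time step.
    """
    return [(series[t - lag], series[t]) for t in range(lag, len(series))]


def ensemble_samples(members, t, lag):
    """Pairs (x[t - lag], x[t]) at ONE fixed time t, one pair per member."""
    return [(m[t - lag], m[t]) for m in members]


def correlation(pairs):
    """Pearson correlation of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)
```

Calling `correlation(ensemble_samples(members, t, lag))` for successive values of `t` gives a dependence estimate per time step, which is how changes in the underlying mechanism over time can become visible.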

Related references:
Jakob Runge et al. (2019). Detecting and quantifying causal associations in large nonlinear time series datasets. Science Advances 5 eaau4996.

Jakob Runge (2020). Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets. In Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI). Proceedings of Machine Learning Research 124 1388–1397. PMLR.

Andreas Gerhardus and Jakob Runge (2020). High-recall causal discovery for autocorrelated time series with latent confounders. In Advances in Neural Information Processing Systems 33 12615–12625. Curran Associates, Inc.

How to cite: Gerhardus, A. and Runge, J.: Causal Discovery in Ensembles of Climate Time Series, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-6958, https://doi.org/10.5194/egusphere-egu22-6958, 2022.

11:05–11:10
|
EGU22-8130
|
ECS
|
On-site presentation
Philine Lou Bommer, Marlene Kretschmer, Dilyara Bareeva, Kadircan Aksoy, and Marina Höhne

In climate change research we are dealing with a chaotic system, which usually requires huge computational effort in order to make faithful predictions. Deep neural networks (DNNs) offer promising new approaches due to their computational efficiency and universal solution properties. However, despite the increase in successful application cases with DNNs, the black-box nature of such purely data-driven approaches limits their trustworthiness and therefore the usability of deep learning in the context of climate science.

The field of explainable artificial intelligence (XAI) has been established to enable a deeper understanding of the complex, highly nonlinear methods and their predictions. By shedding light on the reasons behind the predictions made by DNNs, XAI methods can support researchers in revealing the underlying physical mechanisms and properties inherent in the studied data. Some XAI methods have already been successfully applied to climate science; however, no detailed comparison of their performance is available. As the number of XAI methods on the one hand, and DNN applications on the other hand, is growing, a comprehensive evaluation is necessary in order to understand the different XAI methods in the climate context.

In this work we provide an overview of different available XAI methods and their potential applications for climate science. Based on a previously published climate change prediction task, we compare several explanation approaches, including model-aware (e.g. Saliency, IntGrad, LRP) and model-agnostic methods (e.g. SHAP). We analyse their ability to verify the physical soundness of the DNN predictions as well as their ability to uncover new insights into the underlying climate phenomena. Another important aspect we address in our work is the possibility to assess the underlying uncertainties of DNN predictions using XAI methods. This is especially crucial in climate science applications, where uncertainty due to natural variability is usually large. To this end, we investigate the potential of two recently introduced XAI methods, UAI+ and NoiseGrad, which have been designed to include uncertainty information of the predictions into the explanations. We demonstrate that those XAI methods enable more stable explanations with respect to model noise and can further deal with uncertainties of network information. We argue that these methods are therefore particularly suitable for climate science application cases.

How to cite: Bommer, P. L., Kretschmer, M., Bareeva, D., Aksoy, K., and Höhne, M.: A comparison of explainable AI solutions to a climate change prediction task, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8130, https://doi.org/10.5194/egusphere-egu22-8130, 2022.

11:10–11:15
|
EGU22-8411
|
ECS
|
On-site presentation
Björn Mayer, Elizabeth Barnes, Jochem Marotzke, and Johanna Baehr

Despite the importance of the Atlantic Meridional Overturning Circulation (AMOC) to the climate on decadal and multidecadal timescales, Earth System Models (ESMs) exhibit large differences in their estimation of the amplitude and spectrum of its variability. In addition, observational data are sparse, and for the period before the onset of the current century many reconstructions of the AMOC rely on linear relationships to the more readily observed surface properties of the Atlantic rather than the less explored deeper ocean. Yet, it is conceptually well established that the density distribution is dynamically closely related to the AMOC, and in this contribution, we investigate this connection in model simulations to identify which density information is necessary to reconstruct the AMOC. We chose to establish these links in a data-driven approach.

We use simulations from a historically forced large ensemble as well as abruptly forced long-term simulations with varying strength of forcing, which therefore comprise vastly different states of the AMOC. In a first step, we train uncertainty-aware neural networks to infer the state of the AMOC from the density information at different layers in the North Atlantic. In a second step, we compare the performance of the trained neural networks across depth and with their linear counterparts in simulations that were not part of the training process. Finally, we investigate how the networks arrived at their specific prediction using Layer-wise Relevance Propagation (LRP), a recently developed technique that propagates relevance backwards through the network to the input density field, effectively separating important from unimportant information and identifying regions of high relevance for the reconstruction of the AMOC.
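The backward propagation of relevance can be illustrated for a single dense layer. The sketch below uses the LRP-epsilon rule, one of several published LRP rules; which rule the authors applied is not stated in the abstract, and the function and argument names are hypothetical.

```python
def lrp_epsilon_dense(activations, weights, biases, relevance_out, eps=1e-6):
    """Redistribute output relevance to the inputs of one dense layer.

    activations: layer inputs a_j (length n_in).
    weights: weights[j][k] connects input j to output k.
    biases: one bias per output (length n_out).
    relevance_out: relevance R_k already assigned to each output.
    Each input receives a share of R_k proportional to its contribution
    a_j * w_jk to the pre-activation z_k.
    """
    n_in, n_out = len(activations), len(relevance_out)
    relevance_in = [0.0] * n_in
    for k in range(n_out):
        z = biases[k] + sum(activations[j] * weights[j][k] for j in range(n_in))
        z += eps if z >= 0 else -eps  # stabiliser keeps the division well defined
        for j in range(n_in):
            relevance_in[j] += activations[j] * weights[j][k] * relevance_out[k] / z
    return relevance_in
```

Applying this layer by layer from the output back to the input yields a relevance map on the input field; with zero biases and a small `eps`, the total relevance is approximately conserved from one layer to the next.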

Our preliminary results show that in general, the information provided by only one density layer between the surface and 1100 m is sufficient to reconstruct the AMOC with high precision, and neural networks are capable of generalizing to unseen simulations. From the set of these neural networks trained on different layers, we choose the surface layer as well as one subsurface layer close to 1000 m for further investigation of their decision-making process using LRP. Our preliminary investigation reveals that the LRP in the subsurface layer identifies regions of potentially high physical relevance for the AMOC. By contrast, the regions identified in the surface layer show little physical relevance for the AMOC.

How to cite: Mayer, B., Barnes, E., Marotzke, J., and Baehr, J.: Reconstructing the Atlantic Meridional Overturning Circulation in Earth System Model simulations from density information using explainable machine learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8411, https://doi.org/10.5194/egusphere-egu22-8411, 2022.

11:15–11:20
|
EGU22-5756
|
On-site presentation
Katerina Hlavackova-Schindler (Schindlerova), Andreas Fuchs, Claudia Plant, Irene Schicker, and Rosmarie DeWit

Based on the ERA5 data of hourly meteorological parameters [1], we investigate temporal effects of 12 meteorological parameters on the extreme values occurring in wind speed. We approach the problem using Granger causal inference, namely the heterogeneous graphical Granger model (HGGM) [2]. In contrast to the classical Granger model proposed for causal inference among Gaussian processes, the HGGM detects causal relations among time series with distributions from the exponential family, which includes a wider class of common distributions. In previous synthetic experiments, HGGM combined with a genetic algorithm search based on the minimum message length principle has been shown to be superior in precision to the baseline causal methods [3]. We investigate various experimental settings of all 12 parameters with respect to the wind extremes in various time intervals. Moreover, we compare the influence of various data preprocessing methods and evaluate the interpretability of the discovered causal connections based on meteorological knowledge.

[1] https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview

[2] Behzadi, S, Hlaváčková-Schindler, K., Plant, C. (2019) Granger causality for heterogeneous processes, In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, pp. 463-475.

[3] Hlaváčková-Schindler, K., Plant, C. (2020) Heterogeneous graphical Granger causality by minimum message length, Entropy, 22(1400), pp. 1-21, ISSN 1099-4300, MDPI.

How to cite: Hlavackova-Schindler (Schindlerova), K., Fuchs, A., Plant, C., Schicker, I., and DeWit, R.: The influence of meteorological parameters on wind speed extreme events: A causal inference approach, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5756, https://doi.org/10.5194/egusphere-egu22-5756, 2022.

11:20–11:25
|
EGU22-9112
|
ECS
|
Presentation form not yet defined
Nicolas-Domenic Reiter, Jakob Runge, and Andreas Gerhardus

Understanding complex dynamical systems is a major challenge in many scientific disciplines. There are two aspects which are of particular interest when analyzing complex dynamical systems: 1) the temporal patterns along which they evolve and 2) the governing causal mechanisms.

Temporal patterns in a time series can be extracted and analyzed through a variety of time-series representations, that is, collections of filters. Discrete Wavelet and Fourier Transforms are prominent examples and have been widely applied to investigate the temporal structure of dynamical systems.

Causal Inference is a framework formalizing questions of cause and effect. In this work we propose an elementary and systematic approach to combining time-series representations with Causal Inference. To this end, we introduce a notion of cause and effect with respect to a pair of arbitrary time-series filters. Using a Singular Value Decomposition, we derive an alternative representation of how one process drives another over a specified time period. We call the building blocks of this representation Causal Orthogonal Functions. Combining the notion of Causal Orthogonal Functions with a Wavelet or Fourier decomposition of a time series yields time-scale specific Causal Orthogonal Functions. As a result, we obtain a time-scale specific representation of the causal influence one process has on another over some fixed time period. This allows causal effect analysis to be conducted in discrete-time stochastic dynamical systems at multiple time scales. We illustrate our approach by examining linear VAR processes.
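The abstract does not define Causal Orthogonal Functions precisely, so the following is only a loose illustration of the underlying idea for a linear VAR(1) process: collect the total linear effect of one variable at each time step on another variable at each later time step into a matrix, then apply a Singular Value Decomposition to that matrix. The coefficient matrix `A`, the window length and the helper `effect_matrix` are assumptions for the example.

```python
import numpy as np

def effect_matrix(A, src, dst, T):
    """M[i, j] = total linear effect of variable `src` at time j on
    variable `dst` at time i in the VAR(1) process v_t = A v_{t-1} + noise.
    Entries with i <= j are zero (no instantaneous or backward effects)."""
    powers = [np.linalg.matrix_power(A, k) for k in range(T)]
    M = np.zeros((T, T))
    for i in range(T):
        for j in range(i):
            M[i, j] = powers[i - j][dst, src]
    return M

# Hypothetical bivariate system: X drives Y, Y does not drive X.
#   X_t = 0.8 X_{t-1} + noise
#   Y_t = 0.5 X_{t-1} + 0.6 Y_{t-1} + noise
A = np.array([[0.8, 0.0],
              [0.5, 0.6]])

M = effect_matrix(A, src=0, dst=1, T=8)
U, s, Vt = np.linalg.svd(M)

# Rows of Vt are orthogonal patterns of X over the window, columns of U are
# the corresponding response patterns of Y, and s gives their strengths.
print(np.round(s, 3))
```

Applying the same decomposition after filtering both processes with a wavelet or Fourier basis would give a scale-specific variant, which is the direction the abstract describes.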

How to cite: Reiter, N.-D., Runge, J., and Gerhardus, A.: Causal Orthogonal Functions: A Causal Inference approach to temporal feature extraction, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-9112, https://doi.org/10.5194/egusphere-egu22-9112, 2022.

11:25–11:30
|
EGU22-591
|
ECS
|
On-site presentation
Zheng Wu, Tom Beucler, Raphaël de Fondeville, Eniko Székely, Guillaume Obozinski, William Ball, and Daniela Domeisen

The winter stratospheric polar vortex exhibits considerable variability in both magnitude and zonal wave structure, which arises in part from stratosphere-troposphere coupling associated with tropospheric precursors and can result in extreme polar vortex events. These extremes can subsequently influence weather in the troposphere and are thus important sources of surface predictability. However, the predictability limit of these extreme events is around 1-2 weeks in state-of-the-art prediction systems. In order to explore and improve the predictability limit of the extreme vortex events, in this study we train an artificial neural network (ANN) to model stratospheric polar vortex anomalies and to identify strong and weak stratospheric vortex events. To pinpoint the origins of the stratospheric anomalies, we then employ two neural network visualization methods, SHapley Additive exPlanations (SHAP) and Layerwise Relevance Propagation (LRP), to uncover feature importance in the input variables (e.g., geopotential height and background zonal wind). The extreme vortex events can be identified by the ANN with an average accuracy of 60-80%. For the correctly identified extreme events, the composite of the feature importance of the input variables shows spatial patterns consistent with the precursors found for extreme stratospheric events in previous studies. This consistency provides confidence that the ANN is able to identify reliable indicators for extreme stratospheric vortex events and that it could help to identify the role of the previously found precursors, such as the sea level pressure anomalies associated with the Siberian high. In addition to the composite of all the events, the feature importance for each of the individual events further reveals the physical structures in the input variables (such as the locations of the geopotential height anomalies) that are specific to that event.
Our results show the potential of explainable neural network techniques for understanding and predicting stratospheric variability and extreme events, and for searching for potential precursors of these events on subseasonal time scales.
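To make the LRP step more concrete, here is a minimal sketch of the epsilon rule for a small bias-free ReLU network. It is not the authors' network or code; the layer sizes, random weights and function names are assumptions. The point it illustrates is that relevance is redistributed backwards layer by layer so that the input relevances sum (approximately) to the network output.

```python
import numpy as np

def forward(x, weights):
    """Return the activations of every layer (ReLU on hidden layers, linear output)."""
    activations = [x]
    for i, W in enumerate(weights):
        z = activations[-1] @ W
        is_last = i == len(weights) - 1
        activations.append(z if is_last else np.maximum(z, 0.0))
    return activations

def lrp_epsilon(x, weights, eps=1e-9):
    """Propagate the output back to the inputs with the LRP epsilon rule."""
    activations = forward(x, weights)
    relevance = activations[-1]
    for W, a in zip(reversed(weights), reversed(activations[:-1])):
        z = a @ W
        z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabiliser, keeps the sign of z
        relevance = a * (W @ (relevance / z))
    return relevance, activations[-1]

# Hypothetical toy network: 5 input features, 8 hidden units, 1 output.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(5, 8)), rng.normal(size=(8, 1))]
x = rng.normal(size=5)

relevance, output = lrp_epsilon(x, weights)
print("input relevances:", np.round(relevance, 3))
print("sum of relevances:", relevance.sum(), "network output:", output[0])
```

For gridded inputs such as geopotential height, the relevance vector would be reshaped to the input grid and composited over events, which corresponds to the maps discussed in the abstract.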

How to cite: Wu, Z., Beucler, T., de Fondeville, R., Székely, E., Obozinski, G., Ball, W., and Domeisen, D.: Identifying precursors for extreme stratospheric polar vortex events using an explainable neural network, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-591, https://doi.org/10.5194/egusphere-egu22-591, 2022.

11:30–11:35
|
EGU22-11388
|
ECS
|
Highlight
|
On-site presentation
Gherardo Varando, Miguel-Ángel Fernández-Torres, and Gustau Camps-Valls

Tackling climate change requires understanding the complex phenomena occurring on the planet. Discovering teleconnection patterns is an essential part of the endeavor. Events like El Niño Southern Oscillation (ENSO) impact essential climate variables at large distances, and influence the underlying Earth system dynamics. However, their automatic identification from the wealth of observational data is still unresolved. Nonlinearities, nonstationarities and the (ab)use of correlation analyses hamper the discovery of true causal patterns. Classical approaches proceed by first extracting principal modes of variability and then performing lag-correlation or Granger causal analyses to identify possible teleconnections. While the principal modes are an effective representation of the data, they may not be causally meaningful.
To address this, we introduce a deep learning methodology that extracts nonlinear latent representations from spatio-temporal Earth data that are also Granger causal with the index. The proposed algorithm consists of a variational autoencoder trained with an additional causal penalization that enforces the latent representation to be (partially) Granger-causally related to the considered signal. The causal loss term is obtained by training two additional autoregressive models to forecast some of the latent signals, one of them including the target signal as a predictor. The causal penalization is finally computed by comparing the log variances of the two autoregressive models, similarly to the standard Granger causality approach.
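The comparison of log variances can be sketched as follows. This is a minimal linear version using least-squares autoregressive models on fixed series; in the method described above the autoregressive models are trained jointly with the autoencoder, and the lag order, the synthetic series and the function names here are assumptions.

```python
import numpy as np

def lagged(series, p):
    """Stack lags 1..p of a 1-D series as columns, aligned with series[p:]."""
    n = len(series)
    return np.column_stack([series[p - k:n - k] for k in range(1, p + 1)])

def residual_variance(X, y):
    """Variance of the least-squares residuals of y on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return float(np.var(y - X1 @ coef))

def granger_penalty(latent, index, p=2):
    """log var(full AR model) - log var(restricted AR model).

    Strongly negative values mean the index improves the forecast of the
    latent signal, so minimising this term rewards Granger-causal latents."""
    y = latent[p:]
    restricted = lagged(latent, p)
    full = np.column_stack([restricted, lagged(index, p)])
    return np.log(residual_variance(full, y)) - np.log(residual_variance(restricted, y))

# Hypothetical series: an ENSO-like index, one latent driven by it, one not.
rng = np.random.default_rng(0)
n = 1000
index = np.zeros(n)
driven = np.zeros(n)
unrelated = np.zeros(n)
for t in range(1, n):
    index[t] = 0.7 * index[t - 1] + rng.normal()
    driven[t] = 0.5 * driven[t - 1] + 0.8 * index[t - 1] + 0.5 * rng.normal()
    unrelated[t] = 0.5 * unrelated[t - 1] + rng.normal()

print("driven latent:   ", granger_penalty(driven, index))
print("unrelated latent:", granger_penalty(unrelated, index))
```

Because the restricted model is nested in the full one, the term is never positive for least-squares fits; it is close to zero when the index carries no additional predictive information.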

The major drawback of deep autoencoders with respect to the classical linear principal component approaches is the lack of a straightforward interpretability of the representations learned. 
To address this point we perform synthetic interventions in the latent space and analyse the differences in the recovered NDVI signal.
We illustrate the feasibility of the approach described to study the impact of ENSO on vegetation, which allows for a more rigorous study of impacts on ecosystems globally. The output maps show NDVI patterns which are consistent with the known phenomena induced by El Niño events.

How to cite: Varando, G., Fernández-Torres, M.-Á., and Camps-Valls, G.: Learning ENSO-related Principal Modes of Vegetation via a Granger-Causal Variational Autoencoder, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-11388, https://doi.org/10.5194/egusphere-egu22-11388, 2022.

11:35–11:50
Lunch break
Chairpersons: Rochelle Schneider, Duncan Watson-Parris
13:20–13:25
13:25–13:35
|
EGU22-696
|
ECS
|
solicited
|
Highlight
|
Virtual presentation
Christina Heinze-Deml, Sebastian Sippel, Angeline G. Pendergrass, Flavio Lehner, and Nicolai Meinshausen

A key challenge in climate science is to quantify the forced response in impact-relevant variables such as precipitation against the background of internal variability, both in models and observations. Dynamical adjustment techniques aim to remove unforced variability from a target variable by identifying patterns associated with circulation, thus effectively acting as a filter for dynamically induced variability. The forced contributions are interpreted as the variation that is unexplained by circulation. However, dynamical adjustment of precipitation at local scales remains challenging because of large natural variability and the complex, nonlinear relationship between precipitation and circulation particularly in heterogeneous terrain. 

In this talk, I will present the Latent Linear Adjustment Autoencoder (LLAAE), a novel statistical model that builds on variational autoencoders. The Latent Linear Adjustment Autoencoder enables estimation of the contribution of a coarse-scale atmospheric circulation proxy to daily precipitation at high resolution and in a spatially coherent manner. To predict circulation-induced precipitation, the LLAAE combines a linear component, which models the relationship between circulation and the latent space of an autoencoder, with the autoencoder's nonlinear decoder. The combination is achieved by imposing an additional penalty in the cost function that encourages linearity between the circulation field and the autoencoder's latent space, hence leveraging robustness advantages of linear models as well as the flexibility of deep neural networks. 
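The structure of such a cost function can be sketched as a reconstruction term plus a penalty on the distance between the latent code and a linear map of the circulation field. The sketch below uses a toy deterministic encoder and decoder and omits the variational (KL) term and any gradient-based training; all dimensions, weights and names are assumptions for illustration.

```python
import numpy as np

def llaae_loss(precip, circ, encode, decode, W, lam=1.0):
    """Reconstruction error plus a penalty encouraging latent = circ @ W."""
    z = encode(precip)
    recon = float(np.mean((decode(z) - precip) ** 2))
    linear = float(np.mean((z - circ @ W) ** 2))
    return recon + lam * linear, recon, linear

def predict_dynamic(circ, decode, W):
    """Circulation-induced precipitation: linear map to latent space, then decode."""
    return decode(circ @ W)

# Hypothetical toy data: 3 circulation predictors, 2 latent dimensions, 6 grid cells.
rng = np.random.default_rng(0)
n, d_circ, d_lat, d_precip = 200, 3, 2, 6
circ = rng.normal(size=(n, d_circ))
latent_true = circ @ rng.normal(size=(d_circ, d_lat)) + 0.1 * rng.normal(size=(n, d_lat))
D = rng.normal(size=(d_lat, d_precip))
precip = np.maximum(latent_true @ D, 0.0)          # non-negative "precipitation"

E = np.linalg.pinv(D)                              # toy encoder weights

def encode(x):
    return x @ E

def decode(z):
    return np.maximum(z @ D, 0.0)                  # nonlinear (ReLU) toy decoder

# Linear component fitted by least squares between circulation and latent codes.
W_fit, *_ = np.linalg.lstsq(circ, encode(precip), rcond=None)

total, recon, linear = llaae_loss(precip, circ, encode, decode, W_fit, lam=2.0)
dynamic = predict_dynamic(circ, decode, W_fit)
print(f"reconstruction={recon:.3f}, linearity penalty={linear:.3f}, total={total:.3f}")
```

Subtracting `dynamic` from the observed field would give the residual in which, following the abstract, forced thermodynamic components are expected to remain.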

We show that our model predicts realistic daily winter precipitation fields at high resolution based on a 50-member ensemble of the Canadian Regional Climate Model at 12 km resolution over Europe, capturing, for instance, key orographic features and geographical gradients. Using the Latent Linear Adjustment Autoencoder to remove the dynamic component of precipitation variability, forced thermodynamic components are expected to remain in the residual, which enables the uncovering of forced precipitation patterns of change from just a few ensemble members. We extend this to quantify the forced pattern of change conditional on specific circulation regimes. 

Future applications could include, for instance, weather generators emulating climate model simulations of regional precipitation, detection and attribution at subcontinental scales, or statistical downscaling and transfer learning between models and observations to exploit the typically much larger sample size in models compared to observations.

How to cite: Heinze-Deml, C., Sippel, S., Pendergrass, A. G., Lehner, F., and Meinshausen, N.: Latent Linear Adjustment Autoencoder: a novel method for estimating dynamic precipitation at high resolution, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-696, https://doi.org/10.5194/egusphere-egu22-696, 2022.

13:35–13:40
|
EGU22-10120
|
Virtual presentation
Alice Crespi, Daniel Frisinghelli, Tatiana Klisho, Marcello Petitta, Alexander Jacob, and Massimiliano Pittore

Statistical downscaling is a very popular technique to increase the spatial resolution of existing global and regional climate model simulations and to provide reliable climate data at local scale. The availability of tailored information is particularly crucial for conducting local climate assessments, climate change studies and for running impact models, especially in complex terrain. A crucial requirement is the ability to reliably downscale the mean, variability and extremes of climate data, while preserving their spatial and temporal correlations.

Several machine learning-based approaches have been proposed so far to perform this task by extracting non-linear relationships between local-scale variables and large-scale atmospheric predictors, and they could outperform more traditional statistical methods. In recent years, deep learning has gained particular interest in geoscientific studies and climate science as a promising tool to improve climate downscaling thanks to its greater ability to extract high-level features from large datasets using complex hierarchical architectures. However, the proper network architecture is highly dependent on the target variable, time and spatial resolution, as well as application purposes and target domain.

This contribution presents a Deep Convolutional Encoder-Decoder Network (DCEDN) architecture which was implemented and evaluated for the first time over Trentino-South Tyrol in the Eastern Italian Alps to derive 1-km climate fields of daily temperature and precipitation from the ERA5 reanalysis. We will show that in-depth optimization of hyper-parameters, loss function choice and sensitivity analyses are essential preliminary steps to derive an effective architecture and enhance the interpretability of results and of their variability. The validation of downscaled fields of both temperature and precipitation confirmed the improved representation of local features for both mean and extreme values, even though lower performance was obtained for precipitation in reproducing small-scale spatial features. In all cases, DCEDN was found to outperform classical schemes based on linear regression and the bias adjustment procedures used as benchmarks. We will discuss in detail the advantages and recommendations for the integration of DCEDN as an efficient post-processing block in climate data simulations supporting local-scale studies. The model constraints in feature extraction, especially for precipitation, over the limited extent of the study domain will also be explained along with potential future developments of this type of network for improved climate science applications.

How to cite: Crespi, A., Frisinghelli, D., Klisho, T., Petitta, M., Jacob, A., and Pittore, M.: A Convolutional Neural Network approach for downscaling climate model data in Trentino-South Tyrol (Eastern Italian Alps), EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10120, https://doi.org/10.5194/egusphere-egu22-10120, 2022.

13:40–13:45
|
EGU22-12165
|
ECS
|
On-site presentation
Moritz Schwarz and Felix Pretis

Existing databases for extreme weather events such as floods, heavy rainfall events, or droughts are heavily reliant on authorities and weather services manually entering details about the occurrence of an event. This reliance has led to a massive geographical imbalance in the likelihood of extreme weather events being recorded, with a vast number of events especially in the developing world remaining unrecorded. With continuing climate change, a lack of systematic extreme weather accounting in developing countries can lead to a substantial misallocation of funds for adaptation measures. To address this imbalance, in this pilot study we combine socio-economic data with climate and geographic data and use several machine-learning algorithms as well as traditional (spatial) econometric tools to predict the occurrence of extreme weather events and their impacts in the absence of information from manual records. Our preliminary results indicate that machine-learning approaches for the detection of the impacts of extreme weather could be a crucial tool in establishing a coherent global disaster record system. Such systems could also play a role in discussions around future Loss and Damages.

How to cite: Schwarz, M. and Pretis, F.: Filling in the Gaps: Consistently detecting previously unidentified extreme weather event impacts, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-12165, https://doi.org/10.5194/egusphere-egu22-12165, 2022.

13:45–13:50
|
EGU22-13345
|
Virtual presentation
Xinxin Sui, Zhi Li, Guoqiang Tang, Zong-Liang Yang, and Dev Niyogi
Multiple environmental factors influence the error structure of precipitation datasets. Conventional precipitation evaluation methods oversimplify the problem by analyzing how statistical indicators vary with only one or two factors after dimensionality reduction. As a result, the compound influences of multiple factors are superposed rather than disentangled. To overcome this deficiency, this study presents a novel approach to systematically and objectively analyze the error structure within precipitation products using decision trees. This data-driven method can analyze multiple factors simultaneously and extract the compound effects of various influencers. By interpreting the decision tree structures, the error characteristics of precipitation products are investigated. Three types of precipitation products (two satellite-based: ‘top-down’ IMERG and ‘bottom-up’ SM2RAIN-ASCAT, and one reanalysis: ERA5-Land) are evaluated across CONUS. The study period is from 2010 to 2019, and the ground-based Stage IV precipitation dataset is used as the ground truth. By mining 60 binary decision trees, the spatiotemporal pattern of errors and the land surface influences are analyzed.

Results indicate that IMERG and ERA5-Land perform better than SM2RAIN-ASCAT, with higher accuracy and more stable interannual patterns for the ten years of data analyzed. The conventional bias evaluation finds that ERA5-Land and SM2RAIN-ASCAT underestimate in summer and winter, respectively. The decision tree method cross-assesses three spatiotemporal factors and finds that underestimation of ERA5-Land occurs in the eastern part of the Rocky Mountains, and SM2RAIN-ASCAT underestimates precipitation over high latitudes, especially in winter. Additionally, the decision tree method ascribes systematic errors to nine physical variables, of which the distance to the coast, soil type, and DEM are the three dominant features. On the other hand, the land cover classification and the topography position index are two relatively weak factors.

How to cite: Sui, X., Li, Z., Tang, G., Yang, Z.-L., and Niyogi, D.: A novel approach to systematically analyze the error structure of precipitation datasets using decision trees, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-13345, https://doi.org/10.5194/egusphere-egu22-13345, 2022.

13:50–13:55
|
EGU22-4584
|
ECS
|
Virtual presentation
Sumanta Chandra Mishra Sharma and Adway Mitra

Downscaling is widely used to improve the spatial resolution of meteorological variables. Broadly, two classes of techniques are used for downscaling: dynamical downscaling and statistical downscaling. Dynamical downscaling depends on the boundary conditions of coarse-resolution global models such as General Circulation Models (GCMs), whereas statistical downscaling tries to capture the statistical relationship between the high-resolution and low-resolution data (Kumar et al., 2021). With the rapid development of deep learning techniques in recent years, deep learning based super-resolution (SR) models have been designed in image processing and computer vision for increasing the resolution of a given image. Researchers from other fields have also adapted these techniques and achieved state-of-the-art performance in various domains. To the best of our knowledge, only a few works have used super-resolution methods in the climate domain for deep downscaling of precipitation data.

These super-resolution approaches mostly use convolutional neural networks (CNNs) to accomplish their task. In a CNN, increasing the depth of the model raises the chance of information loss and error propagation (Vandal et al., 2017). To reduce this information loss, we introduce residual-based deep downscaling models. These models have multiple residual blocks and skip connections between similar types of convolutional layers. The long skip connections in the model help to reduce information loss in the network. These models take as input data that is pre-upsampled by linear interpolation, and then improve the estimates of the pixel values.
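The two ingredients named above, pre-upsampling by linear interpolation and a residual block with a skip connection, can be sketched in plain NumPy. This is an untrained toy with random kernels, not the proposed model; the kernel size, the single block and the function names are assumptions.

```python
import numpy as np

def upsample_linear(field, factor):
    """Separable linear interpolation of a 2-D field onto a grid `factor` times finer."""
    ny, nx = field.shape
    y_new = np.linspace(0, ny - 1, ny * factor)
    x_new = np.linspace(0, nx - 1, nx * factor)
    rows = np.array([np.interp(x_new, np.arange(nx), r) for r in field])
    return np.array([np.interp(y_new, np.arange(ny), c) for c in rows.T]).T

def conv3x3(x, kernel):
    """'Same'-size 3x3 convolution (cross-correlation) with edge padding."""
    padded = np.pad(x, 1, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def residual_block(x, k1, k2):
    """Skip connection: the block only has to learn a correction to its input."""
    return x + conv3x3(np.maximum(conv3x3(x, k1), 0.0), k2)

# Hypothetical coarse rainfall field, upsampled 4x and passed through one block.
rng = np.random.default_rng(0)
coarse = rng.gamma(shape=2.0, scale=1.5, size=(6, 8))
pre_upsampled = upsample_linear(coarse, factor=4)
k1 = 0.1 * rng.normal(size=(3, 3))
k2 = 0.1 * rng.normal(size=(3, 3))
refined = residual_block(pre_upsampled, k1, k2)
print(pre_upsampled.shape, refined.shape)
```

With the second kernel set to zero the block reduces to the identity, which is why stacking such blocks does not by itself degrade the interpolated input.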

In our experiments, we focus on downscaling of rainfall over the Indian landmass (for Indian summer monsoon rainfall) and over a region in the USA spanning the southeast CONUS and parts of its neighboring states, between longitudes 70° W and 100° W and latitudes 24° N and 40° N. The precipitation data for this task were collected from the India Meteorological Department (IMD), Pune, India, and the NOAA Physical Sciences Laboratory. We have examined our model's predictive behavior and compared it with existing super-resolution models such as SRCNN and DeepSD, which have previously been used for precipitation downscaling. In the DeepSD model, we have used the GTOPO30 land elevation data provided by USGS along with the precipitation data as input. All these models are trained and tested in both geographical regions separately, and the proposed model is found to perform better than the existing models on multiple accuracy measures such as PSNR and correlation coefficient for the specific region and scaling factor.
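For reference, the two accuracy measures named above can be computed as follows. The choice of the data range as the peak value in PSNR is an assumption (conventions differ), and the arrays are made-up examples.

```python
import numpy as np

def psnr(truth, pred, data_range=None):
    """Peak signal-to-noise ratio in dB: 10 * log10(range^2 / MSE)."""
    mse = np.mean((truth - pred) ** 2)
    if data_range is None:
        data_range = truth.max() - truth.min()   # assumed convention for the peak value
    if mse == 0:
        return np.inf
    return 10.0 * np.log10(data_range ** 2 / mse)

def pattern_correlation(truth, pred):
    """Pearson correlation between the flattened fields."""
    return np.corrcoef(truth.ravel(), pred.ravel())[0, 1]

truth = np.arange(16, dtype=float).reshape(4, 4)
pred = truth + 1.0                                # constant bias of 1

print(psnr(truth, pred), pattern_correlation(truth, pred))
```

A constant bias leaves the correlation at 1 while lowering the PSNR, which is one reason to report both.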

How to cite: Mishra Sharma, S. C. and Mitra, A.: Super-Resolution based Deep Downscaling of Precipitation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-4584, https://doi.org/10.5194/egusphere-egu22-4584, 2022.

13:55–14:00
|
EGU22-8068
|
Presentation form not yet defined
Cody Nash, Balasubramanya Nadiga, and Xiaoming Sun

In this study we evaluate the use of generative adversarial networks (GANs) to model satellite-based estimates of precipitation conditioned on reanalysis temperature, humidity, wind, and surface latent heat flux. We are interested in the climatology of precipitation and modeling it in terms of atmospheric state variables, in contrast to a weather forecast or precipitation nowcast perspective. We consider a hierarchy of models in terms of complexity, including simple baselines, generalized linear models, gradient boosted decision trees, pointwise GANs and deep convolutional GANs. To gain further insight into the models we apply methods for analyzing machine learning models, including model explainability, ablation studies, and a diverse set of metrics for pointwise and distributional differences, including information theory based metrics. We find that generative models significantly outperform baseline models on metrics based on the distribution of predictions, particularly in capturing the extremes of the distributions. Overall, a deep convolutional model achieves the highest accuracy. We also find that the relative importance of atmospheric variables and of their interactions varies considerably among the different models considered.

How to cite: Nash, C., Nadiga, B., and Sun, X.: Generative Adversarial Modeling of Tropical Precipitation and the Intertropical Convergence Zone, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8068, https://doi.org/10.5194/egusphere-egu22-8068, 2022.

14:00–14:05
|
EGU22-8454
|
Presentation form not yet defined
Emily Vosper, Dann Mitchell, Peter Watson, Laurence Aitchison, and Raul Santos-Rodriguez

Fluvial flood hazards from tropical cyclones (TCs) are frequently the leading cause of mortality and damages (Rezapour and Baldock, 2014). Accurately modeling TC precipitation is vital for studying the current and future impacts of TCs. However, general circulation models at typical resolution struggle to accurately reproduce TC rainfall, especially for the most extreme storms (Murakami et al., 2015). Increasing horizontal resolution can improve precipitation estimates (Roberts et al., 2020; Zhang et al., 2021), but as these methods are computationally expensive there is a trade-off between accuracy and generating enough ensemble members to sample sufficient high-impact, low-probability events. Often, downscaling models are used as a computationally cheaper alternative.

Here, we downscale TC precipitation data from 100 km to 10 km resolution using a generative adversarial network (GAN). Generative approaches have the potential to reproduce the fine spatial detail and stochastic nature of precipitation well (Ravuri et al., 2021). Using observational products for tracking (IBTrACS) and rainfall (MSWEP), we train our GAN over the historical period 1979–2020. We are interested in how well our model reproduces precipitation intensity and structure, with a focus on the most extreme events, where models have traditionally struggled.

Bibliography 

Murakami, H., et al., 2015. Simulation and Prediction of Category 4 and 5 Hurricanes in the High-Resolution GFDL HiFLOR Coupled Climate Model. Journal of Climate, 28(23), pp.9058-9079.

Ravuri, S., et al., 2021. Skilful precipitation nowcasting using deep generative models of radar. Nature, 597(7878), pp.672-677. 

Rezapour, M. and Baldock, T., 2014. Classification of Hurricane Hazards: The Importance of Rainfall. Weather and Forecasting, 29(6), pp.1319-1331. 

Roberts, M., et al., 2020. Impact of Model Resolution on Tropical Cyclone Simulation Using the HighResMIP–PRIMAVERA Multimodel Ensemble. Journal of Climate, 33(7), pp.2557-2583. 

Zhang, W., et al., 2021. Tropical cyclone precipitation in the HighResMIP atmosphere-only experiments of the PRIMAVERA Project. Climate Dynamics, 57(1-2), pp.253-273. 

How to cite: Vosper, E., Mitchell, D., Watson, P., Aitchison, L., and Santos-Rodriguez, R.: Using Generative Adversarial Networks (GANs) to downscale tropical cyclone precipitation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8454, https://doi.org/10.5194/egusphere-egu22-8454, 2022.

14:05–14:10
|
EGU22-8649
|
ECS
|
Virtual presentation
Bernardo Teufel, Fernanda Carmo, Laxmi Sushama, Lijun Sun, Naveed Khaliq, Stephane Belair, Asaad Yahia Shamseldin, Dasika Nagesh Kumar, and Jai Vaze

The high computational cost of super-resolution (< 250 m) climate simulations is a major barrier to generating climate change information at the high spatial and temporal resolutions required by many sectors for planning local and asset-specific climate change adaptation strategies. This study couples machine learning and physical modelling paradigms to develop a computationally efficient simulator-emulator framework for generating super-resolution climate information. To this end, a regional climate model (RCM) is applied over the city of Montreal, for the summers of 2015 to 2020, at 2.5 km (i.e., low resolution – LR) and 250 m (i.e., high resolution – HR), and the resulting simulations are used to train and validate the proposed super-resolution deep learning (DL) model. In the field of video super-resolution, convolutional neural networks combined with motion compensation have been used to merge information from multiple LR frames to generate high-quality HR images. In this study, a recurrent DL approach based on passing the generated HR estimate through time helps the DL model to recreate fine details and produce temporally consistent fields, resembling the data assimilation process commonly used in numerical weather prediction. Time-invariant HR surface fields and storm motion (approximated by RCM-simulated wind) are also considered in the DL model, which helps further improve output realism. Results suggest that the DL model is able to generate HR precipitation estimates with significantly lower errors than other methods used, especially for intense short-duration precipitation events, which often occur during the warm season and are required to evaluate climate resiliency of urban storm drainage systems. The generic and flexible nature of the developed framework makes it even more promising as it can be applied to other climate variables, periods and regions.

How to cite: Teufel, B., Carmo, F., Sushama, L., Sun, L., Khaliq, N., Belair, S., Shamseldin, A. Y., Nagesh Kumar, D., and Vaze, J.: Physically Based Deep Learning Framework to Model Intense Precipitation Events at Engineering Scales, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8649, https://doi.org/10.5194/egusphere-egu22-8649, 2022.

14:10–14:15
|
EGU22-8656
|
ECS
|
Highlight
|
On-site presentation
Jakob Kruse, Beatrice Ellerhoff, Ullrich Köthe, and Kira Rehfeld

The socio-economic impacts of rare extreme events, such as droughts, are one of the main ways in which climate affects humanity. A key challenge is to quantify the changing risk of once-in-a-decade or even once-in-a-century events under global warming, while leaning heavily on comparatively short observation spans. The predictive power of classical statistical methods from extreme value theory (EVT) often remains limited to uncorrelated events with short return periods. This is mainly due to their strong assumption of an underlying exponential family distribution of the variable in question. Standard EVT is therefore at odds with the rich and large-scale correlations found in various surface climate parameters such as local temperatures, as well as the more complex shape of empirical distributions. Here, we turn to recent developments in machine learning, namely to conditional normalizing flows, which are flexible neural networks for modeling highly-correlated unknown distributions. Given a short time series, we show how such networks can model the posterior probability of events whose return periods are much longer than the observation span. The necessary correlations and patterns can be extracted from a paired set of inputs, i.e. time series, and outputs, i.e. return periods. To evaluate this approach in a controlled setting, we generate synthetic training data by sampling temporally autoregressive processes with a non-trivial covariance structure. We compare the results to a baseline analysis using EVT. In this work, we focus on the prediction of return periods of rare statistical events. However, we expect the same potential for a wide range of statistical measures, such as the power spectrum and rate functions. Future work should also investigate its applicability to compound and spatially extended events, as well as changing conditions under warming scenarios.
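For orientation, a classical EVT baseline of the kind mentioned above can be sketched with block maxima and a Gumbel fit by the method of moments. The abstract does not specify which EVT estimator is used, so the Gumbel choice, the AR(1) test series, the block length and the function names are assumptions.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def block_maxima(series, block):
    """Maximum of each consecutive, non-overlapping block of length `block`."""
    return [max(series[i:i + block]) for i in range(0, len(series) - block + 1, block)]

def gumbel_fit_moments(maxima):
    """Method-of-moments estimates of the Gumbel location and scale."""
    beta = statistics.stdev(maxima) * math.sqrt(6.0) / math.pi
    mu = statistics.mean(maxima) - EULER_GAMMA * beta
    return mu, beta

def return_level(mu, beta, period):
    """Level exceeded on average once every `period` blocks."""
    return mu - beta * math.log(-math.log(1.0 - 1.0 / period))

def return_period(mu, beta, level):
    """Inverse of return_level: expected number of blocks between exceedances."""
    cdf = math.exp(-math.exp(-(level - mu) / beta))
    return 1.0 / (1.0 - cdf)

# Hypothetical 50-"year" daily series from a temporally autoregressive process.
random.seed(1)
series, x = [], 0.0
for _ in range(50 * 365):
    x = 0.7 * x + random.gauss(0.0, 1.0)
    series.append(x)

maxima = block_maxima(series, 365)
mu, beta = gumbel_fit_moments(maxima)
level_100 = return_level(mu, beta, 100)
print(f"estimated 100-block return level: {level_100:.2f}")
```

Here the 100-block return level is extrapolated from only 50 block maxima, which is the regime (return periods longer than the observation span) in which the abstract argues that such baselines become unreliable.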

How to cite: Kruse, J., Ellerhoff, B., Köthe, U., and Rehfeld, K.: Conditional normalizing flow for predicting the occurrence of rare extreme events on long time scales, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8656, https://doi.org/10.5194/egusphere-egu22-8656, 2022.

14:15–14:20
|
EGU22-8831
|
ECS
|
On-site presentation
Transfer learning for estimating dynamic precipitation across different climate models
(withdrawn)
Joel Kuettel, Sebastian Sippel, Christina Heinze-Deml, Reto Knutti, and Nicolai Meinshausen
14:20–14:25
|
EGU22-3105
|
ECS
|
On-site presentation
Sibille Wehrmann and Thomas Mölg

The interdisciplinary research project "BayTreeNet" investigates the reactions of forest ecosystems to current climate dynamics. In the mid-latitudes, local climatic phenomena often show a strong dependence on the large-scale climate dynamics, the weather types (WT), which significantly determine the climate of a region through frequency and intensity. In the topographically diverse region of Bavaria, different WT show various weather conditions at different locations.

The meaning of every WT is explained for the different forest regions in Bavaria and the results of the climate dynamics sub-project provide the physical basis for the "BayTreeNet" project. Subsequently, climate-growth relationships are established in the dendroecology sub-project to investigate the response of forests to individual WT at different forest sites. Complementary steps allow interpretation of results for the past (20th century) and projection into the future (21st century). One hypothesis to be investigated is that forest sites in Bavaria are affected by a significant influence of climate change in the 21st century and the associated change in WT.

The automated classification of large-scale weather patterns is performed with Self-Organizing-Maps (SOM), a method developed by Kohonen that enables visualization and reduction of high-dimensional data. The poster presents the evaluation and selection of an appropriate SOM setting and its first results. In addition, first analyses of the environmental conditions of the different WT, and of how these are represented in global climate models (GCMs) in the past and future, are planned.
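For readers unfamiliar with SOMs, a minimal one-dimensional map can be sketched as follows. This is a generic textbook-style illustration, not the project's configuration: the function names, the linear decay schedules, and all default parameters are assumptions.

```python
import math
import random


def best_matching_unit(nodes, sample):
    """Index of the node whose weight vector is closest to the sample."""
    return min(range(len(nodes)), key=lambda j: math.dist(nodes[j], sample))


def train_som(samples, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D self-organizing map and return the node weight vectors.

    Learning rate and neighbourhood width decay linearly (an assumption).
    """
    rng = random.Random(seed)
    nodes = [list(rng.choice(samples)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 1e-3)
        for sample in rng.sample(samples, len(samples)):
            bmu = best_matching_unit(nodes, sample)
            for j, node in enumerate(nodes):
                # Gaussian neighbourhood on the map index, centred on the BMU.
                h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
                for k in range(len(node)):
                    node[k] += lr * h * (sample[k] - node[k])
    return nodes


# Toy "weather patterns": two groups of 2-D feature vectors.
patterns = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
som = train_som(patterns, n_nodes=2)
labels = [best_matching_unit(som, p) for p in patterns]
```

In a weather-typing application each sample would be a flattened large-scale field and each node a prototype pattern; the choice of map size is the kind of SOM setting that needs evaluating.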

How to cite: Wehrmann, S. and Mölg, T.: Classifying weather types in Europe by Self-Organizing-Maps (SOM) with regard to GCM-based future projections, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-3105, https://doi.org/10.5194/egusphere-egu22-3105, 2022.

14:25–14:30
|
EGU22-9877
|
ECS
|
Virtual presentation
Marcello Iotti, Paolo Davini, Jost von Hardenberg, and Giuseppe Zappa

Predicting extreme precipitation events is one of the main challenges of climate science in this decade. Despite continuously increasing computing availability, the spatial resolution of Global Climate Models (GCMs) is still too coarse to correctly represent and predict small-scale phenomena such as convection, so that precipitation prediction is still imprecise. Indeed, precipitation shows variability on both spatial and temporal scales (much) smaller than the resolution of current state-of-the-art GCMs. Therefore, downscaling techniques play a crucial role, both for the understanding of the phenomenon itself and for applications such as hydrologic studies, risk prediction and emergency management. Seen in the context of image processing, a downscaling procedure has many similarities with super-resolution tasks, i.e. the improvement of the resolution of an image. This field has benefited from the application of Machine Learning techniques, and in particular from the introduction of Convolutional Neural Networks (CNNs).

In our work we exploit a conditional Generative Adversarial Network (cGAN) to train a generator model to perform precipitation downscaling. This generator, a deep CNN, takes as input the precipitation field at the scale resolved by GCMs, adds random noise, and outputs a possible realization of the precipitation field at higher resolution, preserving its statistical properties with respect to the coarse-scale field. The GAN is being trained and tested in a “perfect model” setup, in which we try to reproduce the ERA5 precipitation field starting from an upscaled version of it.

Compared to other downscaling techniques, our model has the advantage of being computationally inexpensive at run time, since the computational load is mostly concentrated in the training phase. We are examining the Greater Alpine Region, where the performance of numerical models is limited by the complex orography. Nevertheless, the approach, being independent of physical, statistical and empirical assumptions, can be easily extended to different domains.
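The "perfect model" setup relies on pairs of coarse and fine fields derived from the same data. The abstract does not say how the coarse version is produced; the sketch below assumes simple block averaging, and the function name `coarsen` is hypothetical.

```python
def coarsen(field, factor):
    """Block-average a 2-D field (list of rows) by an integer factor.

    Block averaging is an assumed coarse-graining operator, used here only
    to illustrate how (coarse input, fine target) training pairs can be built.
    """
    n_rows, n_cols = len(field), len(field[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("field dimensions must be divisible by factor")
    coarse = []
    for i in range(0, n_rows, factor):
        row = []
        for j in range(0, n_cols, factor):
            block = [
                field[i + di][j + dj]
                for di in range(factor)
                for dj in range(factor)
            ]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


fine = [
    [0.0, 2.0, 1.0, 1.0],
    [4.0, 2.0, 1.0, 1.0],
    [0.0, 0.0, 8.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
training_pair = (coarsen(fine, 2), fine)  # (generator input, target)
```

Block averaging conserves the domain-mean precipitation, so any realization produced by the generator can be checked against the coarse field with the same operator.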

How to cite: Iotti, M., Davini, P., von Hardenberg, J., and Zappa, G.: A Conditional Generative Adversarial Network for Rainfall Downscaling, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-9877, https://doi.org/10.5194/egusphere-egu22-9877, 2022.

14:30–14:35
|
EGU22-10773
|
Highlight
|
On-site presentation
Campbell Watson, Jorge Guevara, Daniela Szwarcman, Dario Oliveira, Leonardo Tizzei, Maria Garcia, Priscilla Avegliano, and Bianca Zadrozny

Climate change is making extreme weather more extreme. Given the inherent uncertainty of long-term climate projections, there is growing need for rapid, plausible “what-if” climate scenarios to help users understand climate exposure and examine resilience and mitigation strategies. Since the 1980s, such “what-if” scenarios have been created using stochastic weather generators. However, it is very challenging for traditional weather generation algorithms to create realistic extreme climate scenarios because the weather data being modeled is highly imbalanced, contains spatiotemporal dependencies and has extreme weather events exacerbated by a changing climate.

There are few works comparing and evaluating stochastic multisite (i.e., gridded) weather generators, and no existing work that compares promising deep learning approaches for weather generation with classical stochastic weather generators. We will present the culmination of a multi-year effort to perform a systematic evaluation of stochastic weather generators and deep generative models for multisite precipitation synthesis. Among other things, we show that variational auto-encoders (VAE) offer an encouraging pathway for efficient and controllable climate scenario synthesis – especially for extreme events. Our proposed VAE schema selects events with different characteristics in the normalized latent space (from rare to common) and generates high-quality scenarios using the trained decoder. Improvements are provided via latent space clustering and bringing histogram-awareness to the VAE loss.

This research will serve as a guide for improving the design of deep learning architectures and algorithms for application in Earth science, including feature representation and uncertainty quantification of Earth system data and the characterization of so-called “grey swan” events.

How to cite: Watson, C., Guevara, J., Szwarcman, D., Oliveira, D., Tizzei, L., Garcia, M., Avegliano, P., and Zadrozny, B.: Choose your own weather adventure: deep weather generation for “what-if” climate scenarios, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10773, https://doi.org/10.5194/egusphere-egu22-10773, 2022.

14:35–14:50
Coffee break
Chairpersons: Gustau Camps-Valls, Duncan Watson-Parris
15:10–15:15
15:15–15:20
|
EGU22-8848
|
ECS
|
Highlight
|
On-site presentation
Alyson Douglas and Philip Stier

Clouds remain a core uncertainty in quantifying Earth’s climate sensitivity due to their complex dynamical and microphysical interactions with multiple components of the Earth system. Therefore it is pivotal to observationally constrain possible cloud changes in a changing climate in order to evaluate our current generation of Earth system models by a set of physically realistic sensitivities. We developed a novel observational regime framework from over 15 years of MODIS satellite observations, from which we have derived a set of regimes of cloud controlling factors. These regimes were established using the relationship strength, as measured by the weights of a trained, simple machine learning model. We apply these as observational constraints on the r1i1p1f1 and r1i1p1f3 historical runs from various CMIP6 models to test if CMIP6 climate models can accurately represent key cloud controlling factors. Within our regime framework, we can compare the observed environmental drivers and sensitivities of each regime against the parameterization-driven, modeled outcomes. We find that, for almost every regime, CMIP6 models do not properly represent the global distribution of occurrence, calling into question how much we can trust our range of climate sensitivities when specific cloud controlling factors are so badly represented by these models. This is especially pertinent in Southern Ocean and marine stratocumulus regimes, as the changes in these clouds’ optical depths and cloud amount have increased the ECS from CMIP5 to CMIP6. Our results suggest that these uncertainties in CMIP6 cloud parameterizations propagate into derived cloud feedbacks and ultimately climate sensitivity, which is evident from a regime-based analysis of cloud controlling factors.

How to cite: Douglas, A. and Stier, P.: Defining regime specific cloud sensitivities using the learnings from machine learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8848, https://doi.org/10.5194/egusphere-egu22-8848, 2022.

15:20–15:25
|
EGU22-9250
|
ECS
|
Presentation form not yet defined
Hussein El Khansa, Carmen Gervet, and Audrey Brouillet

Outlier detection generally aims at identifying extreme events and insightful changes in climate behavior. One important type of outlier is the pattern outlier, also called a discord, where the detected outlier pattern covers a time interval instead of a single point in the time series. Machine learning contributes many algorithms and methods to this field, especially unsupervised algorithms for different types of time series data. In a first submitted paper, we investigated discord detection applied to climate-related impact observations. We introduced the notion of prominent discords, a contextual concept that derives a set of insightful discords by identifying dependencies among variable-length discords, ordered by the number of discords they subsume.

Following this study, here we propose a ranking function based on the length of the first subsumed discord and the total length of the prominent discord, and make use of the powerful matrix profile technique. Preliminary results show that our approach, applied to monthly runoff timeseries between 1902 and 2005 over West Africa, detects both the emergence of long term change with the associated former climate regime, and the regional driest decade (1982-1992) of the 20th century (i.e. climate extreme event). In order to demonstrate the genericity and multiple insights gained by our method, we go further by evaluating the approach on other impact (e.g. crop data, fires, water storage) and climate (precipitation and temperature) observations, to provide similar results on different variables, extract relationships among them and identify what constitutes a prominent discord in such cases. A further step will consist in evaluating our methodology on climate and impact historical simulations, to determine if prominent discords highlighted in observations can be captured in climate and impact models.
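To make the notion of a discord concrete, the following brute-force sketch computes a matrix profile for a fixed window length and returns the most anomalous subsequence. It does not implement the prominent-discord ranking described above; the function names and the exclusion zone of one full window length are assumptions, and practical work would use an optimized matrix profile implementation.

```python
import math


def znorm(window):
    """Z-normalize a subsequence; constant windows map to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]


def matrix_profile(series, m):
    """For each length-m subsequence, the z-normalized Euclidean distance
    to its nearest non-overlapping neighbour (brute force, O(n^2 * m))."""
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    profile = []
    for i, a in enumerate(windows):
        best = math.inf
        for j, b in enumerate(windows):
            if abs(i - j) >= m:  # exclusion zone: skip overlapping windows
                best = min(best, math.dist(a, b))
        profile.append(best)
    return profile


def top_discord(series, m):
    """Start index of the subsequence with the largest profile value."""
    profile = matrix_profile(series, m)
    return max(range(len(profile)), key=profile.__getitem__)


# Periodic toy series with a single perturbed value at index 18.
runoff = [0.0, 1.0, 0.0, -1.0] * 8
runoff[18] = 1.0
discord_start = top_discord(runoff, 4)
```

Repeating the search over several window lengths yields the variable-length discords among which subsumption relations can then be identified.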

How to cite: El Khansa, H., Gervet, C., and Brouillet, A.: Prominent discords in climate data through matrix profile techniques: detecting emerging long term pattern changes and anomalous events, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-9250, https://doi.org/10.5194/egusphere-egu22-9250, 2022.

15:25–15:30
|
EGU22-11216
|
ECS
|
Virtual presentation
Noémie Planat and Mathilde Jutras

Lagrangian studies are a widely used and powerful way to analyse and interpret phenomena in oceanography and atmospheric sciences. Such studies can be based on datasets consisting either of real trajectories (e.g. oceanic drifters or floats) or of virtual trajectories computed from model or observation-derived velocities. Such data can help investigate pathways of water masses, pollutants or storms, or identify important convection areas, to name a few. As many of these analyses are based on large volumes of data that can be challenging to examine, machine learning can provide an efficient and automated way to classify information or detect patterns.

Here, we present an application of unsupervised clustering to the identification of the main pathways of the shelf-break branch of the Labrador Current, a critical component of the North Atlantic circulation. The current flows southward along the Labrador Shelf and splits in the region of the Grand Banks, either retroflecting north-eastward and feeding the subpolar basin of the North Atlantic Ocean (SPNA) or continuing westward along the shelf-break, feeding the Slope Sea and the east coast of North America. The proportion feeding each area impacts their salinity and convection, as well as their biogeochemistry, with consequences on marine life.

Our dataset is composed of millions of virtual particle trajectories computed from the water velocities of the GLORYS12 ocean reanalysis. We implement an unsupervised Machine Learning clustering algorithm on the shape of the trajectories. The algorithm is a kernelized k-means++ algorithm with a minimal number of hyperparameters, coupled to a kernelized Principal Component Analysis (PCA) feature reduction. We will present the pre-processing of the data, as well as canonical and physics-based methods for choosing the hyperparameters.

The algorithm identifies six main pathways of the Labrador Current. Applying the resulting classification method to 25 years of ocean reanalysis, we quantify the relative importance of the six pathways in time and construct a retroflection index that is used to study the drivers of the retroflection variability. This study highlights the potential of such a simple clustering method for Lagrangian trajectory analysis in oceanography or in other climate applications.

How to cite: Planat, N. and Jutras, M.: Unsupervised clustering of Lagrangian trajectories in the Labrador Current, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-11216, https://doi.org/10.5194/egusphere-egu22-11216, 2022.

15:30–15:35
|
EGU22-3009
|
ECS
|
Presentation form not yet defined
Amirhossein Hassani, Núria Castell, and Philipp Schneider

Mapping the spatio-temporal distribution of near-surface urban air temperature is crucial to our understanding of climate-sensitive epidemiology, indoor-outdoor thermal comfort, urban biodiversity, and interactive impacts of climate change and urbanity. Urban-scale decision-making in the face of future climatic uncertainties requires detailed information on near-surface air temperature at high spatio-temporal resolutions. However, such fine resolutions cannot currently be achieved by traditional observation networks, or even by regional or global climate models (Hamdi et al. 2020). Given the complexity of the processes affecting air temperature from the urban scale to the regional scale, here we apply Machine Learning (ML) algorithms, in particular the XGBoost gradient boosting method, to build predictive models of near-surface air temperature (Ta at 2-meter height). These predictive models establish data-driven relations between crowd-sourced measured Ta (data produced by citizens’ sensors) and a set of spatial and spatio-temporal predictors, primarily derived from Earth Observation satellite data including MODIS Aqua/Landsat 8 Land Surface Temperature (LST), MODIS Terra vegetation indices, and the Sentinel-2 water vapour product. We use our models to predict sub-daily (at MODIS Aqua satellite passing times) variation in urban-scale Ta in the city of Warsaw, Poland, at a spatial resolution of 1 km for the months July-September and the years 2016 to 2021. A 10-fold cross-validation of the developed models shows a root mean square error between 0.97 and 1.02 °C and a coefficient of determination between 0.96 and 0.98, which are satisfactory according to the literature (Taheri-Shahraiyni and Sodoudi 2017). The resulting maps allow us to identify regions of Warsaw that are vulnerable to heat stress.
The strength of the method used here is that it can be easily replicated in other EU cities to achieve high resolution maps due to the accessibility and open-sourced nature of the training and predictor data. Contingent on data availability, the predictive framework developed also can be used for monitoring and downscaling of other urban governing climatic parameters such as relative humidity in the context of future climate uncertainties.
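The two scores quoted above follow from standard definitions. The sketch below shows how root mean square error and the coefficient of determination can be computed, together with a simple contiguous 10-fold split. The helper names are hypothetical, and the authors' actual fold assignment (for example shuffled or grouped by sensor) is not stated in the abstract.

```python
import math


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy air temperatures (deg C) and predictions for one held-out fold.
ta_observed = [21.0, 23.5, 25.0, 19.5]
ta_predicted = [21.5, 23.0, 24.0, 20.0]
print(rmse(ta_observed, ta_predicted), r_squared(ta_observed, ta_predicted))
```

Reporting the range of both scores across folds, as done above, gives a sense of how stable the model is with respect to the choice of held-out data.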

Hamdi, R., H. Kusaka, Q.-V. Doan, P. Cai, H. He, G. Luo, W. Kuang, S. Caluwaerts, F. Duchêne, and B. Van Schaeybroek (2020). "The state-of-the-art of urban climate change modeling and observations." Earth Systems and Environment: 1-16.

Taheri-Shahraiyni, H. and S. Sodoudi (2017). "High-resolution air temperature mapping in urban areas: A review on different modelling techniques." Thermal Science 21(6 Part A): 2267-2286.

How to cite: Hassani, A., Castell, N., and Schneider, P.: Application of Machine Learning for spatio-temporal mapping of the air temperature in Warsaw, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-3009, https://doi.org/10.5194/egusphere-egu22-3009, 2022.

15:35–15:40
|
EGU22-4853
|
ECS
|
On-site presentation
Sebastiaan Jamaer, Jérôme Neirynck, and Nicole van Lipzig

Recent studies have shown that the increasing sizes of offshore wind farms can cause a reduced energy production through mesoscale interactions with the atmosphere. Therefore, accurate nowcasting of the energy yields of large offshore wind farms depends on accurate predictions of the large synoptic weather systems as well as accurate predictions of the smaller mesoscale weather systems. In general, global or regional forecasting models are very well suited to predict synoptic-scale weather systems. However, satellite or radar data can support the nowcasting of shorter, smaller-scale systems.

In this work, a first step towards nowcasting of the mesoscale wind using satellite images has been taken, namely the coupling of the mesoscale wind component to cloud properties that are available from satellite images using a deep learning framework. To achieve this, a high-resolution regional atmospheric model (COSMO-CLM) was used to generate one year of high-resolution cloud and hub-height wind data. From this wind data the mesoscale component was filtered out and used as target images for the deep learning model. The input images of the model were several cloud-related fields from the atmospheric model. The model itself was a Deep Convolutional Neural Network (a U-Net) which was trained to minimize the mean squared error.

This analysis indicates that cloud information can be used to extract information about the mesoscale weather systems and could be used for nowcasting by using the trained U-Net as a basis for a temporal deep learning model. However, future validation with real-world data is still needed to determine the added value of such an approach.

How to cite: Jamaer, S., Neirynck, J., and van Lipzig, N.: Can cloud properties provide information on surface wind variations using deep learning?, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-4853, https://doi.org/10.5194/egusphere-egu22-4853, 2022.

15:40–15:45
|
EGU22-5058
|
ECS
|
On-site presentation
Dwaipayan Chatterjee, Hartwig Deneke, and Susanne Crewell

With ever-increasing resolution, geostationary satellites are able to reveal the complex structure and organization of clouds. How cloud systems organize is important for the local climate and strongly connects to the Earth's response to warming through cloud system feedback.

Motivated by recent developments in computer vision for pattern analysis of uncurated images, our work aims to understand the organization of cloud systems based on high-resolution cloud optical depth images. We are exploiting the self-learning capability of a deep neural network to classify satellite images into different subgroups based on the distribution pattern of the cloud systems.

Unlike most studies, our neural network is trained over the central European domain, which is characterized by strong land surface type and topography variations. The satellite data is post-processed and retrieved at a higher spatio-temporal resolution (2 km, 5 min), enhanced by 66% compared to the current standard, equivalent to the future Meteosat third-generation satellite, which will be launched soon.

We show how recent advances in deep learning networks are used to understand clouds' physical properties in temporal and spatial scales. In a purely data-driven approach, we avoid the noise and bias obtained from human labeling, and with proper scalable techniques, it takes 0.86 ms and 2.13 ms to label an image at two different spatial configurations. We demonstrate explainable artificial intelligence (XAI), which helps gain trust for the neural network's performance.

To generalize the results, a thorough quantitative evaluation is carried out on two spatial domains and two pixel configurations (128x128, 64x64). We examine the uncertainty associated with distinct machine-detected cloud-pattern categories. For this, the learned features of the satellite images are extracted from the trained neural network and fed to an independent hierarchical agglomerative clustering algorithm. The work thereby also explores the uncertainties associated with the automatically detected patterns and how they vary with different cloud classification types.

How to cite: Chatterjee, D., Deneke, H., and Crewell, S.: Can satellite images provide supervision for cloud systems characterization?, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5058, https://doi.org/10.5194/egusphere-egu22-5058, 2022.

15:45–15:50
|
EGU22-6093
|
ECS
|
On-site presentation
Jessenia Gonzalez, Odran Sourdeval, Gustau Camps-Valls, and Johannes Quaas

The Earth's radiation budget may be altered by changes in atmospheric composition or land use. This is called radiative forcing. Among the human-generated influences in radiative forcing, aerosol-cloud interactions are the least understood. A way to quantify a key uncertainty in this regard, the adjustment of cloud liquid water path (LWP), is by the ratio (sensitivity) of LWP to changes in cloud droplet number concentration (Nd). A key problem in quantifying this sensitivity from large-scale observations is that these two quantities are not retrieved by operational satellite products and are subject to large uncertainties. 

In this work, we use machine learning techniques to show that inferring LWP and Nd directly from satellite observation data may yield a better understanding of this relationship without using retrievals, which may lead to large and systematic uncertainties. In particular, we use supervised learning on the basis of available high-resolution ICON-LEM (ICOsahedral Non-hydrostatic Large Eddy Model) simulations from the HD(CP)² project (High Definition Clouds and Precipitation for advancing Climate Prediction) and forward-simulated radiances obtained from the radiative transfer modeling (RTTOV, Radiative Transfer for TOVS) which uses MODIS (Moderate Resolution Imaging Spectroradiometer) data as a reference. Usually, only two channels from the reflectance of MODIS can be used to estimate the LWP and Nd. However, having access to 36 bands allows us to exploit data and find other patterns to get these parameters directly from the observation space rather than from the retrievals. A machine learning model is used to create an emulator which approximates the Radiative Transfer Model, and another machine learning model to directly predict the sensitivity of LWP - Nd from the satellite observation data.
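One common way to express the LWP-Nd sensitivity is as the slope of ln(LWP) against ln(Nd). The sketch below estimates that slope by ordinary least squares. The logarithmic form and the function name are assumptions here, since the abstract only describes the sensitivity as a ratio, and the numbers are synthetic.

```python
import math


def log_log_sensitivity(nd, lwp):
    """Least-squares slope of ln(LWP) versus ln(Nd).

    Assumes the sensitivity is defined as d ln(LWP) / d ln(Nd); both inputs
    must be strictly positive.
    """
    x = [math.log(v) for v in nd]
    y = [math.log(v) for v in lwp]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


# Synthetic example: LWP proportional to Nd ** -0.3.
droplet_number = [20.0, 50.0, 100.0, 200.0, 400.0]  # cm^-3
liquid_water_path = [300.0 * n ** -0.3 for n in droplet_number]  # g m^-2
sensitivity = log_log_sensitivity(droplet_number, liquid_water_path)
```

With real data the same estimator would be applied to the predicted LWP and Nd fields, and the result compared with the sensitivity predicted directly from the radiances.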

How to cite: Gonzalez, J., Sourdeval, O., Camps-Valls, G., and Quaas, J.: Machine learning to quantify cloud responses to aerosols from satellite data, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-6093, https://doi.org/10.5194/egusphere-egu22-6093, 2022.

15:50–15:55
|
EGU22-6466
|
ECS
|
Presentation form not yet defined
Nataliya Tkachenko and Laura Garcia Velez

Microclimate is a relatively recent concept in atmospheric sciences, which started drawing the attention of engineers and climatologists after the proliferation of open thermal (infrared, middle- and near-infrared) remote sensing instruments and high-resolution emissivity datasets. Although rarely mentioned in the context of reversing global climate change, efficient management of microclimates can nevertheless be considered as a possible solution. Their function is bidirectional: on the one hand, they can act as ‘buffers’ by smoothing out the effects of the already altered global climate on people and ecosystems, whilst on the other hand they act as structural contributors to perturbations in the higher layers of the atmosphere.

In the most abstract terms, microclimates tend to manifest themselves via land surface temperature conditions, which in turn are highly sensitive to the underlying land cover and use decisions. Forests are considered as the most efficient terrestrial carbon sinks and climate regulators, and various forms, configurations and continuity of logging can substantially alter the patterns of local temperature fluxes, precipitation and ecosystems. In this study we propose a novel heteroskedastic machine learning method, which can attribute localised forest loss patches due to industrial mining activity and estimate the resulting change in dynamics of the surrounding microclimate(s). 

How to cite: Tkachenko, N. and Garcia Velez, L.: Global attribution of microclimate dynamics to industrial deforestation sites using thermal remote sensing and machine learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-6466, https://doi.org/10.5194/egusphere-egu22-6466, 2022.

15:55–16:00
|
EGU22-676
|
ECS
|
Virtual presentation
Arndt Kaps, Axel Lauer, Gustau Camps-Valls, Pierre Gentine, Luis Gómez-Chova, and Veronika Eyring

Clouds play a key role in weather and climate but are quite challenging to simulate with global climate models as the relevant physics include non-linear processes on scales covering several orders of magnitude in both the temporal and spatial dimensions. The numerical representation of clouds in global climate models therefore requires a high degree of parameterization, which makes a careful evaluation a prerequisite not only for assessing the skill in reproducing observed climate but also for building confidence in projections of future climate change. Current methods to achieve this usually involve the comparison of multiple large-scale physical properties in the model output to observational data. Here, we introduce a two-stage data-driven machine learning framework for process-oriented evaluation of clouds in climate models based directly on widely known cloud types. The first step relies on CloudSat satellite data to assign cloud labels in line with cloud types defined by the World Meteorological Organization (WMO) to MODIS pixels using deep neural networks. Since the method is supervised and trained on labels provided by CloudSat, the predicted cloud types remain objective and do not require a posteriori labeling. The second step consists of a regression algorithm that predicts fractional cloud types from retrieved cloud physical variables. This step aims to ensure that the method can be used with any data set providing physical variables comparable to MODIS. In particular, we use a Random Forest regression that acts as a transfer model to evaluate the spatially relatively coarse output of climate models and allows the use of varying input features. As a proof of concept, the method is applied to coarse grained ESA Cloud CCI data. The predicted cloud type distributions are physically consistent and show the expected features of the different cloud types. 
This demonstrates how advanced observational products can be used with this method to obtain cloud type distributions from coarse data, allowing for a process-based evaluation of clouds in climate models.

How to cite: Kaps, A., Lauer, A., Camps-Valls, G., Gentine, P., Gómez-Chova, L., and Eyring, V.: A two-stage machine learning framework using global satellite data of cloud classes for process-oriented model evaluation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-676, https://doi.org/10.5194/egusphere-egu22-676, 2022.

16:00–16:05
|
EGU22-6998
|
ECS
|
On-site presentation
Sarah Brüning, Holger Tost, and Stefan Niebler

Clouds and their radiative feedback mechanisms are of vital importance for the atmospheric cycle of the Earth regarding global weather today as well as climate changes in the future. Climate models and simulations are sensitive to the vertical distribution of clouds, emphasizing the need for broadly accessible fine-resolution data. Although passive satellite sensors provide continuous cloud monitoring on a global scale, they lack the ability to infer physical properties below the cloud top. Active instruments like radar are particularly suitable for this task but lack an adequate spatio-temporal resolution. Here, recent advances in deep learning models open up the possibility to transfer spatial information from a 2D towards a 3D perspective on a large scale.

Using an example period in 2017, this study aims to explore the feasibility and potential of neural networks to reconstruct the vertical distribution of volumetric radar data along a cloud’s column. For this purpose, the network has been tested on the Full Disk domain of a geostationary satellite with high spatio-temporal resolution data. Using raw satellite channels, spectral indices, and topographic data, we infer the 3D radar reflectivity from these physical predictors. First results demonstrate the network’s capability to reconstruct the cloud vertical distribution. Finally, the ultimate goal of interpolating the cloud column for the whole domain is supported by a considerably high accuracy in predicting the radar reflectivity. The resulting product can open up the opportunity to enhance climate models by an increased spatio-temporal resolution of 3D cloud structures.

How to cite: Brüning, S., Tost, H., and Niebler, S.: Inferring the Cloud Vertical Distribution from Geostationary Satellite Data, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-6998, https://doi.org/10.5194/egusphere-egu22-6998, 2022.

16:05–16:10
|
EGU22-7355
|
ECS
|
On-site presentation
Julien Lenhardt, Johannes Quaas, and Dino Sejdinovic

Cloud base height (CBH) is an important geometric parameter of a cloud and shapes its radiative properties. The CBH is also of practical interest to the aviation community regarding pilot visibility and aircraft icing hazards. While the cloud-top height has been successfully derived from passive imaging radiometers on satellites in recent years, the derivation of the CBH from these same retrievals remains a more difficult challenge.

In our study we combine surface observations and passive satellite remote-sensing retrievals to create a database of CBH labels and cloud properties to ultimately train a machine learning model predicting CBH. The labels come from the global marine meteorological observations dataset (UK Met Office, 2006), which consists of near-global synoptic observations made at sea. This data set provides information about CBH, cloud type, cloud cover and other meteorological surface quantities, with CBH being the main interest here. The features on which the machine learning model is trained consist of different cloud-top and cloud optical properties (Level 2 products MOD06/MYD06 from the MODIS sensor) extracted on a 127 km x 127 km grid around the synoptic observation point. To study the large diversity in cloud scenes, an auto-encoder architecture is chosen. The regression task is then carried out in the modelled latent space which is output by the encoder part of the model. To account for the spatial relationships in our input data, the model architecture is based on Convolutional Neural Networks. We define a study domain in the Atlantic Ocean, around the equator. The combination of information from below and above the cloud could allow us to build a robust model to predict CBH and then extend predictions to regions where surface measurements are not available.

How to cite: Lenhardt, J., Quaas, J., and Sejdinovic, D.: Combining cloud properties and synoptic observations to predict cloud base height using Machine Learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-7355, https://doi.org/10.5194/egusphere-egu22-7355, 2022.

16:10–16:15
|
EGU22-11451
|
ECS
|
Presentation form not yet defined
Marie Bouillon, Sarah Safieddine, Simon Whitburn, Lieven Clarisse, Filipe Aires, Victor Pellet, Olivier Lezeaux, Noëlle A. Scott, Marie Doutriaux-Boucher, and Cathy Clerbaux

The IASI remote sensor measures Earth’s thermal infrared radiation over 8461 channels between 645 and 2760 cm-1. Atmospheric temperatures at different altitudes can be retrieved from the radiances measured in the CO2 absorption bands (645-800 cm-1 and 2250-2400 cm-1) by selecting the channels that are the most sensitive to the temperature profile. The three IASI instruments on board of the Metop suite of satellites launched in 2006, 2012 and 2018, will provide a long time series for temperature, adequate for studying the long term evolution of atmospheric temperature. However, over the past 14 years, EUMETSAT, who processes radiances and computes atmospheric temperatures, has carried out several updates on the processing algorithms for both radiances and temperatures, leading to non-homogeneous time series and thus large difficulties in the computation of trends for temperature and atmospheric composition.

In 2018, EUMETSAT reprocessed the radiances with the most recent version of the algorithm, and a homogeneous radiance dataset is now available. In this study, we retrieve a new temperature record from the homogeneous IASI radiances using an artificial neural network (ANN). We train the ANN with IASI radiances as input and temperatures from ERA5, the European Centre for Medium-Range Weather Forecasts reanalysis, as output. We validate the results using ERA5 and in situ radiosonde temperatures from the ARSA database. Between 750 and 7 hPa, where IASI has most of its sensitivity, the three datasets agree very well. This work suggests that an ANN can be a simple yet powerful tool to retrieve IASI temperatures at different altitudes in the upper troposphere and in the stratosphere, allowing us to construct a homogeneous and consistent temperature data record.

We use this new dataset to study extreme events such as sudden stratospheric warmings, and to compute trends over the IASI coverage period (2008–2020). We find that over these thirteen years the troposphere has generally warmed, most strongly at the poles and at mid-latitudes (0.5 K/decade at mid-latitudes, 1 K/decade at the North Pole). The stratosphere is cooling in the global average, except at the South Pole, as a result of the ozone layer recovery and a sudden stratospheric warming in 2019. The cooling is most pronounced in the equatorial upper stratosphere (-1 K/decade).
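
As a rough illustration of how a trend in K/decade can be estimated from a temperature record, the sketch below fits an ordinary least-squares line to annual means. The abstract does not state which regression method was used, so this is only one plausible choice; the function name `trend_per_decade` and the synthetic series are assumptions made for illustration and are not IASI data.

```python
import random


def trend_per_decade(years, temps):
    """Ordinary least-squares slope of temps against years, in K per decade."""
    n = len(years)
    mean_y = sum(years) / n
    mean_t = sum(temps) / n
    cov = sum((y - mean_y) * (t - mean_t) for y, t in zip(years, temps))
    var = sum((y - mean_y) ** 2 for y in years)
    return 10.0 * cov / var  # slope is in K/year; multiply by 10 for K/decade


# Synthetic annual-mean anomalies (assumption, not IASI data):
# a prescribed 0.5 K/decade warming plus small Gaussian noise.
rng = random.Random(0)
years = list(range(2008, 2021))  # 2008 to 2020 inclusive, thirteen years
temps = [0.05 * (y - 2008) + rng.gauss(0.0, 0.02) for y in years]

slope = trend_per_decade(years, temps)
print(f"estimated trend: {slope:.2f} K/decade")
```

With only thirteen annual values the estimate is sensitive to individual years, which is one reason a single event such as the 2019 sudden stratospheric warming can influence a regional trend.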

How to cite: Bouillon, M., Safieddine, S., Whitburn, S., Clarisse, L., Aires, F., Pellet, V., Lezeaux, O., Scott, N. A., Doutriaux-Boucher, M., and Clerbaux, C.: Time evolution of temperature profiles retrieved from 13 years of IASI data using an artificial neural network, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-11451, https://doi.org/10.5194/egusphere-egu22-11451, 2022.

16:15–16:20
|
EGU22-9281
|
ECS
|
Virtual presentation
Eike Bolmer, Adili Abulaitijiang, Jürgen Kusche, Luciana Fenoglio-Marc, Sophie Stolzenberger, and Ribana Roscher

The automatic detection and tracking of mesoscale ocean eddies, the ‘weather of the ocean’, is a well-known task in oceanography. These eddies have horizontal scales from 10 km up to 100 km and above. They transport water mass, heat, nutrients, and carbon and have been identified as hot spots of biological activity. Monitoring eddies is therefore of interest to, among others, marine biologists and fisheries.
Recent advances in satellite-based observation for oceanography, such as sea surface height (SSH) and sea surface temperature (SST), result in a large supply of different data products in which eddies are visible. In radar altimetry, observations are acquired with repeat cycles between 10 and 35 days and a cross-track spacing of a few tens to a few hundred kilometres. Ocean eddies are therefore clearly visible but typically covered by only one ground track. In addition, due to their motion, eddies are difficult to reconstruct, which makes creating detailed maps of the ocean with a high temporal resolution a challenge. In general, they are considered a perturbation, and their influence on altimetry data is difficult to determine, which is especially limiting for the determination of an accurate time-averaged dynamic topography of the ocean.
Because of their spatio-temporal dynamics, identifying and tracking eddies is challenging. A number of methods have been developed to identify and track eddies in gridded maps of sea surface height derived from multi-mission data sets. However, these procedures have shortcomings, since the gridding process removes information that is valuable for achieving more accurate results.
Therefore, in the project EDDY, carried out at the University of Bonn, we intend to use ground track data from satellite altimetry and, as a long-term goal, additional remote sensing data such as SST and optical imagery, as well as statistical information from model outputs. The combination of these data will serve as the basis for a multi-modal deep learning algorithm. Specifically, we will use transformers, a deep neural network architecture that originated in the field of Natural Language Processing (NLP) and has become popular in recent years in computer vision. This method shows promising results in terms of understanding temporal and spatial information, which is essential for detecting and tracking highly dynamic eddies.
In this presentation, we introduce the deep neural network used in the EDDY project and show results based on gridded data sets for the Gulf Stream area for 2017, as well as first results of single-track eddy identification in the region.

How to cite: Bolmer, E., Abulaitijiang, A., Kusche, J., Fenoglio-Marc, L., Stolzenberger, S., and Roscher, R.: Machine learning-based identification and classification of ocean eddies, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-9281, https://doi.org/10.5194/egusphere-egu22-9281, 2022.

16:20–16:25
|
EGU22-2988
|
ECS
|
Virtual presentation
James Fulton and Ben Clarke

Global circulation models (GCMs) form the basis of a vast portion of earth system research and inform our climate policy. However, our climate system is complex and connected across scales. To simulate it, we must use parameterisations. These parameterisations, which are present in all models, can have a detectable influence on the GCM outputs.

GCMs are improving, but we need to use their current output to optimally estimate the risks of extreme weather. Therefore, we must debias GCM outputs with respect to observations. Current debiasing methods cannot correct both spatial correlations and cross-variable correlations. As a result, they can produce physically implausible weather events, even when the single-location, single-variable distributions match the observations. This limitation matters particularly for extreme event research. Compound events like heat and drought, which drastically increase wildfire risk, and spatially co-occurring events like multiple bread-basket failures, are not well corrected by these current methods.
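
The limitation described above can be illustrated with a small sketch. It assumes empirical quantile mapping applied to each variable separately as the debiasing method (a common choice; the abstract does not name a specific method) and uses synthetic data rather than GCM or reanalysis output. The names `quantile_map` and `pearson`, and the two toy variables, are hypothetical.

```python
import bisect
import random


def quantile_map(model, obs):
    """Empirical quantile mapping: replace each model value by the
    observed value at the same empirical quantile."""
    model_sorted = sorted(model)
    obs_sorted = sorted(obs)
    n_m, n_o = len(model_sorted), len(obs_sorted)
    out = []
    for x in model:
        rank = bisect.bisect_left(model_sorted, x)  # position within the model sample
        q = rank / (n_m - 1)                        # empirical quantile in [0, 1]
        out.append(obs_sorted[round(q * (n_o - 1))])
    return out


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


rng = random.Random(42)
n = 2000

# Synthetic "observations": heat and dryness anomalies, strongly correlated.
obs_t, obs_d = [], []
for _ in range(n):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    obs_t.append(z1)
    obs_d.append(0.8 * z1 + 0.6 * z2)

# Synthetic "model": biased marginals and no cross-variable correlation.
mod_t = [2.0 + 1.5 * rng.gauss(0, 1) for _ in range(n)]
mod_d = [-1.0 + 0.5 * rng.gauss(0, 1) for _ in range(n)]

# Debias each variable on its own.
corr_t = quantile_map(mod_t, obs_t)
corr_d = quantile_map(mod_d, obs_d)

r_obs = pearson(obs_t, obs_d)
r_corr = pearson(corr_t, corr_d)
print(f"observed correlation:  {r_obs:.2f}")
print(f"corrected correlation: {r_corr:.2f}")
```

After the mapping, each corrected variable has the observed single-variable distribution, but the pair keeps the model's joint behaviour, so compound hot-and-dry events remain under-represented in this toy setting.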

We propose using unsupervised image-to-image translation networks to perform bias correction of GCMs. These neural network architectures are used to translate (perform bias correction) between different image domains. For example, they have been used to translate computer-generated city scenes into real-world photos, which requires spatial and cross-variable correlations to be translated. Crucially, these networks learn to translate between image domains without requiring corresponding pairs of images. Such pairs cannot be generated between climate simulations and observations due to the inherent chaos of weather.

In this work, we use these networks to bias-correct historical recreation simulations from the HadGEM3-A-N216 atmosphere-only GCM with respect to the ERA5 reanalysis dataset. This GCM has a known bias in simulating the South Asian monsoon, so we focus on this region. We show the ability of neural networks to correct this bias, and show how combining the neural network with classical techniques produces a better bias correction than either method alone.

How to cite: Fulton, J. and Clarke, B.: Correcting biases in climate simulations using unsupervised image-to-image-translation networks, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2988, https://doi.org/10.5194/egusphere-egu22-2988, 2022.

16:25–16:40