Spatio-temporal data science: advances in computational geosciences and innovative evaluation tools for weather and climate science



Most of the processes studied by geoscientists are characterized by variations in both space and time. These spatio-temporal phenomena have traditionally been investigated using linear statistical approaches, as in the case of physically-based models and geostatistical models. Additionally, the rising attention toward machine learning, as well as the rapid growth of computational resources, opens new horizons in understanding, modeling, and forecasting complex spatio-temporal systems through the use of stochastic non-linear models.

These issues are particularly relevant in the field of performance evaluation of Earth Systems Science Prediction (ESSP) systems. A central issue in this domain is the representativeness of the observational data used for evaluation, which often do not represent the physical structures being predicted. While many large spatial and temporal observational datasets can help provide this information, adequate tools are required to integrate these large datasets and provide meaningful physical insights into the strengths and weaknesses of predicted fields. Further challenges arise from the large storage volumes needed for model simulations, large spatio-temporal datasets, and verification statistics, which are difficult to maintain.

This session aims at exploring the new challenges and opportunities opened by the spread of data-driven statistical learning approaches in Earth and Soil Sciences. We invite cutting-edge contributions related to methods of spatio-temporal geostatistics or data mining on topics that include, but are not limited to:
- meaningful and informative model evaluation frameworks and platforms for ESSP;
- advances in spatio-temporal modeling using geostatistics and machine learning;
- uncertainty quantification and representation;
- innovative techniques of knowledge extraction based on clustering, pattern recognition and, more generally, data mining.

The main applications will be closely related to research in Earth system science and quantitative geography. A non-exhaustive list of possible applications includes:
- weather and climate (e.g. numerical weather prediction, hydrologic prediction, climate prediction and projection);
- natural and anthropogenic hazards (e.g. floods; landslides; earthquakes; wildfires; air pollution);
- interaction between geosphere and anthroposphere (e.g. land degradation; urban sprawl);
- socio-economic sciences (e.g. census data; transport; commuter traffic).

This is a merged session of “Spatio-temporal data science: theoretical advances and applications in computational geosciences” and “Innovative Evaluation Frameworks and Platforms for Weather and Climate Research”.

Co-organized by GI2/NP4
Convener: Federico Amato | Co-conveners: Jerome Servonnat, Daniela Castro-Camilo, Fabian Guignard, Christopher Kadow, Paul Kucera, Luigi Lombardo, Marj Tonini
vPICO presentations
| Mon, 26 Apr, 09:00–10:30 (CEST)

vPICO presentations: Mon, 26 Apr

Chairpersons: Federico Amato, Jerome Servonnat
Part I - Spatio-temporal data science: theoretical advances and applications in computational geosciences
Aoibheann Brady, Jonathan Rougier, Yann Ziegler, Bramha Dutt Vishwakarma, Sam Royston, Stephen Chuter, Richard Westaway, and Jonathan Bamber

Modelling spatio-temporal data on a large scale presents a number of obstacles for statisticians and environmental scientists. Issues such as computational complexity, combining point and areal data, separation of sources into their component processes, and the handling of both large volumes of data in some areas and sparse data in others must be considered. We discuss methods to overcome such challenges within a Bayesian hierarchical modelling framework using INLA.

In particular, we illustrate the approach using the example of source-separation of geophysical signals both on a continental and global scale. In such a setting, data tends to be available both at a local and areal level. We propose a novel approach for integrating such sources together using the INLA-SPDE method, which is normally reserved for point-level data. Additionally, the geophysical processes involved are both spatial (time-invariant) and spatio-temporal in nature. Separation of such processes into physically sensible components requires careful modelling and consideration of priors (such as physical model outputs where data is sparse), which will be discussed. We also consider methods to overcome the computational costs of modelling on such a large scale, from efficient mesh design, to thinning/aggregating of data, to considering alternative approaches for inference. This holistic approach to modelling of large-scale data ensures that spatial and spatio-temporal processes can be sensibly separated into their component parts, without being prohibitively expensive to model.
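One of the cost-saving steps mentioned above, aggregating point data onto a coarser representation before inference, can be sketched in a few lines. This is a hypothetical Python illustration of the general idea, not the authors' INLA-based implementation; the function name and cell-mean aggregation rule are assumptions for the example.

```python
from collections import defaultdict

def aggregate_to_grid(points, cell_size):
    """Aggregate (x, y, value) point observations onto a regular grid
    by averaging all values that fall in the same cell; returns a dict
    mapping cell indices to the cell mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in points:
        cell = (int(x // cell_size), int(y // cell_size))
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Four observations; the first two share a cell at cell_size=1.0
obs = [(0.2, 0.3, 1.0), (0.8, 0.6, 3.0), (1.5, 0.5, 5.0), (2.1, 2.2, 7.0)]
grid = aggregate_to_grid(obs, cell_size=1.0)
```

Aggregating in this way trades spatial detail for a dramatic reduction in the number of observations the inference step must handle.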

How to cite: Brady, A., Rougier, J., Ziegler, Y., Vishwakarma, B. D., Royston, S., Chuter, S., Westaway, R., and Bamber, J.: Overcoming challenges in spatio-temporal modelling of large-scale (global) data, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-13357,, 2021.

Zhuojing Tian, Zhenchun Huang, Yinong Zhang, Yanwei Zhao, En Fu, and Shuying Wang

Abstract: As the data and computation demands of scientific workflow applications continue to grow, distributed and heterogeneous computing infrastructures such as inter-cloud environments provide these applications with a large pool of computing resources. In the inter-cloud environment, how to effectively map tasks to cloud service providers while meeting QoS (quality of service) constraints derived from user requirements has become an important research direction. Remote sensing applications need to process terabytes of data in each run; frequent, large data transfers across clouds create severe performance bottlenecks for execution and strongly affect QoS measures such as makespan and cost. We use a data transformation graph (DTG) to study the data transfer process of a global drought detection application and derive an optimization strategy from the characteristics of the application and the environment. Based on this, we propose an inter-cloud workflow scheduling method built on a genetic algorithm that minimizes makespan and cost while satisfying the user's QoS constraints. The experimental results show that, compared with a standard genetic algorithm, a random algorithm, and a round-robin algorithm, the optimized genetic algorithm greatly improves the scheduling performance of data- and computation-intensive scientific workflows such as remote sensing applications and reduces the impact of performance bottlenecks.

Keywords: scientific workflow scheduling; inter-cloud environment; remote sensing application; data transformation graph
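The core of a genetic-algorithm scheduler of this kind can be sketched compactly: a chromosome assigns each task to a provider, and fitness combines makespan with total cost. This is a minimal, hypothetical sketch; the provider names, runtimes, costs, and GA parameters below are invented for illustration and are not the paper's algorithm, which also accounts for DTG-derived data-transfer costs.

```python
import random

# Hypothetical per-provider task runtime and cost (not from the paper)
RUNTIME = {"cloudA": 4.0, "cloudB": 2.0, "cloudC": 3.0}
COST    = {"cloudA": 1.0, "cloudB": 3.0, "cloudC": 2.0}
PROVIDERS = list(RUNTIME)
N_TASKS = 12

def fitness(assignment):
    """Lower is better: makespan (max provider load) plus total cost."""
    load = {p: 0.0 for p in PROVIDERS}
    cost = 0.0
    for p in assignment:
        load[p] += RUNTIME[p]
        cost += COST[p]
    return max(load.values()) + cost

def evolve(pop_size=30, generations=60, seed=0):
    """Evolve task-to-provider assignments with elitist selection,
    one-point crossover, and random mutation."""
    rng = random.Random(seed)
    pop = [[rng.choice(PROVIDERS) for _ in range(N_TASKS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, N_TASKS)       # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:                # mutation
                child[rng.randrange(N_TASKS)] = rng.choice(PROVIDERS)
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()
```

In this toy setting any single-provider schedule scores 60.0, so the evolved mixed assignment illustrates how the GA balances makespan against cost.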

How to cite: Tian, Z., Huang, Z., Zhang, Y., Zhao, Y., Fu, E., and Wang, S.: Scientific workflow scheduling based on data transformation graph for remote sensing application, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-1255, 2021.

Nastasija Grujić, Sanja Brdar, Olivera Novović, Nikola Obrenović, Miro Govedarica, and Vladimir Crnojević

Understanding human dynamics is of crucial importance for managing human activities for sustainable development. According to the United Nations, 68% of people will live in cities by 2050. It is therefore important to understand human footprints in order to develop policies that will improve life in urban and suburban areas. Our study aims at detecting spatio-temporal activity patterns from mobile phone data provided by a telecom service provider. More precisely, we used the activity dataset, which contains the number of sent/received SMS messages and calls, as well as internet usage, per radio-base station in defined time-stamps. The case study focuses on the capital city of Serbia, Belgrade, which has nearly 2 million inhabitants, and covers the month of February 2020. We applied a biclustering (spectral co-clustering) algorithm to the telecom data to detect locations in the city that behave similarly in specific time windows. Biclustering is a data mining technique used for finding homogeneous submatrices among the rows and columns of a matrix, widely applied in text mining and gene expression analysis. Although there are no examples in the literature of its use on location-based data for urban applications, we saw potential in its ability to detect clusters, in a refined way, during specific periods of time that could not otherwise be detected with a global clustering approach. To prepare the data, we normalized each type of activity (SMS/call in/out and internet activity) and aggregated the total activity on each antenna per hour. We transformed the data into a matrix whose rows represent the antennas and whose columns represent the hours. The algorithm was applied to each day separately. The average number of discovered biclusters was 5, usually corresponding to regular activities, such as work, home, commuting, and free time, but also to the city's nightlife.
Our results confirmed that activity in urban spaces is a function of both space and time. They revealed different functionalities of the urban and suburban parts of the city. We observed patterned behavior across the analyzed days: the type of day dictated the spatio-temporal activities that occurred, and we distinguished different types of days, such as working days (Monday to Thursday), Fridays, weekends, and holidays. These findings show the promising potential of the biclustering algorithm, which could be used by policymakers to precisely detect activity clusters across space and time that correspond to specific functions of the city.
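The preprocessing step described above (per-type normalisation, then aggregation into an antenna × hour matrix) can be sketched as follows. This is a hypothetical illustration with min-max normalisation assumed (the abstract does not specify the normalisation); the actual biclustering would then be run on the resulting matrix, e.g. with scikit-learn's spectral co-clustering.

```python
from collections import defaultdict

def build_activity_matrix(records, antennas, hours):
    """records: (antenna, hour, activity_type, value) tuples.
    Min-max normalise each activity type across all records, then sum
    the normalised activities per (antenna, hour) cell.
    Returns a matrix with rows = antennas and columns = hours."""
    by_type = defaultdict(list)
    for _, _, kind, value in records:
        by_type[kind].append(value)
    lo = {k: min(v) for k, v in by_type.items()}
    hi = {k: max(v) for k, v in by_type.items()}
    cell = defaultdict(float)
    for antenna, hour, kind, value in records:
        span = hi[kind] - lo[kind]
        cell[(antenna, hour)] += (value - lo[kind]) / span if span else 0.0
    return [[cell[(a, h)] for h in hours] for a in antennas]

records = [("A1", 0, "sms", 10), ("A1", 0, "call", 4),
           ("A2", 1, "sms", 30), ("A2", 0, "call", 8)]
matrix = build_activity_matrix(records, ["A1", "A2"], [0, 1])
```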

How to cite: Grujić, N., Brdar, S., Novović, O., Obrenović, N., Govedarica, M., and Crnojević, V.: Biclustering for uncovering spatial-temporal patterns in telecom data, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-14423, 2021.

Lili Czirok, Lukács Kuslits, and Katalin Gribovszki

The SE Carpathians exhibit significant geodynamic activity due to an ongoing subduction process, of which the strong seismicity in the Vrancea zone is the most important indicator. The focal area of these seismic events is relatively small, around 80 × 100 km, and the distribution of their locations is quite dense.

The authors have carried out cluster analyses of focal mechanism solutions estimated from local and tele-seismic measurements, together with stress inversions, to support recent and previously published studies of this region. They applied different pre-existing clustering methods – e.g. HDBSCAN (hierarchical density-based clustering for applications with noise) and agglomerative hierarchical analysis – considering the geographical coordinates, focal depths and focal mechanism parameters of the seismic events. Moreover, they have attempted to develop a fully automated algorithm for classifying the earthquakes used in the estimations. This algorithm does not require the setting of hyper-parameters, so the effect of subjectivity can be reduced significantly and the running time decreased. In all cases, the resulting stress tensors are in close agreement with earlier published results.
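The agglomerative hierarchical analysis mentioned here can be illustrated by a minimal single-linkage sketch on (lon, lat, depth)-like triples. This is a hypothetical toy example, not the authors' code, which additionally folds in focal-mechanism parameters; the naive O(n³) loop is only suitable for small event sets.

```python
import math

def single_linkage(points, n_clusters):
    """Naive agglomerative clustering with single linkage: start from
    singletons and repeatedly merge the two closest clusters, where
    cluster distance is the minimum pairwise member distance."""
    clusters = [[p] for p in points]

    def dist(c1, c2):
        return min(math.dist(a, b) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Two tight groups of hypothetical hypocentres (x, y, depth)
events = [(0.0, 0.0, 5.0), (0.1, 0.0, 5.0), (3.0, 3.0, 30.0), (3.1, 3.0, 31.0)]
groups = single_linkage(events, 2)
```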

How to cite: Czirok, L., Kuslits, L., and Gribovszki, K.: Cluster analysis in the studying of stress relation in the Vrancea-zone, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9749, 2021.

Christina Pacher, Irene Schicker, Rosmarie DeWit, and Claudia Plant
Both clustering and outlier detection play an important role in meteorology. With clustering, large sets of data points, such as numerical weather prediction (NWP) model data or observation sites, are separated into groups based on the characteristics found in the data, grouping similar data points in a cluster. Clustering also enables the detection of outliers in the data. The resulting clusters are useful in many ways, such as atmospheric pattern recognition (e.g. clustering NWP ensemble predictions to estimate the likelihood of the predicted weather patterns), climate applications (grouping point observations for climate pattern recognition), forecasting (e.g. data pool enhancement using data from similar sites), as well as urban meteorology, air quality, renewable energy systems, and hydrological applications.
Typically, one does not know in advance how many clusters or groups are present in the data. However, for algorithms such as K-means one needs to define how many clusters one wants to have as an outcome. With the proposed novel algorithm AWT,  a modified combination of several well-known clustering algorithms, this is not needed. It chooses the number of clusters automatically based on a user-defined threshold parameter. Furthermore, the algorithm can be used for heterogeneous meteorological input data as well as data sets that exceed the available memory size.
Similar to the classical BIRCH algorithm, our method AWT works on a multi-resolution data structure, an Aggregated Wavelet Tree, that is suitable for representing multivariate time series. In contrast to BIRCH, the user does not need to specify the number of clusters K, which is difficult in our application. Instead, AWT relies on a single threshold parameter for clustering and outlier detection. This threshold corresponds to the highest resolution of the tree. Points that are not in any cluster with respect to the threshold are naturally flagged as outliers.
With the recent increase in the use of non-traditional data sources, such as private smart-home weather stations, in NWP models and other forecasting applications, outlier detection and clustering methods are useful for pre-processing and filtering these rather novel data sources. Especially in urban areas, changes in the surface energy balance caused by urbanization result in temperatures generally being higher in cities than in the surrounding areas. To capture the spatial features of this effect, data with high spatial resolution are necessary. Here, privately owned smart-home weather stations are useful, as often only a limited number of official observation sites exist. However, before these data can be used they need to be pre-processed.
In this work we apply our novel algorithm AWT to crowdsourced data from the city of Vienna. We demonstrate the skill of the algorithm in outlier detection and filtering as well as clustering the data and evaluate it against commonly used algorithms. Furthermore, we show how one could use the algorithm in renewable energy applications.
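The central idea, a single distance threshold that determines both cluster membership and outlier status, with the number of clusters emerging from the data rather than being fixed in advance, can be illustrated by a simple leader-style sketch. This is a hypothetical illustration of the threshold principle only; AWT itself operates on the aggregated wavelet tree, which is not reproduced here.

```python
import math

def threshold_cluster(series, threshold):
    """Assign each multivariate point to the first cluster centre within
    `threshold` (Euclidean distance); otherwise open a new cluster.
    Clusters that end up with a single member are flagged as outliers,
    so both the cluster count and the outliers follow from one parameter."""
    centres, members = [], []
    for point in series:
        for k, centre in enumerate(centres):
            if math.dist(point, centre) <= threshold:
                members[k].append(point)
                break
        else:
            centres.append(point)
            members.append([point])
    clusters = [m for m in members if len(m) > 1]
    outliers = [m[0] for m in members if len(m) == 1]
    return clusters, outliers

# Three consistent (temperature, humidity)-like readings plus one outlier
data = [(20.0, 55.0), (20.5, 54.5), (21.0, 55.2), (35.0, 90.0)]
clusters, outliers = threshold_cluster(data, threshold=2.0)
```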

How to cite: Pacher, C., Schicker, I., DeWit, R., and Plant, C.: AWT - Clustering using an Aggregated Wavelet Tree: A novel automatic unsupervised clustering and outlier detection algorithm for time series, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-12324, 2021.

Wenxuan Hu, Yvonne Scholz, Madhura Yeligeti, Lüder von Bremen, and Marion Schroedter-Homscheidt

Renewable energy sources such as wind energy play a crucial role in most climate change mitigation scenarios because of their ability to significantly reduce energy-related carbon emissions. In order to understand and design future energy systems, detailed modeling of renewable energy sources is important. To make energy system modelling possible at all variability scales of local weather conditions, renewable energy source information with high resolution in both space and time is required.

Nowadays, the renewable energy resource data most widely used in the energy modeling community are reanalysis data such as ERA5, COSMO REA6, and MERRA2. Taking wind speed as an example, reanalysis data can provide long-term, spatially resolved wind information at any desired height in a physically consistent way. However, their spatial resolution is coarse. In order to obtain fine-spatial-resolution wind speed data, this paper proposes a statistical downscaling method based on reanalysis data, observation data, and the local topography.

While most statistical wind downscaling studies have focused on obtaining site-specific data or downscaling probability density functions, this paper focuses on downscaling one-year hourly wind speed time series for Europe to 0.00833° × 0.00833° (approximately 1 km × 1 km) resolution. Various studies have shown that the local topography influences wind speed. The topographic structure in this study is described by two metrics: TPI, a topographic position index that compares the elevation of each cell to the mean elevation of the neighboring area, and Sx, a slope-based, direction-dependent parameter that describes the topography in the upwind direction. The observation data used in this study are MeteoSwiss measurements, which provide hourly wind speed time series at the station heights. For each weather station with observation data, biases described by the local terrain features are introduced to minimize the root mean square error (RMS) and Kolmogorov-Smirnov D (KSD) statistic between the corrected and the observed wind speed. These biases are then assigned to grid points with the same terrain types as the weather station, which enables downscaling of the wind speed for the whole of Europe.
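The TPI metric as described, the elevation of a cell minus the mean elevation of its neighborhood, can be computed with a few lines. This is a hypothetical sketch on a toy elevation grid; the neighborhood radius and grid layout are assumptions for the example.

```python
def tpi(elev, i, j, radius=1):
    """Topographic position index: elevation of cell (i, j) minus the
    mean elevation of the surrounding cells within `radius`
    (the centre cell itself is excluded). Positive values indicate
    ridges, negative values valleys."""
    rows, cols = len(elev), len(elev[0])
    neigh = [
        elev[r][c]
        for r in range(max(0, i - radius), min(rows, i + radius + 1))
        for c in range(max(0, j - radius), min(cols, j + radius + 1))
        if (r, c) != (i, j)
    ]
    return elev[i][j] - sum(neigh) / len(neigh)

# A cell standing 30 m above a flat 100 m neighborhood (a ridge)
dem = [[100, 100, 100],
       [100, 130, 100],
       [100, 100, 100]]
ridge = tpi(dem, 1, 1)
```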

The results show that this downscaling method can improve the RMS and KSD for both ERA5 and COSMO REA6, especially at mountain ridges, indicating that it can not only decrease the bias but also provide a better match to the observed wind speed distributions.

How to cite: Hu, W., Scholz, Y., Yeligeti, M., von Bremen, L., and Schroedter-Homscheidt, M.: Statistical downscaling of wind speed time series data based on topographic variables, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-12734, 2021.

Uldis Zandovskis, Bruce D. Malamud, and Davide Pigoli

Natural hazards are inherently spatio-temporal processes. Spatio-temporal clustering methodologies applied to natural hazard data can help distinguish clustering patterns that not only identify point-event-dense regions and time periods but also provide insight into the hazardous process. Here we review spatio-temporal clustering methodologies applicable to point-event datasets representative of natural hazards, and we evaluate their performance using both synthetic and real-life data. We first present a systematic overview of major spatio-temporal clustering methodologies used in the literature, which include clustering procedures that are (i) global (providing a single quantitative measure of the degree of clustering in the dataset) and (ii) local (i.e. assigning individual point events to a cluster). A total of seven methodologies from these two groups are applied to real-world (lightning) and synthetic datasets. For (i) global procedures, we explore the Knox, Mantel and Jacquez k-NN tests and spatio-temporal K-functions; for (ii) local procedures, we consider the spatio-temporal scan statistic, kernel density estimation and the density-based clustering method OPTICS. The dataset of 7021 lightning strikes is from 1 and 2 July 2015 over the UK, when a severe three-storm system crossed the region, with different convective modes producing each of the storms. The synthetic datasets are representative of various topologies of point-event natural hazard data with a moving source. We introduce a two-source model with input parameters related to the physical properties of the source. Each source has a set number of point events, an initiation point in space and time, a movement speed, a direction, an inter-event time distribution and a spatial spread distribution.
In addition to a base model of two identical moving sources with a set temporal separation, we produce four different topologies of the data by incrementally varying the speed parameter of the source, spatial spread parameters, direction and initiation points, and angle of two sources. With these five synthetic datasets representative of various two-source models, we evaluate the performance of the methodologies. The performance is assessed based on the ability of each methodology to separate the point events produced by the two sources and the sensitivity of these results to changes in the model input parameters. We further discuss the benefits of combining global and local clustering procedures in the analyses as we gain an initial understanding of the spatial and temporal scales over which clustering is present in the data by using global clustering procedures. This information then helps to inform and limit the choice of input parameters for the local clustering procedures.
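The simplest of the global procedures listed above, the Knox test, counts event pairs that are close in both space and time and assesses significance by permuting event times. The sketch below is a minimal, hypothetical illustration (thresholds and the Monte Carlo permutation scheme are generic choices, not the paper's settings).

```python
import math
import random

def knox_statistic(events, d_max, t_max):
    """events: (x, y, t) tuples. Count pairs within both the spatial
    threshold d_max and the temporal threshold t_max."""
    count = 0
    for i in range(len(events)):
        xi, yi, ti = events[i]
        for j in range(i + 1, len(events)):
            xj, yj, tj = events[j]
            if math.hypot(xi - xj, yi - yj) <= d_max and abs(ti - tj) <= t_max:
                count += 1
    return count

def knox_pvalue(events, d_max, t_max, n_perm=999, seed=1):
    """Monte Carlo p-value: permute event times to break any space-time
    association while preserving both marginal patterns."""
    observed = knox_statistic(events, d_max, t_max)
    rng = random.Random(seed)
    times = [t for _, _, t in events]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(times)
        perm = [(x, y, t) for (x, y, _), t in zip(events, times)]
        if knox_statistic(perm, d_max, t_max) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Two space-time-tight pairs, far from each other
events = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.1), (5.0, 5.0, 10.0), (5.1, 5.0, 10.1)]
```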

How to cite: Zandovskis, U., Malamud, B. D., and Pigoli, D.: Spatio-temporal clustering methodologies for point-event natural hazards, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-13210, 2021.

Andrea Trucchia, Sara Isnardi, Mirko D'Andrea, Guido Biondi, Paolo Fiorucci, and Marj Tonini

Wildfires constitute a complex environmental disaster triggered by several interacting natural and human factors that can affect biodiversity, species composition and ecosystems, but also human lives, regional economies and environmental health. Wildfires have therefore become a focus of forestry and ecological research and are receiving considerable attention in forest management. Current advances in automated learning and simulation methods, such as machine learning (ML) algorithms, have recently aroused great interest in wildfire risk assessment and mapping. This quantitative evaluation is carried out by taking into account two factors: the location and spatial extension of past wildfire events, and the geo-environmental and anthropogenic predisposing factors that favored their ignition and spreading. When dealing with risk assessment and predictive mapping for natural phenomena, it is crucial to ascertain the reliability and validity of the collected data, as well as the predictive capability of the obtained results. In a previous study (Tonini et al. 2020), the authors applied Random Forest (RF) to elaborate wildfire susceptibility maps for the Liguria region (Italy). In the present study, we address the following outstanding issues, which are still unsolved: (1) the vegetation map included a class labeled "burned area" that masked the true burned vegetation; (2) the RF-based model gave good results, but it needs to be compared with other ML-based approaches; (3) to test the predictive capability of the model, the last three years of observations were used, but these are not fully representative of the different wildfire regimes characterizing non-consecutive years. By improving the analyses, the following results were achieved. (1) The class "burned areas" was reclassified based on expert knowledge, and the type of vegetation correctly assigned.
This allowed us to correctly estimate the relative importance of each vegetation class belonging to this variable. (2) Two additional ML-based approaches, namely Multi-Layer Perceptron (MLP) and Support Vector Machine (SVM), were tested alongside RF, and the performance of each model was assessed, as well as the resulting variable ranking and the prediction outputs. This allowed us to compare the three ML-based approaches and evaluate the pros and cons of each. (3) The training and testing datasets were selected by extracting yearly observations based on a clustering procedure, which accounts for the temporal variability of the burning seasons. As a result, our models can on average perform better in different situations, taking into consideration years experiencing more or fewer wildfires than usual. The three ML-based models (RF, SVM and MLP) were finally validated by means of two metrics: (i) the area under the ROC curve, selecting the validation dataset using a 5-fold cross-validation procedure; and (ii) the RMS error, computed by evaluating the difference between the predicted probability outputs and the presence/absence of an observed event in the testing dataset.
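The first validation metric above, the area under the ROC curve, has a compact rank-based formulation. The sketch below is a hypothetical illustration on invented presence/absence data, not the study's implementation, which would normally use a library routine (e.g. scikit-learn's roc_auc_score).

```python
def roc_auc(labels, scores):
    """AUC via the Mann-Whitney formulation: the probability that a
    randomly chosen positive (burned) cell outranks a randomly chosen
    negative (unburned) one, with ties counted as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical burned (1) / unburned (0) cells and susceptibility scores
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.5]
auc = roc_auc(labels, scores)
```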


Tonini, M.; D'Andrea, M.; Biondi, G.; Degli Esposti, S.; Trucchia, A.; Fiorucci, P. A Machine Learning-Based Approach for Wildfire Susceptibility Mapping. The Case Study of the Liguria Region in Italy. Geosciences 2020, 10, 105.

How to cite: Trucchia, A., Isnardi, S., D'Andrea, M., Biondi, G., Fiorucci, P., and Tonini, M.: Wildfire susceptibility assessment: evaluation of the performance of different machine learning algorithms, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-7162, 2021.

Giovanni Marchisio, Patrick Helber, Benjamin Bischke, Tim Davis, and Annett Wania

New catalogues of nearly daily or even intraday temporal data will soon dominate the global archives. However, there has been little exploration of artificial intelligence (AI) techniques to leverage the high cadence that is already possible to achieve through the fusion of multiscale, multimodal sensors. Under the sponsorship of the European Union’s Horizon 2020 programme, RapidAI4EO will establish the foundations for the next generation of Copernicus Land Monitoring Service (CLMS) products. Focus is on the CORINE Land Cover programme, which is the flagship of CLMS. 

Specific objectives of the project are to: 1) explore and stimulate the development of new spatiotemporal monitoring applications based on the latest advances in AI and Deep Learning (DL); 2) demonstrate the fusion of Copernicus high resolution satellite imagery and third party very high resolution imagery; 3) provide intensified monitoring of Land Use and Land Cover, and Land Use change at a much higher level of detail and temporal cadence than it is possible today. 

Our strategy is two-fold. The first aspect involves developing vastly improved DL architectures to model the phenomenology inherent in high cadence observations with focus on disentangling phenology from structural change. The second involves providing critical training data to drive advancement in the Copernicus community and ecosystem well beyond the lifetime of this project. To this end we will create the most complete and dense spatiotemporal training sets ever, combining Sentinel-2 with daily, harmonized, cloud-free, gap filled, multispectral 3m time series resulting from fusion of open satellite data with Planet imagery at as many as 500,000 patch locations over Europe. The daily time series will span the entire year 2018, to coincide with the latest release of CORINE. We plan to open source these datasets for the benefit of the entire remote sensing community.

This talk focuses on the description of the datasets, whose inspiration comes from the recently released EuroSAT (Helber et al., 2019) and BigEarthNet (Sumbul et al., 2019) corpora. The new corpora will look at the intersection of CORINE 2018 with all the countries in the EU, balancing relative country surface with relative LULC distribution and, most notably, adding the daily high-resolution time series at all locations for the year 2018. Annotations will be based on the CORINE ontology. The higher spatial resolution will support modeling of more LC classes, while the added temporal dimension should enable disambiguation of land covers across diverse climate zones, as well as an improved understanding of land use.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101004356.

How to cite: Marchisio, G., Helber, P., Bischke, B., Davis, T., and Wania, A.: RapidAI4EO: Advancing the State-of-the-Art in Continuous Land Monitoring, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-13713, 2021.

Andre P. Silva, Filip Thorn, Damaris Zurell, and Juliano Cabral

Land-use change remains the main driver of biodiversity loss, and fragmentation and habitat loss are expected to lead to further population declines and species losses. We apply a recently developed R package for a spatially-explicit mechanistic simulation model (RangeShiftR), which incorporates habitat suitability as well as demographic and dispersal processes, to understand the temporal effects of land-use change (land-use harmonization scenarios for the 1900-2100 period) on the abundance and richness of mammalian species in South Asia. We then compare land-use scenarios with and without protected areas to understand whether current spatial conservation strategies are able to sustain viable populations independently of the land-use scenario followed. Our approach is innovative in assessing how land-use scenarios can influence animal populations through underlying ecological processes.

How to cite: P. Silva, A., Thorn, F., Zurell, D., and Cabral, J.: Land-use change effects on biodiversity through mechanistic simulations: A case study with South-Asian mammals, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-16051, 2021.

Part II - Innovative Evaluation Frameworks and Platforms for Weather and Climate Research
Tara Jensen, Marion Mittermaier, Paul Kucera, and Barbara Brown

Verification and validation activities are critical for the success of modeling and prediction efforts at organizations around the world. Having reproducible results via a consistent framework is equally important for model developers and users alike. The Model Evaluation Tools (MET) package was developed over a decade ago and expanded into the METplus framework with a view towards providing a consistent platform delivering reproducible results.

The METplus system is an umbrella verification, validation and diagnostic framework used by thousands of users from both US and international organizations. These tools are designed to be highly flexible to allow for quick adaptation to meet additional evaluation and diagnostic needs. A suite of Python wrappers has been implemented to facilitate quick set-up and implementation of the system, and to enhance the pre-existing plotting capabilities. Recently, several organizations within the National Oceanic and Atmospheric Administration (NOAA), the United States Department of Defense (DOD), and international partnerships such as the Unified Model (UM) Partnership led by the Met Office have adopted the tools both operationally and for research purposes. Many of these organizations are also now contributing to METplus development, leading to a more robust and dynamic framework for the entire earth system modeling community.

This presentation will provide an update on the current status of METplus and how it is being used across multiple scales and applications. It will highlight examples of METplus applied to verification and validation efforts throughout the international community, addressing a range of temporal scales (hourly forecasts to subseasonal-to-seasonal) and spatial scales (convection-allowing to mesoscale, regional to global, tropical to cryosphere to space).
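A representative example of the kind of statistic such verification frameworks compute is the 2×2 contingency table for a dichotomous event. The sketch below is a generic, hypothetical illustration of these standard scores, not METplus code; the threshold and data are invented.

```python
def contingency_scores(forecast, observed, threshold):
    """2x2 contingency-table verification for a dichotomous event
    (value >= threshold): probability of detection (POD), false alarm
    ratio (FAR) and critical success index (CSI)."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else 0.0
    denom = hits + misses + false_alarms
    csi = hits / denom if denom else float("nan")
    return pod, far, csi

# e.g. 24 h precipitation forecasts vs observations, event = >= 1.0 mm
pod, far, csi = contingency_scores([0.2, 1.5, 2.0, 0.0],
                                   [0.0, 1.2, 0.4, 1.1], 1.0)
```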

How to cite: Jensen, T., Mittermaier, M., Kucera, P., and Brown, B.: Fostering International Collaboration Through a Unified Verification, Validation, and Diagnostics Framework - METplus, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-13903, 2021.

Etor E. Lucio-Eceiza, Christopher Kadow, Martin Bergemann, Mahesh Ramadoss, Sebastian Illing, Oliver Kunst, Thomas Schartner, Jens Grieger, Mareike Schuster, Andy Richling, Ingo Kirchner, Henning Rust, Philipp Sommer, Ulrich Cubasch, Uwe Ulbrich, Hannes Thiemann, and Thomas Ludwig

The Free Evaluation System Framework (Freva) is a software infrastructure for standardized data and tool solutions in Earth system science. Freva runs on high-performance computers (HPC) to handle customizable evaluation systems of research projects, institutes or universities. It combines different software technologies into one common hybrid infrastructure, where all its features are accessible via shell and web environments. Freva indexes different data projects into one common search environment by storing the metadata of self-describing model, reanalysis and observational data sets in a database. The database interface satisfies the international standards provided by the Earth System Grid Federation (ESGF). This metadata system, with its advanced but easy-to-handle search tool, supports users, developers and their plugins in retrieving the required information. A generic application programming interface (API) allows scientific developers to connect their analysis tools to the evaluation system independently of the programming language used. Facilitating the provision and usage of tools and climate data automatically increases the number of scientists working with the data sets and identifying discrepancies. Plugins are also able to integrate their results, e.g. post-processed data, into the user's database. This allows, for example, post-processing plugins to feed statistical analysis plugins, which fosters an active exchange between plugin developers of a research project. Additionally, the history and configuration sub-system stores every analysis performed with the evaluation system in a database. Configurations and results of the tools can be shared among scientists via the shell or web system. Plugged-in tools therefore benefit from transparency and reproducibility. Furthermore, the system suggests existing results already produced by other users, saving CPU hours, I/O, disk space and time.
An integrated web shell (shellinabox) adds a degree of freedom in the choice of working environment and can be used as a gateway to the research projects on an HPC. Freva efficiently frames the interaction between different technologies, thereby improving Earth system modeling science. New features and aspects of further development and collaboration are discussed.

How to cite: Lucio-Eceiza, E. E., Kadow, C., Bergemann, M., Ramadoss, M., Illing, S., Kunst, O., Schartner, T., Grieger, J., Schuster, M., Richling, A., Kirchner, I., Rust, H., Sommer, P., Cubasch, U., Ulbrich, U., Thiemann, H., and Ludwig, T.: Free Evaluation System Framework (Freva) - New Features and Development, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-4468, 2021.

Chloé Radice, Hélène Brogniez, Pierre-Emmanuel Kirstetter, and Philippe Chambon

Remote sensing data are often used to assess model forecasts on multiple scales, generally by comparing past simulations with observations. This paper introduces a novel probabilistic method that evaluates tropical atmospheric relative humidity (RH) profiles simulated by the global numerical weather forecasting model ARPEGE with respect to probability distributions of finer-scale satellite observations.

The reference RH is taken from the SAPHIR microwave sounder onboard the Megha-Tropiques satellite, in operation since 2011. ARPEGE simulates the RH field every 6 hours on a 0.25° grid over 18 vertical levels between 950 hPa and 100 hPa. The reference probabilistic RH field is retrieved from brightness temperatures measured by SAPHIR, with a footprint resolution ranging from 10 km (nadir) to 23 km (edge of swath), over 6 vertical layers ranging from 950 hPa to 100 hPa. Footprint-scale RH values are aggregated (convolved) over the spatial and temporal scale of comparison to match the model resolution and summarize the patterns over a significant period. Comparison results will be shown over the April-May-June 2018 period for two configurations of the ARPEGE model (two parametrization schemes for convection). The probabilistic comparison is discussed with respect to a classical deterministic comparison of RH values.
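The aggregation step can be sketched in a few lines of plain Python. The sample values and the pooling strategy (simply concatenating the per-footprint RH samples that fall within one grid cell and time window) are illustrative assumptions, not the actual SAPHIR retrieval:

```python
import statistics

def aggregate_footprints(footprint_samples):
    """Pool per-footprint RH samples (%) falling in one model grid cell
    and time window into a single empirical distribution."""
    return sorted(s for footprint in footprint_samples for s in footprint)

# Three hypothetical SAPHIR footprints inside one 0.25-degree cell
fp1 = [62.0, 64.5, 63.2]
fp2 = [58.9, 60.1]
fp3 = [66.0, 65.4, 64.8, 63.9]

cell_distribution = aggregate_footprints([fp1, fp2, fp3])
print(round(statistics.median(cell_distribution), 1))  # → 63.9
```

The pooled list is the empirical distribution at model scale; the model's single grid-cell RH value is then evaluated against the whole distribution rather than against one summary number.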

This probabilistic approach makes it possible to retain all the sub-grid information and, by considering the distribution as a whole, avoids the classical deterministic simplification of working with a single “best” estimate. The method enables a finer, case-by-case assessment and a characterisation of specific situations. It adds value by accounting for additional information in the evaluation of the simulated field, especially for model simulations that are close to the traditional mean.
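The contrast between the two views can be shown with a small sketch: the deterministic comparison reduces the observations to a single mean, while the probabilistic one locates the simulated value within the full empirical distribution. The RH values below are invented for illustration:

```python
from bisect import bisect_right
import statistics

def percentile_rank(obs_samples, model_value):
    """Fraction of observed RH samples not exceeding the model value."""
    ordered = sorted(obs_samples)
    return bisect_right(ordered, model_value) / len(ordered)

obs = [55.0, 58.0, 60.0, 61.0, 62.0, 75.0]  # sub-grid RH samples (%)
model_rh = 61.5

# Deterministic view: distance to a single "best" estimate (the mean)
deterministic_error = model_rh - statistics.mean(obs)

# Probabilistic view: where the model value sits in the whole distribution
rank = percentile_rank(obs, model_rh)

print(round(deterministic_error, 2), round(rank, 2))  # → -0.33 0.67
```

Here the deterministic error is nearly zero, yet the percentile rank shows the model value sitting well inside a skewed sub-grid distribution; that is precisely the information the probabilistic comparison retains and the mean-based comparison discards.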

How to cite: Radice, C., Brogniez, H., Kirstetter, P.-E., and Chambon, P.: Novel assessment of model relative humidity with satellite probabilistic estimates, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-4918, 2021.

Bouwe Andela, Fakhereh Alidoost, Lukas Brunner, Jaro Camphuijsen, Bas Crezee, Niels Drost, Bettina Gier, Birgit Hassler, Peter Kalverla, Axel Lauer, Saskia Loosveldt-Tomas, Ruth Lorenz, Valeriu Predoi, Mattia Righi, Manuel Schlund, Stef Smeets, Javier Vegas-Regidor, Jost Von Hardenberg, Katja Weigel, and Klaus Zimmermann

The Earth System Model Evaluation Tool (ESMValTool) is a free and open-source community diagnostic and performance metrics tool for the evaluation of Earth system models such as those participating in the Coupled Model Intercomparison Project (CMIP). Version 2 of the tool (Righi et al., 2020) features a brand new design composed of a core that finds and processes data according to a ‘recipe’ and an extensive collection of ready-to-use recipes and associated diagnostic code for reproducing results from published papers. Development and discussion of the tool take place (mostly) in public, and anyone with an interest in climate model evaluation is welcome to join.


Since the initial release of version 2 in the summer of 2020, many improvements have been made to the tool. It is now more user friendly, with extensive online documentation and a step-by-step tutorial. Regular releases, currently planned three times a year, ensure that recent contributions become available quickly while maintaining a high level of quality control. The tool can be installed via conda; portable Docker and Singularity containers are also available.


Recent new features include a more user-friendly command-line interface, citation information per figure including CMIP6 data citation using ES-DOC, more and faster preprocessor functions that require less memory, automatic corrections for a larger number of CMIP6 datasets, support for more observational and reanalysis datasets, and more recipes and diagnostics.


The tool is now also more reliable, with improved automated testing through more unit tests for the core, as well as a recipe testing service running at DKRZ that tests the scientific recipes and diagnostics bundled with the tool. The community maintaining and developing the tool is growing, making the project less dependent on individual contributors. There are now technical and scientific review teams that check new contributions for technical quality and for scientific correctness and relevance, respectively; two new principal investigators who broaden the support base in the community; and a newly created user engagement team that takes care of improving the overall user experience.

How to cite: Andela, B., Alidoost, F., Brunner, L., Camphuijsen, J., Crezee, B., Drost, N., Gier, B., Hassler, B., Kalverla, P., Lauer, A., Loosveldt-Tomas, S., Lorenz, R., Predoi, V., Righi, M., Schlund, M., Smeets, S., Vegas-Regidor, J., Von Hardenberg, J., Weigel, K., and Zimmermann, K.: Recent developments on the Earth System Model Evaluation Tool, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-3476, 2021.

Peter C. Kalverla, Stef Smeets, Niels Drost, Bouwe Andela, Fakhereh Alidoost, and Jaro Camphuijsen

Ease of use can easily become a limiting factor for scientific quality and progress. The ability to effortlessly access and process ever-increasing data volumes is crucial for verifying and building upon previous results.

To level the playing field for all researchers, a shared infrastructure had to be developed. In Europe, this effort is coordinated mainly through the IS-ENES projects. The current infrastructure provides access to data as well as compute resources. This leaves tools for easily working with the data as the main obstacle to a smooth scientific process. Interestingly, it is not the scarcity of tools but rather their abundance that can lead to diverging workflows that hamper reproducibility.

The Earth System Model Evaluation Tool (ESMValTool) was originally developed as a command-line tool for the routine execution of important analysis workflows. The tool encourages some degree of standardization by factoring out common operations, while allowing for custom analysis of the pre-processed data. All scripts are bundled with the tool; over time, this has grown into a library of so-called ‘recipes’.
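The recipe idea, i.e. letting a small declarative description name the shared preprocessing and the diagnostic to run, can be mimicked with a deliberately tiny sketch. The recipe keys, the preprocessor, and the diagnostic below are invented and far simpler than ESMValTool's actual recipe format:

```python
# A miniature "recipe": datasets, shared preprocessing, one diagnostic.
recipe = {
    "datasets": {"model_a": [3.0, 4.0, 5.0], "model_b": [2.0, 6.0, 7.0]},
    "preprocessors": ["anomaly"],
    "diagnostic": "spread",
}

def anomaly(series):
    """Common operation: subtract the time mean from each value."""
    mean = sum(series) / len(series)
    return [value - mean for value in series]

def spread(processed):
    """Diagnostic: inter-model range at the final time step."""
    last_values = [series[-1] for series in processed.values()]
    return max(last_values) - min(last_values)

PREPROCESSORS = {"anomaly": anomaly}
DIAGNOSTICS = {"spread": spread}

def run(recipe):
    """Apply the shared preprocessing chain, then the named diagnostic."""
    data = recipe["datasets"]
    for step in recipe["preprocessors"]:
        data = {name: PREPROCESSORS[step](series) for name, series in data.items()}
    return DIAGNOSTICS[recipe["diagnostic"]](data)

print(run(recipe))  # → 1.0
```

Because the preprocessing is named in the recipe rather than re-implemented per analysis, two analyses sharing a recipe are guaranteed to see identically processed data, which is the standardization the text describes.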

In the EUCP project, we are now developing a Python API for ESMValTool. It allows for interactive exploration, modification, and execution of existing recipes, as well as the creation of new analyses. Concomitantly, partners in IS-ENES3 are making their infrastructure accessible through JupyterLab. Combining these technologies, researchers can easily access not only the data and compute resources but also the workflows and methods used by their colleagues, all through the web browser. During the vEGU, we will show how this extended infrastructure can be used to easily reproduce, and build upon, previous results.

How to cite: Kalverla, P. C., Smeets, S., Drost, N., Andela, B., Alidoost, F., and Camphuijsen, J.: Bringing ESMValTool to the Jupyter Lab, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-4805, 2021.

Lisa Bock, Birgit Hassler, and Axel Lauer and the ESMValTool Development Team

The Earth System Model Evaluation Tool (ESMValTool) has been developed with the aim of taking model evaluation to the next level by facilitating analysis of many different ESM components, providing well-documented source code and scientific background for the implemented diagnostics and metrics, and allowing for traceability and reproducibility of results (provenance). This has been made possible by a lively and growing development community continuously improving the tool, supported by multiple national and European projects. The latest major release (v2.0) of the ESMValTool was officially introduced in August 2020 as a large community effort, and several smaller releases have followed since then.

The diagnostic part of the ESMValTool includes a large collection of standard “recipes” for reproducing peer-reviewed analyses of many variables across ESM compartments, including the atmosphere, ocean, and land domains, with diagnostics and performance metrics focusing on the mean state, trends, variability, important processes and phenomena, as well as emergent constraints. While most of the diagnostics use observational data sets (in particular satellite and ground-based observations) or reanalysis products for model evaluation, some are also based on model-to-model comparisons. This presentation gives an overview of the latest scientific diagnostics and metrics added during the last year, including examples of applying these diagnostics to CMIP6 model data.

How to cite: Bock, L., Hassler, B., and Lauer, A. and the ESMValTool Development Team: New scientific diagnostics in the ESMValTool – an overview, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-7724, 2021.

Jerome Servonnat, Eric Guilyardi, Zofia Stott, Kim Serradell, Axel Lauer, Klaus Zimmerman, Fanny Adloff, Marie-Claire Greening, Remi Kazeroni, and Javier Vegas

Developing an Earth system model evaluation tool for a broad user community is a real challenge, as the potential users do not necessarily have the same needs or expectations. While many evaluation tasks across user communities include common steps, significant differences are also apparent, not least the investment by institutions and individuals in bespoke tools. A key question is whether there is sufficient common ground to pursue a community tool with broad appeal and application.

We present the main results of a survey carried out by Assimila for the H2020 IS-ENES3 project to review the model evaluation needs of European Earth system modelling communities. Based on interviews with approximately 30 participants from several European institutions, the survey targeted a broad range of users, including model developers, model users, evaluation data providers, and infrastructure providers. The output of the study provides an analysis of requirements focusing on key technical, standards, and governance aspects.

The study used ESMValTool as a current benchmark for European evaluation tools. It is a community diagnostics and performance metrics tool for the evaluation of Earth system models that allows for comparison of single or multiple models, either against predecessor versions or against observations. The tool is being developed in such a way that additional analyses can be added. As a community effort open to both users and developers, it encourages the open exchange of diagnostic source code and evaluation results. It is currently used in Coupled Model Intercomparison Projects as well as for the development and testing of “new” models.

A key result of the survey is the widespread support for ESMValTool amongst users, developers, and even those who have taken or promote other approaches. The results of the survey identify priorities and opportunities in the further development of the ESMValTool to ensure long-term adoption of the tool by a broad community.

How to cite: Servonnat, J., Guilyardi, E., Stott, Z., Serradell, K., Lauer, A., Zimmerman, K., Adloff, F., Greening, M.-C., Kazeroni, R., and Vegas, J.: Model evaluation expectations of European ESM communities: results from a survey, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-15681, 2021.