Comprehensive evaluations of Earth Systems Science Prediction (ESSP) systems (e.g., numerical weather prediction, hydrologic prediction, climate prediction and projection) are essential to understand sources of prediction errors and to improve earth system models. However, numerous roadblocks limit the extent and depth of ESSP system performance evaluations. Observational data used for evaluation are often not representative of the physical structures that are being predicted. Satellite and other large spatial and temporal observational datasets can help provide this information, but the community lacks tools to adequately integrate these large datasets to provide meaningful physical insights into the strengths and weaknesses of predicted fields. ESSP system evaluations also require large storage volumes to handle model simulations, large spatial datasets, and verification statistics, which are difficult to maintain. Standardization, infrastructure, and communication within one scientific field are already a challenge. Bridging different communities to allow knowledge transfer is even harder. The development of innovative methods in open frameworks and platforms is needed to enable meaningful and informative model evaluations and comparisons for many large Earth science applications, from weather to climate.

The purpose of this Open Science 2.0 session is to bring experts together to discuss innovative methods for integrating, managing, evaluating, and disseminating information about the quality of ESSP fields in a meaningful way. Presentations of these innovative methods applied to Earth science applications are encouraged. The session should generate interest in the communities and research projects building and maintaining these systems (e.g., ESMVal, Copernicus, Climaf, Freva, Birdhouse, MDTF, UV-CDAT, CMEC - PCMDI Metrics Package, Doppyo, MET-TOOLS, CDO, NCO). The session allows room for the exchange of ideas. The intended outcomes of this session are to connect the scientists and to develop a list of tools and techniques that could be developed and provided to the community in the future.

Co-organized by AS5/CL5
Convener: Christopher Kadow (ECS) | Co-conveners: Paul Kucera, Jerome Servonnat
| Attendance Fri, 08 May, 16:15–18:00 (CEST)

Chat time: Friday, 8 May 2020, 16:15–18:00

Chairperson: Christopher, Jerome, Paul
D686 |
| Highlight
Sathyaseelan Mayilvahanam, Sanjay Kumar Ghosh, and Chandra Shekhar Prasad Ojha


In general, modelling climate change and its impacts within a hydrological unit brings out an understanding of the system and its behaviour under various model constraints. Climate change and global warming studies are still in a research and development phase because of the complex and dynamic nature of the problem. The IPCC 5th Assessment Report on global warming states that in the 21st century there may be an increase in temperature of the order of ~1.5°C. This transient climate may cause significant impacts on, or discrepancies in, the water availability of the hydrological unit. This may lead to severe impacts in countries with large populations, such as India and China. Remote sensing datasets play an essential role in modelling climatic changes for a river basin at different spatial and temporal scales. This study aims to propose a conceptual framework for the above-defined problem, with an emphasis on remote sensing datasets. The framework involves five entities: the data component, process component, impact component, feedback component, and uncertainty component. The framework flow begins with the data component entity, which involves two significant inputs: the hydro-meteorological data and the land-hydrology data. The essential attributes of the hydro-meteorological data entity are precipitation, temperature, relative humidity, wind speed, and solar radiation. These datasets may be obtained and analysed using empirical or statistical methods and in-situ or satellite-based methods. Applied to long-run historical climate data, these mathematical models may provide knowledge on climate change detection or its trends. Meteorological data derived from satellites may have a measurable bias relative to in situ data. The satellite-based land-hydrology data component involves various attributes such as topography, soil, vegetation, water bodies, other land use / land cover, soil moisture, and evapotranspiration.
The process component involves complex land-hydrology processes that may be well established and modelled by customizable hydrological models. Here, we emphasise the use of remote-sensing-based model parameter values in the equations, either directly or indirectly. The land-atmosphere process component likewise involves various complex processes that take place in this zone. These processes may be well established and solved by customizable atmospheric weather models. The land components play a significant role in modelling climate change, because land processes may trigger global warming through various anthropogenic agents. The main objective of this framework is to emphasise climate change impacts using remote sensing. Hence, the impact component entity plays an essential role in this conceptual framework. Climate change impacts within a river basin at various spatial and temporal scales are identified using different hydrological responses. The feedback entity is the most sensitive part of this framework, because it may alter the climate forcing either positively or negatively. An uncertainty model component handles the uncertainty in the model framework. The highlight of this conceptual framework is the use of remote sensing datasets in climate change studies. A limitation is that verifying the correctness of the remote sensing data against in situ data at every location is not feasible.

How to cite: Mayilvahanam, S., Ghosh, S. K., and Ojha, C. S. P.: A Conceptual Framework for Modelling the Climate Change and its Impacts within a River Basin using Remote Sensing data, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-729, https://doi.org/10.5194/egusphere-egu2020-729, 2020.

D687 |
Carsten Ehbrecht, Stephan Kindermann, Ag Stephens, and David Huard

The Web Processing Service (WPS) is an OGC interface standard for providing processing tools as a web service.
The WPS interface standardizes the way processes and their inputs/outputs are described,
how a client can request the execution of a process, and how the output from a process is handled.
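
As a rough illustration of the three kinds of request a WPS client makes (list the processes, describe one, execute it), the sketch below builds WPS 1.0.0 key-value-pair URLs with the Python standard library. The endpoint `https://example.org/wps`, the process name `subset` and its inputs are hypothetical placeholders, not part of Birdhouse or any real service.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
WPS_URL = "https://example.org/wps"


def wps_request_url(base_url, request, identifier=None, inputs=None):
    """Build a WPS 1.0.0 key-value-pair (HTTP GET) request URL."""
    params = {"service": "WPS", "request": request}
    if request != "GetCapabilities":
        # DescribeProcess and Execute carry an explicit version parameter.
        params["version"] = "1.0.0"
    if identifier is not None:
        params["identifier"] = identifier
    if inputs:
        # WPS 1.0.0 encodes inputs as "name=value" pairs joined by ";".
        params["DataInputs"] = ";".join(f"{k}={v}" for k, v in inputs.items())
    return f"{base_url}?{urlencode(params)}"


# 1. Which processes does the service offer?
capabilities_url = wps_request_url(WPS_URL, "GetCapabilities")
# 2. What inputs and outputs does the (hypothetical) "subset" process have?
describe_url = wps_request_url(WPS_URL, "DescribeProcess", identifier="subset")
# 3. Run the process with two made-up inputs.
execute_url = wps_request_url(
    WPS_URL, "Execute", identifier="subset",
    inputs={"variable": "tas", "year": 2000},
)
print(execute_url)
```

A client library such as Birdy hides this URL handling behind Python function calls; the sketch only shows what travels over the wire in the simplest case.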

Birdhouse tools enable you to build your own customised WPS compute service
in support of remote climate data analysis.

Birdhouse offers you:

  • A Cookiecutter template to create your own WPS compute service.
  • An Ansible script to deploy a full-stack WPS service.
  • A Python library, Birdy, suitable for Jupyter notebooks to interact with WPS compute services.
  • An OWS security proxy, Twitcher, to provide access control to WPS compute services.

Birdhouse uses the PyWPS Python implementation of the Web Processing Service standard.
PyWPS is part of the OSGeo project.

The Birdhouse tools are used by several partners and projects.
A Web Processing Service will be used in the Copernicus Climate Change Service (C3S) to provide subsetting
operations on climate model data (CMIP5, CORDEX) as a service to the Climate Data Store (CDS).
The Canadian non-profit organization Ouranos is using a Web Processing Service to provide climate index
calculations that can be used remotely from Jupyter notebooks.

In this session we want to show how a Web Processing Service can be used with the Freva evaluation system.
Freva plugins can be made available as processes in a Web Processing Service. These plugins can be run
using a standard WPS client from a terminal and Jupyter notebooks with remote access to the Freva system.

We want to emphasise the integrational aspects of the Birdhouse tools: supporting existing processing frameworks
to add a standardized web service for remote computation.


  • http://bird-house.github.io
  • http://pywps.org
  • https://www.osgeo.org/
  • http://climate.copernicus.eu
  • https://www.ouranos.ca/en
  • https://freva.met.fu-berlin.de/

How to cite: Ehbrecht, C., Kindermann, S., Stephens, A., and Huard, D.: Building Web Processing Services with Birdhouse, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-1612, https://doi.org/10.5194/egusphere-egu2020-1612, 2020.

D688 |
| Highlight
Clara Burgard, Dirk Notz, Leif T. Pedersen, and Rasmus T. Tonboe

The diversity in sea-ice concentration observational estimates retrieved from brightness temperatures measured from space is a challenge for our understanding of past and future sea-ice evolution as it inhibits reliable climate model evaluation and initialisation. To address this challenge, we introduce a new tool: the Arctic Ocean Observation Operator (ARC3O). 

ARC3O allows us to simulate brightness temperatures at 6.9 GHz at vertical polarisation from standard output of an Earth System Model to be compared to observations from space at this frequency. We use simple temperature and salinity profiles inside the snow and ice column based on the output of the Earth System Model to compute these brightness temperatures. 

In this study, we evaluate ARC3O by simulating brightness temperatures based on three assimilation runs of the MPI Earth System Model (MPI-ESM) assimilated with three different sea-ice concentration products. We then compare these three sets of simulated brightness temperatures to brightness temperatures measured by the Advanced Microwave Scanning Radiometer Earth Observing System (AMSR-E) from space. We find that they differ by up to 10 K in the period between October and June, depending on the region and the assimilation run. However, we show that these discrepancies between simulated and observed brightness temperatures can be mainly attributed to the underlying observational uncertainty in sea-ice concentration and, to a lesser extent, to the data assimilation process, rather than to biases in ARC3O itself. In summer, the discrepancies between simulated and observed brightness temperatures are larger than in winter and locally reach up to 20 K. This is caused by the very large observational uncertainty in summer sea-ice concentration but also by the melt-pond parametrisation in MPI-ESM, which is not necessarily realistic.

ARC3O is therefore capable of realistically translating the simulated Arctic Ocean climate state into one observable quantity for a more comprehensive climate model evaluation and initialisation, an exciting perspective for further developing this and similar methods.

How to cite: Burgard, C., Notz, D., Pedersen, L. T., and Tonboe, R. T.: The Arctic Ocean Observation Operator for 6.9 GHz (ARC3O), EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-3501, https://doi.org/10.5194/egusphere-egu2020-3501, 2020.

D689 |
Christian Pagé, Wim Som de Cerff, Maarten Plieger, Alessandro Spinuso, Iraklis Klampanos, Malcolm Atkinson, and Vangelis Karkaletsis

Accessing and processing large climate datasets has become a particularly challenging task for end users, due to the rapidly increasing volumes being produced and made available. Access to climate data is crucial for sustaining research and performing climate change impact assessments. These activities have a strong societal impact, as climate change affects almost all economic and social sectors and requires them to adapt.

The whole climate data archive is expected to reach a volume of 30 PB in 2020 and up to 2000 PB in 2024 (estimated), evolving from 0.03 PB (30 TB) in 2007 and 2 PB in 2014. Data processing and analysis must now take place remotely for the users: users typically have to rely on heterogeneous infrastructures and services between the data and their physical location. Developers of Research Infrastructures have to provide services to those users, hence having to define standards and generic services to fulfil those requirements.
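
To put the quoted archive volumes in perspective, the short sketch below converts them into an average yearly growth factor for each period. It assumes constant (compound) growth between the quoted years, which is a simplification made here and not a statement from the abstract.

```python
# Archive volumes in petabytes, as quoted in the abstract.
volumes_pb = {2007: 0.03, 2014: 2.0, 2020: 30.0, 2024: 2000.0}


def annual_growth_factor(v_start, v_end, years):
    """Constant yearly multiplier that turns v_start into v_end."""
    return (v_end / v_start) ** (1.0 / years)


years = sorted(volumes_pb)
growth = {}
for start, end in zip(years, years[1:]):
    factor = annual_growth_factor(volumes_pb[start], volumes_pb[end], end - start)
    growth[(start, end)] = factor
    print(f"{start}-{end}: x{factor:.2f} per year")
```

Under that assumption the archive grew by roughly a factor of 1.8 per year from 2007 to 2014 and 1.6 per year from 2014 to 2020, while the 2024 estimate would require almost a tripling every year.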

It will be shown how the DARE eScience Platform (http://project-dare.eu) will help developers to develop needed services more quickly and transparently for a large range of scientific researchers. The platform is designed for efficient and traceable development of complex experiments and domain-specific services. Most importantly, the DARE Platform integrates the following e-infrastructure services: the climate IS-ENES (https://is.enes.org) Research Infrastructure front-end climate4impact (C4I: https://climate4impact.eu), the EUDAT CDI (https://www.eudat.eu/eudat-collaborative-data-infrastructure-cdi) B2DROP Service, as well as the ESGF (https://esgf.llnl.gov). The DARE Platform itself can be deployed by research communities on local, public or commercial clouds, thanks to its containerized architecture.

More specifically, two distinct Use Cases for the climate science domain will be presented. The first will show how an open source software to compute climate indices and indicators (icclim: https://github.com/cerfacs-globc/icclim) is leveraged using the DARE Platform to enable users to build their own workflows. The second Use Case will demonstrate how more complex tools, such as an extra-tropical and tropical cyclone tracking software (https://github.com/cerfacs-globc/cyclone_tracking), can be easily made available to end users by infrastructure and front-end software developers.

How to cite: Pagé, C., Som de Cerff, W., Plieger, M., Spinuso, A., Klampanos, I., Atkinson, M., and Karkaletsis, V.: Integrating e-infrastructures for remote climate data processing, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-4658, https://doi.org/10.5194/egusphere-egu2020-4658, 2020.

D690 |
Fabian Wachsmann

The Climate Data Operators [1] tool kit (CDO) is infrastructure software that is popular worldwide and is developed and maintained at the Max Planck Institute for Meteorology (MPI-M). It comprises a large number of command line operators for gridded data, including statistics, interpolation, and arithmetic. Users benefit from the extensive support facilities provided by the MPI-M and the DKRZ.

As a part of the sixth phase of the Coupled Model Intercomparison Project (CMIP6), the German Federal Ministry of Education and Research (BMBF) is funding activities promoting the use of the CDOs for CMIP6 data preparation and analysis.  

The operator ‘cmor’ has been developed to enable users to prepare their data according to the CMIP6 data standard. It is part of the web-based CMIP6 post-processing infrastructure [2] which is developed at DKRZ and used by different Earth System Models. The CDO metadata and its data model have been expanded to include the CMIP6 data standard so that users can use the tool for project data evaluation.

As a second activity, operators for 27 climate extremes indices, which were defined by the Expert Team on Climate Change Detection and Indices (ETCCDI), have been integrated into the tool. As with CMIP5, the ETCCDI climate extremes indices will be part of CMIP6 model analyses due to their robustness and straightforward interpretation.
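
For readers unfamiliar with the ETCCDI indices, the following sketch computes two of the simplest ones, frost days (FD) and summer days (SU), for a single grid cell in plain Python. It only illustrates the index definitions; it is not the CDO implementation, and the temperature values are made up.

```python
def frost_days(tmin_c):
    """ETCCDI FD: number of days with daily minimum temperature below 0 °C."""
    return sum(1 for t in tmin_c if t < 0.0)


def summer_days(tmax_c):
    """ETCCDI SU: number of days with daily maximum temperature above 25 °C."""
    return sum(1 for t in tmax_c if t > 25.0)


# Made-up daily values in °C for one grid cell.
tmin = [-3.1, -0.4, 0.0, 1.2, 4.8, 12.5, 14.0]
tmax = [2.0, 5.5, 9.1, 18.3, 25.0, 26.4, 31.2]

fd = frost_days(tmin)   # counts -3.1 and -0.4; 0.0 is not below the threshold
su = summer_days(tmax)  # counts 26.4 and 31.2; 25.0 is not above the threshold
print(fd, su)
```

Both indices use strict inequalities, so days that sit exactly on the threshold are not counted. In practice the counts are taken per year over gridded fields rather than over a short list.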

This contribution provides an insight into advanced CDO application and offers ideas for post-processing optimization. 

[1] Schulzweida, U. (2019): CDO user guide. code.mpimet.mpg.de/projects/cdo , last access: 01.13.2020.

[2] Schupfner, M. (2020):  The CMIP6 Data Request WebGUI. c6dreq.dkrz.de , last access: 01.13.2020.

How to cite: Wachsmann, F.: CDOs for CMIP6 and Climate Extremes Indices, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8543, https://doi.org/10.5194/egusphere-egu2020-8543, 2020.

D691 |
| Highlight
Dmytro Trybushnyi, Wolfgang Raskob, Ievgen Ievdin, Tim Müller, Oleksandr Pylypenko, and Mark Zheleznyak

An important aspect of an Earth Systems Science Prediction System (ESSPS) is to describe and predict the behavior of contaminants in different environmental compartments following severe accidents at chemical and nuclear installations. Such an ESSPS could be designed as a platform that integrates models describing atmospheric, hydrological, and oceanographic processes, the physical-chemical transformation of pollutants in the environment, the contamination of the food chain, and finally the overall exposure of the population to harmful substances. Such a chain of connected simulation models, needed to describe the consequences of severe accidents in the different phases of an emergency, should use different input data ranging from real-time online meteorological data to long-term numerical weather prediction or ocean data.

One example of an ESSPS is the decision support system JRODOS for off-site emergency management after nuclear emergencies. It integrates many different simulation models, real-time monitoring, regional GIS information, source term databases, and geospatial data for population and environmental characteristics.

The development of the system started in 1992, supported by the European Commission’s RTD Framework Programmes. As the system attracted more and more end users, its technical basis had to be considerably improved. For this, Java was selected as a high-level language suitable for the development of distributed, cross-platform, enterprise-quality applications. On the other hand, a great deal of scientific computational software is available only as C/C++/FORTRAN packages. Moreover, it is a common scenario that some outputs of model A should act as inputs of model B, but the two models do not share common exchange containers and/or are written in different programming languages.

To combine the flexibility of the Java language with the speed and availability of scientific codes, and to be able to connect different computational codes into one chain of models, the notion of distributed wrapper objects (DWO) has been introduced. DWO provides logical, visual and technical means for the integration of computational models into the core of the system, even if the models and the system use different programming languages. The DWO technology allows various levels of interactivity, including pull- and push-driven chains, user interaction support, and sub-model calls. All DWO data exchange is realized in memory and does not involve disk I/O operations, thus eliminating redundant reader/writer code and minimizing slow disk access. These features improve the stability and performance of an ESSPS that is used for decision support.
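
The idea of a pull-driven model chain with in-memory data exchange can be sketched in a few lines. The example below is a loose illustration in Python, not the JRODOS DWO API (which is Java-based); the class `ModelWrapper`, the three toy models and all numbers are hypothetical.

```python
class ModelWrapper:
    """Wraps a model function and pulls its inputs from upstream wrappers."""

    def __init__(self, name, run, upstream=()):
        self.name = name
        self.run = run                  # callable taking the upstream results
        self.upstream = list(upstream)  # wrappers whose outputs feed this model
        self._cache = None              # result is kept in memory, never on disk

    def output(self):
        """Pull-driven evaluation: upstream models run only when asked."""
        if self._cache is None:
            inputs = [w.output() for w in self.upstream]
            self._cache = self.run(*inputs)
        return self._cache


# Toy stand-ins for real simulation codes (arbitrary units and factors).
def atmospheric_dispersion():
    return {"air_concentration": 4.0}


def deposition(air):
    return {"ground_deposition": air["air_concentration"] * 0.5}


def dose(dep):
    return {"dose": dep["ground_deposition"] * 0.1}


atm = ModelWrapper("atmosphere", atmospheric_dispersion)
dep = ModelWrapper("deposition", deposition, upstream=[atm])
exposure = ModelWrapper("dose", dose, upstream=[dep])

# Asking the last model for its output triggers the whole chain.
result = exposure.output()
print(result)
```

In a push-driven chain the direction would be reversed: an upstream model would notify its consumers when new output is available. In a mixed-language system each wrapper would additionally translate between the platform and the native code, which this sketch leaves out.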

The current status of the DWO realization in JRODOS is presented focusing on the added value compared to traditional integration of different simulation models into one system.

How to cite: Trybushnyi, D., Raskob, W., Ievdin, I., Müller, T., Pylypenko, O., and Zheleznyak, M.: Flexible Java based platform for integration of models and datasets in Earth Systems Science Prediction Systems: methodology and implementation for predicting spreading of radioactive contamination from accidents, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-9532, https://doi.org/10.5194/egusphere-egu2020-9532, 2020.

D692 |
Martin Schupfner and Fabian Wachsmann

CMIP6 defines a data standard as well as a data request (DReq) in order to facilitate analysis across results from different climate models. For most model output, post-processing is required to make it CMIP6 compliant. The German Federal Ministry of Education and Research (BMBF) is funding a project [1] providing services which help with the production of quality-assured CMIP6 compliant data according to the DReq. 


In that project, a web-based GUI [2] has been developed which guides modelers through the different steps of the data post-processing workflow, allowing them to orchestrate the aggregation, diagnosis and standardization of the model data in a modular manner. To this end, the website provides several functionalities:
1. A DReq generator, based on Martin Juckes’ DreqPy API [3], can be used to tailor the DReq according to the envisaged experiments and supported MIPs. Moreover, the expected data volume can be calculated.

2. The mapping between variables of the DReq and of the raw model output can be specified. These specifications (model variable names, units, etc.) may include diagnostic algorithms and are stored in a database. 

3. The variable mapping information can be retrieved as a mapping table (MT). Additionally, this information can be used to create post-processing script fragments. One of the script fragments contains processing commands based on the diagnostic algorithms entered into the mapping GUI, whereas the other rewrites the (diagnosed) data in a CMIP6 compliant format. Both script fragments use the CDO tool kit [4] developed at the Max Planck Institute for Meteorology, namely the CDO expr and cmor [5] operators. The latter makes use of the CMOR3 library [6] and parses the MT. The script fragments are meant to be integrated into CMIP6 data workflows or scripts. A template for such a script, which allows for modular and flexible process control of the individual workflow steps, is included when downloading the script fragments.

4. User-specific metadata can be generated, which supplies the CDO cmor operator with the required and correct metadata as specified in the CMIP6 controlled vocabulary (CV).
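
The step from mapping information to a diagnostic script fragment can be sketched as follows. This is a hypothetical miniature, not the code generated by the WebGUI: the mapping-table layout, the raw variable names (`rain`, `snow`, `t2m`) and the file names are invented for illustration. Only the general form of the CDO `expr` operator (`expr,'name=expression;'`) is taken from the CDO documentation. The subsequent `cmor` step is left out because its arguments are not given in the text.

```python
import shlex

# Hypothetical mapping-table rows: CMIP6 variable name and a formula
# expressed in terms of (invented) raw model output variables.
mapping_table = [
    {"cmip_name": "pr", "formula": "rain+snow"},
    {"cmip_name": "tas", "formula": "t2m"},
]


def expr_argument(rows):
    """Join all diagnostics into one argument for the CDO expr operator."""
    instructions = "".join(f"{r['cmip_name']}={r['formula']};" for r in rows)
    return f"expr,{instructions}"


def diagnostic_command(rows, infile, outfile):
    """Return the cdo call as an argument list (usable with subprocess.run)."""
    return ["cdo", expr_argument(rows), infile, outfile]


cmd = diagnostic_command(mapping_table, "raw_model_output.nc", "diagnosed.nc")
# shlex.join adds the shell quoting needed for the ';' characters.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems when the fragment is later executed from a workflow script.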


[1] National CMIP6 Support Activities. https://www.dkrz.de/c6de , last access 9.1.2020.

[2] Martin Schupfner (2018): CMIP6 Data Request WebGUI. https://c6dreq.dkrz.de/ , last access 9.1.2020.

[3] Martin Juckes (2018): Data Request Python API. Vers. 01.00.28. http://proj.badc.rl.ac.uk/svn/exarch/CMIP6dreq/tags/latest/dreqPy/docs/dreqPy.pdf , last access 9.1.2020.  

[4] Uwe Schulzweida (2019): CDO User Guide. Climate Data Operators. Vers. 1.9.8. https://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf , last access 9.1.2020.

[5] Fabian Wachsmann (2017): The cdo cmor operator. https://code.mpimet.mpg.de/attachments/19411/cdo_cmor.pdf , last access 9.1.2020.

[6] Denis Nadeau (2018): CMOR version 3.3. https://cmor.llnl.gov/pdf/mydoc.pdf , last access 9.1.2020.

How to cite: Schupfner, M. and Wachsmann, F.: Web-based post-processing workflow composition for CMIP6, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-13105, https://doi.org/10.5194/egusphere-egu2020-13105, 2020.

D693 |
| Highlight
Axel Lauer, Fernando Iglesias-Suarez, Veronika Eyring, and the ESMValTool development team

The Earth System Model Evaluation Tool (ESMValTool) has been developed with the aim of taking model evaluation to the next level by facilitating analysis of many different ESM components, providing well-documented source code and scientific background of implemented diagnostics and metrics and allowing for traceability and reproducibility of results (provenance). This has been made possible by a lively and growing development community continuously improving the tool supported by multiple national and European projects. The latest version (2.0) of the ESMValTool has been developed as a large community effort to specifically target the increased data volume of the Coupled Model Intercomparison Project Phase 6 (CMIP6) and the related challenges posed by analysis and evaluation of output from multiple high-resolution and complex ESMs. For this, the core functionalities have been completely rewritten in order to take advantage of state-of-the-art computational libraries and methods to allow for efficient and user-friendly data processing. Common operations on the input data such as regridding or computation of multi-model statistics are now centralized in a highly optimized preprocessor written in Python. The diagnostic part of the ESMValTool includes a large collection of standard recipes for reproducing peer-reviewed analyses of many variables across atmosphere, ocean, and land domains, with diagnostics and performance metrics focusing on the mean-state, trends, variability and important processes, phenomena, as well as emergent constraints. While most of the diagnostics use observational data sets (in particular satellite and ground-based observations) or reanalysis products for model evaluation some are also based on model-to-model comparisons. 
This presentation introduces the diagnostics newly implemented into ESMValTool v2.0 including an extended set of large-scale diagnostics for quasi-operational and comprehensive evaluation of ESMs, new diagnostics for extreme events, regional model and impact evaluation and analysis of ESMs, as well as diagnostics for emergent constraints and analysis of future projections from ESMs. The new diagnostics are illustrated with examples using results from the well-established CMIP5 and the newly available CMIP6 data sets.

How to cite: Lauer, A., Iglesias-Suarez, F., Eyring, V., and the ESMValTool development team: CMIP model evaluation with the ESMValTool v2.0, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-13306, https://doi.org/10.5194/egusphere-egu2020-13306, 2020.

D694 |
Fakhereh Alidoost, Jerom Aerts, Bouwe Andela, Jaro Camphuijsen, Nick van De Giesen, Gijs van Den Oord, Niels Drost, Yifat Dzigan, Ronald van Haren, Rolf Hut, Peter C. Kalverla, Inti Pelupessy, Stefan Verhoeven, Berend Weel, and Ben van Werkhoven

eWaterCycle is a framework in which hydrological modelers can work together in a collaborative environment. In this environment, they can, for example, compare and analyze the results of models that use different sources of (meteorological) forcing data. The final goal of eWaterCycle is to advance the state of FAIR (Findable, Accessible, Interoperable, and Reusable) and open science in hydrological modeling.

Comparing hydrological models has always been a challenging task. Hydrological models exhibit great complexity and diversity in the exact methodologies applied, the competing hypotheses of hydrologic behavior, the technology stacks, and the programming languages used in those models. Pre-processing of forcing data is one of the roadblocks that was identified during the FAIR Hydrological Modelling workshop organized by the Lorentz Center in April 2019. Forcing data can be retrieved from a wide variety of sources with discrepant variable names and frequencies, and spatial and temporal resolutions. Moreover, some hydrological models make specific assumptions about the definition of the forcing variables. The pre-processing is often performed by various sets of scripts that may or may not be included with model source codes, making it hard to reproduce results. Generally, there are common steps in the data preparation among different models. Therefore, it would be a valuable asset to the hydrological community if the pre-processing of FAIR input data could also be done in a FAIR manner.

Within the context of the eWaterCycle II project, a common pre-processing system has been created for hydrological modeling based on ESMValTool (Earth System Model Evaluation Tool). ESMValTool is a community diagnostic and performance metrics tool developed for the evaluation of Earth system models. The ESMValTool pre-processing functions cover a broad range of operations on data before diagnostics or metrics are applied; for example, vertical interpolation, land-sea masking, re-gridding, multi-model statistics, temporal and spatial manipulations, variable derivation and unit conversion. The pre-processor performs these operations in a centralized, documented and efficient way. The current pre-processing pipeline of the eWaterCycle using ESMValTool consists of hydrological model-specific recipes and supports ERA5 and ERA-Interim data provided by the ECMWF (European Centre for Medium-Range Weather Forecasts). The pipeline starts with the downloading and CMORization (Climate Model Output Rewriter) of input data. Then a recipe is prepared to find the data and run the preprocessors. When ESMValTool runs a recipe, it will also run the diagnostic script that contains model-specific analysis to derive required forcing variables, and it will store provenance information to ensure transparency and reproducibility. In the near future, the pipeline will be extended to include Earth observation data, as these data are paramount to the data assimilation in eWaterCycle.

In this presentation we will show how using the pre-processor from ESMValTool for hydrological modeling connects hydrology and climate sciences and increases the impact and sustainability of ESMValTool.

How to cite: Alidoost, F., Aerts, J., Andela, B., Camphuijsen, J., van De Giesen, N., van Den Oord, G., Drost, N., Dzigan, Y., van Haren, R., Hut, R., Kalverla, P. C., Pelupessy, I., Verhoeven, S., Weel, B., and van Werkhoven, B.: ESMValTool pre-processing functions for eWaterCycle, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-14745, https://doi.org/10.5194/egusphere-egu2020-14745, 2020.

D695 |
Bouwe Andela, Lisa Bock, Björn Brötz, Faruk Diblen, Laura Dreyer, Niels Drost, Paul Earnshaw, Veronika Eyring, Birgit Hassler, Nikolay Koldunov, Axel Lauer, Bill Little, Saskia Loosveldt-Tomas, Lee de Mora, Valeriu Predoi, Mattia Righi, Manuel Schlund, Javier Vegas-Regidor, and Klaus Zimmermann

The Earth System Model Evaluation Tool (ESMValTool) is a free and open-source community diagnostic and performance metrics tool for the evaluation of Earth system models participating in the Coupled Model Intercomparison Project (CMIP). Version 2 of the tool (Righi et al. 2019, www.esmvaltool.org) features a brand new design, consisting of ESMValCore (https://github.com/esmvalgroup/esmvalcore), a package for working with CMIP data and ESMValTool (https://github.com/esmvalgroup/esmvaltool), a package containing the scientific analysis scripts. This new version has been specifically developed to handle the increased data volume of CMIP Phase 6 (CMIP6) and the related challenges posed by the analysis and the evaluation of output from multiple high-resolution or complex Earth system models. The tool also supports CMIP5 and CMIP3 datasets, as well as a large number of re-analysis and observational datasets that can be formatted according to the same standards (CMOR) on-the-fly or through scripts currently included in the ESMValTool package.

At the heart of this new version is the ESMValCore software package, which provides a configurable framework for finding CMIP files using a “data reference syntax”, applying commonly used pre-processing functions to them, running analysis scripts, and recording provenance. Numerous pre-processing functions, e.g. for data selection, regridding, and statistics, are readily available, and the modular design makes it easy to add more. The ESMValCore package is easy to install with relatively few dependencies, is written in Python 3, and is based on state-of-the-art open-source libraries such as Iris and Dask and on widely used standards such as YAML, NetCDF, CF-Conventions, and W3C PROV. An extensive set of automated tests and code quality checks ensures the reliability of the package. Documentation is available at https://esmvaltool.readthedocs.io.

The ESMValCore package uses human-readable recipes to define which variables and datasets to use, how to pre-process that data, and what scientific analysis scripts to run. The package provides convenient interfaces, based on the YAML and NetCDF/CF-convention file formats, for running diagnostic scripts written in any programming language. Because the ESMValCore framework takes care of running the workflow defined in the recipe in parallel, most analyses run much faster, with no additional programming effort required from the authors of the analysis scripts. For example, benchmarks show a factor of 30 speedup with respect to version 1 of the tool for a representative recipe on a 24-core machine. A large collection of standard recipes and associated analysis scripts is available in the ESMValTool package for reproducing selected peer-reviewed analyses. The ESMValCore package can also be used with any other script that implements its easy-to-use interface. All pre-processing functions of the ESMValCore can also be used directly from any Python program. These features allow for use by a wide community of scientific users and developers with different levels of programming skills and experience.
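
The recipe idea (declare datasets and an ordered list of pre-processing steps, and let the framework run them in parallel) can be mimicked with the Python standard library. The sketch below is a toy analogue under stated assumptions: real recipes are YAML files, and none of the names used here (`recipe`, `STEPS`, `convert_units`, `time_mean`, the dataset labels) belong to the ESMValCore API.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy "recipe": which datasets to use and which steps to apply, in order.
recipe = {
    "datasets": ["MODEL-A", "MODEL-B"],
    "preprocessor": [
        ("convert_units", {"offset": -273.15}),  # K to °C
        ("time_mean", {}),
    ],
}

# Made-up near-surface temperatures in K, standing in for files on disk.
toy_data = {
    "MODEL-A": [280.0, 282.0, 284.0],
    "MODEL-B": [290.0, 291.0, 292.0],
}


def convert_units(values, offset):
    return [v + offset for v in values]


def time_mean(values):
    return sum(values) / len(values)


STEPS = {"convert_units": convert_units, "time_mean": time_mean}


def preprocess(dataset):
    """Apply the recipe's steps to one dataset, in the declared order."""
    data = toy_data[dataset]
    for name, kwargs in recipe["preprocessor"]:
        data = STEPS[name](data, **kwargs)
    return dataset, data


# The datasets are independent, so they can be processed concurrently
# without any change to the step functions themselves.
with ThreadPoolExecutor() as pool:
    results = dict(pool.map(preprocess, recipe["datasets"]))

print(results)
```

The point of the design is that the analysis author only writes the declarative part and the step functions, while scheduling is the framework's job.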

Future plans involve extending the public Python API (application programming interface) from just preprocessor functions to include all functionality, including finding the data and running diagnostic scripts. This would make ESMValCore suitable for interactive data exploration from a Jupyter Notebook.

How to cite: Andela, B., Bock, L., Brötz, B., Diblen, F., Dreyer, L., Drost, N., Earnshaw, P., Eyring, V., Hassler, B., Koldunov, N., Lauer, A., Little, B., Loosveldt-Tomas, S., de Mora, L., Predoi, V., Righi, M., Schlund, M., Vegas-Regidor, J., and Zimmermann, K.: ESMValCore: analyzing CMIP data made easy, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-17472, https://doi.org/10.5194/egusphere-egu2020-17472, 2020.

D696 |
Salomon Eliasson, Karl Göran Karlsson, and Ulrika Willén

One of the primary purposes of satellite simulators is to emulate the inability of retrievals, based on visible and infrared sensors, to detect subvisible clouds from space by removing them from the model. The current simulators in the COSP rely on a single visible cloud optical depth (τ)-threshold (τ=0.3) applied globally to delineate cloudy and cloud-free conditions. However, in reality, the cloud sensitivity of a retrieval varies regionally.

This presentation describes the satellite simulator for the CLARA-A2 climate data record (CDR). The CLARA simulator takes into account the variable skill in cloud detection of the CLARA-A2 CDR using long/lat-gridded values separated by daytime and nighttime, which enable it to filter out clouds from climate models that would be undetectable by observations. We introduce two methods of cloud mask simulation, one that depends on a spatially variable τ-threshold and one that uses the cloud probability of detection (POD) as a function of the model τ and long/lat. The gridded POD values are from the CLARA-A2 validation study by Karlsson and Hakansson (2018).
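
The two cloud mask methods can be sketched as follows. This is a minimal illustration, not the CLARA simulator code: the function names, the binned POD lookup, and every numerical value are made up for the example, and the day/night separation is omitted for brevity.

```python
import random


def mask_by_threshold(tau, tau_min):
    """Method 1: flag a cell as cloudy when tau exceeds the local threshold."""
    return [
        [t > t_min for t, t_min in zip(tau_row, min_row)]
        for tau_row, min_row in zip(tau, tau_min)
    ]


def pod_from_bins(t, bins):
    """Look up the POD for optical depth t in a list of (upper_edge, pod) bins."""
    for upper_edge, pod in bins:
        if t < upper_edge:
            return pod
    return bins[-1][1]


def mask_by_pod(tau, pod_bins, rng):
    """Method 2: flag a cloudy model cell as detected with probability POD(tau, cell)."""
    return [
        [t > 0 and rng.random() < pod_from_bins(t, bins)
         for t, bins in zip(tau_row, bins_row)]
        for tau_row, bins_row in zip(tau, pod_bins)
    ]


# Made-up 2x2 grid of model optical depths; row 0 stands for a region where
# detection is easy, row 1 for a region where it is hard.
TAU = [[0.1, 0.5], [0.2, 2.0]]

# Hypothetical spatially variable thresholds (method 1).
TAU_MIN = [[0.15, 0.15], [1.0, 1.0]]

# Hypothetical POD bins per grid cell (method 2): (upper tau edge, POD).
EASY = [(0.2, 0.3), (1.0, 0.8), (float("inf"), 0.99)]
HARD = [(1.0, 0.1), (5.0, 0.5), (float("inf"), 0.9)]
POD_BINS = [[EASY, EASY], [HARD, HARD]]

THRESHOLD_MASK = mask_by_threshold(TAU, TAU_MIN)
POD_MASK = mask_by_pod(TAU, POD_BINS, random.Random(0))
```

The threshold method is deterministic, whereas the POD method is stochastic: a thin cloud in an easy region may still be missed, and a thick cloud in a hard region may still be detected, in proportion to the gridded POD values.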

Both methods replicate the relative ease or difficulty for cloud retrievals, depending on the region and illumination. They increase the cloud sensitivity where the cloud retrievals are relatively straightforward, such as over mid-latitude oceans, and they decrease the sensitivity where cloud retrievals are notoriously tricky, such as where thick clouds may be inseparable from cold, snow-covered surfaces, as well as in areas with an abundance of broken and small-scale cumulus clouds such as the atmospheric subsidence regions over the ocean.

The CLARA simulator, together with the International Satellite Cloud Climatology Project (ISCCP) simulator of the COSP, is used to assess Arctic clouds in the EC-Earth climate model compared to the CLARA-A2 and ISCCP H-Series CDRs. Compared to CLARA-A2, EC-Earth generally underestimates cloudiness in the Arctic. However, compared to ISCCP and its simulator, the opposite conclusion is reached. Based on EC-Earth, this paper shows that the simulated cloud mask of CLARA-A2 is more representative of the CDR than using a global optical depth threshold, such as used by the ISCCP simulator.
The simulator substantially improves the simulation of the CLARA-A2-detected clouds compared to a global optical depth threshold, especially in the polar regions, by accounting for the variable cloud detection skill over the year.

How to cite: Eliasson, S., Karlsson, K. G., and Willén, U.: A simulator for the CLARA-A2 cloud climate data record and its application to assess EC-Earth polar cloudiness, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18454, https://doi.org/10.5194/egusphere-egu2020-18454, 2020.

D697 |
Valeriu Predoi, Bouwe Andela, Lee De Mora, and Axel Lauer

The Earth System Model eValuation Tool (ESMValTool) is a powerful community-driven diagnostics and performance metrics tool. It is used for the evaluation of Earth System Models (ESMs) and allows for routine comparisons of either multiple model versions or observational datasets. ESMValTool's design is highly modular and flexible so that additional analyses can easily be added; in fact, this is essential to encourage the community-based approach to its scientific development. A set of standardized recipes for each scientific topic reproduces specific diagnostics or performance metrics that have demonstrated their importance in ESM evaluation in the peer-reviewed literature. Scientific themes include selected Essential Climate Variables, a range of known systematic biases common to ESMs such as coupled tropical climate variability, monsoons, Southern Ocean processes, continental dry biases and soil hydrology-climate interactions, as well as atmospheric CO2 budgets, tropospheric and stratospheric ozone, and tropospheric aerosols. We will outline the main functional characteristics of ESMValTool Version 2; we will also introduce the reader to the current set of diagnostics and the methods they can use to contribute to its development.

How to cite: Predoi, V., Andela, B., De Mora, L., and Lauer, A.: ESMValTool - introducing a powerful model evaluation tool, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-19181, https://doi.org/10.5194/egusphere-egu2020-19181, 2020.

D698 |
Philipp S. Sommer, Ronny Petrik, Beate Geyer, Ulrike Kleeberg, Dietmar Sauer, Linda Baldewein, Robin Luckey, Lars Möller, Housam Dibeh, and Christopher Kadow

The complexity of Earth System and Regional Climate Models represents a considerable challenge for developers. Tuning or improving one aspect of a model can unexpectedly decrease the performance of others and introduce hidden errors. The main reasons are the multitude of output parameters and the shortage of reliable and complete observational datasets. One possibility to overcome these issues is a rigorous and continuous scientific evaluation of the model. This requires standardized model output and, most notably, standardized observational datasets. Additionally, in order to reduce the extra burden for the individual scientist, this evaluation has to be as close as possible to the standard workflow of the researcher, and it needs to be flexible enough to adapt to new scientific questions.

We present the Free Evaluation System Framework (Freva) implementation within the Helmholtz Coastal Data Center (HCDC) at the Institute of Coastal Research in the Helmholtz-Zentrum Geesthacht (HZG). Various plugins into the Freva software, namely the HZG-EvaSuite, use observational data to perform a standardized evaluation of the model simulation. We present a comprehensive data management infrastructure that copes with the heterogeneity of observations and simulations. This web framework comprises a FAIR and standardized database of both large-scale and in-situ observations exported to a format suitable for data-model intercomparisons (particularly netCDF following the CF-conventions). Our pipeline links the raw data of the individual model simulations (i.e. the production of the results) to the finally published results (i.e. the released data).

Another benefit of the Freva-based evaluation is the enhanced exchange between the different compartments of the institute, particularly between the model developers and the data collectors, as Freva contains built-in functionalities to share and discuss results with colleagues. We will furthermore use the tool to strengthen the active communication with the data and software managers of the institute to generate or adapt the evaluation plugins.

How to cite: Sommer, P. S., Petrik, R., Geyer, B., Kleeberg, U., Sauer, D., Baldewein, L., Luckey, R., Möller, L., Dibeh, H., and Kadow, C.: Integrating Model Evaluation and Observations into a Production-Release Pipeline, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-19298, https://doi.org/10.5194/egusphere-egu2020-19298, 2020.

D699 |
Christopher Kadow, Sebastian Illing, Oliver Kunst, Thomas Schartner, Jens Grieger, Mareike Schuster, Andy Richling, Ingo Kirchner, Henning Rust, Ulrich Cubasch, and Uwe Ulbrich

The Free Evaluation System Framework (Freva - freva.met.fu-berlin.de) is a software infrastructure for standardized data and tool solutions in Earth system science. Freva runs on high performance computers to handle customizable evaluation systems of research projects, institutes or universities. It combines different software technologies into one common hybrid infrastructure, including all features present in the shell and web environment. The database interface satisfies the international standards provided by the Earth System Grid Federation (ESGF). Freva indexes different data projects into one common search environment by storing the metadata of the self-describing model, reanalysis and observational data sets in a database. This metadata system, with its advanced but easy-to-handle search tool, helps users, developers and their plugins to retrieve the required information. A generic application programming interface (API) allows scientific developers to connect their analysis tools with the evaluation system independently of the programming language used. Users of the evaluation techniques benefit from the common interface of the evaluation system without any need to understand the different scripting languages. Facilitating the provision and usage of tools and climate data automatically increases the number of scientists working with the data sets and identifying discrepancies. The integrated webshell (shellinabox) adds a degree of freedom in the choice of the working environment and can be used as a gateway to the research project's HPC. Plugins are able to integrate their e.g. post-processed results into the database of the user. This allows e.g. post-processing plugins to feed statistical analysis plugins, which fosters an active exchange between plugin developers of a research project. Additionally, the history and configuration sub-system stores every analysis performed with the evaluation system in a database. Configurations and results of the tools can be shared among scientists via shell or web system. Therefore, plugged-in tools benefit from transparency and reproducibility. Furthermore, if configurations match while starting an evaluation plugin, the system suggests to use results already produced by other users, saving CPU/h, I/O, disk space and time. The efficient interaction between different technologies improves the Earth system modeling science framed by Freva.
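
The reuse of results for matching configurations can be illustrated with a short sketch. It is not Freva's implementation or API: the `History` class, the `config_key` helper, and the in-memory dictionary standing in for the database are hypothetical. The sketch also reuses a matching result automatically, whereas the abstract describes the system as suggesting the reuse to the user.

```python
import hashlib
import json


def config_key(plugin, config):
    """Build a stable key from the plugin name and its configuration."""
    # sort_keys makes the key independent of the order of the options.
    payload = json.dumps({"plugin": plugin, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class History:
    """Hypothetical stand-in for a history database of performed analyses."""

    def __init__(self):
        self._runs = {}

    def run(self, plugin, config, compute):
        """Return (result, reused); compute only if no matching run exists."""
        key = config_key(plugin, config)
        if key in self._runs:
            return self._runs[key], True
        result = compute(config)
        self._runs[key] = result
        return result, False


def expensive_analysis(config):
    """Placeholder for a costly evaluation plugin."""
    return sum(config["values"]) / len(config["values"])


history = History()
first, reused_first = history.run("mean", {"values": [1, 2, 3]}, expensive_analysis)
second, reused_second = history.run("mean", {"values": [1, 2, 3]}, expensive_analysis)
```

Hashing a canonical serialization of the configuration means that two users who choose the same options in a different order still map to the same stored run.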

New features and aspects of further development and collaboration are discussed.


How to cite: Kadow, C., Illing, S., Kunst, O., Schartner, T., Grieger, J., Schuster, M., Richling, A., Kirchner, I., Rust, H., Cubasch, U., and Ulbrich, U.: Freva - Free Evaluation System Framework - New Aspects and Features, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-21666, https://doi.org/10.5194/egusphere-egu2020-21666, 2020.

D700 |
Klaus Zimmermann and Lars Bärring

Climate indices play an important role in the practical use of climate and weather data. Their application spans a wide range of topics, from impact assessment in agriculture and urban planning, through indispensable advice in the energy sector, to important evaluation in the climate science community. Several widely used standard sets of indices exist through long-standing efforts of WMO and WCRP Expert Teams (ETCCDI and ET-SCI), as well as European initiatives (ECA&D) and more recently Copernicus C3S activities. They, however, focus on the data themselves, leaving much of the metadata to the individual user. Moreover, these core sets of indices lack a coherent metadata framework that would allow for the consistent inclusion of new indices that continue to be considered every day.

In the meantime, the treatment of metadata in the wider community has received much attention. Within the climate community, efforts such as the CF convention and the much-expanded scope and detail of metadata in CMIP6 have greatly improved the clarity and long-term usability of many aspects of climate data.

We present a novel approach to metadata for climate indices. Our format describes the existing climate indices consistent with the established standards, adding metadata along the lines of existing metadata specifications. The formulation of these additions in a coherent framework encompassing most of the existing climate index standards allows for its easy extension and inclusion of new climate indices as they are developed.

We also present Climix, a new Python software for the calculation of indices based on this description. It can be seen as an example implementation of the proposed standard and features high-performance calculations based on state-of-the-art infrastructure, such as Iris and Dask. This way, it offers shared memory and distributed parallel and out-of-core computations, enabling the efficient treatment of large data volumes as incurred by the high resolution, long time-series of current and future datasets.
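
How a metadata description can drive an index calculation is sketched below. This is not the Climix format or API: the field names, the `count_above` operator, and the `compute_index` function are hypothetical. The index itself is the ETCCDI "summer days" index, the number of days with daily maximum temperature above 25 °C.

```python
# Hypothetical metadata description of one climate index; the field names
# are illustrative and NOT the format proposed in the abstract.
SUMMER_DAYS = {
    "name": "su",
    "long_name": "Number of summer days",
    "units": "days",
    "input": {"variable": "tasmax", "units": "degC"},
    "operator": "count_above",
    "threshold": 25.0,
}

# Registry of generic operators that index definitions can refer to.
OPERATORS = {
    "count_above": lambda values, threshold: sum(1 for v in values if v > threshold),
}


def compute_index(definition, data):
    """Compute an index from daily data, checking the input against the metadata."""
    expected = definition["input"]
    if data["variable"] != expected["variable"]:
        raise ValueError("unexpected input variable: " + data["variable"])
    if data["units"] != expected["units"]:
        raise ValueError("unexpected input units: " + data["units"])
    operator = OPERATORS[definition["operator"]]
    return {
        "name": definition["name"],
        "long_name": definition["long_name"],
        "units": definition["units"],
        "value": operator(data["values"], definition["threshold"]),
    }


# Made-up daily maximum temperatures for four days.
DAILY_TASMAX = {"variable": "tasmax", "units": "degC", "values": [24.0, 25.0, 25.1, 30.0]}

RESULT = compute_index(SUMMER_DAYS, DAILY_TASMAX)
```

Keeping the definition separate from the operator means that a new threshold-based index only needs a new metadata entry, which mirrors the extensibility argument made for the proposed metadata framework.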

How to cite: Zimmermann, K. and Bärring, L.: Climate Index Metadata and its Implementation, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-22155, https://doi.org/10.5194/egusphere-egu2020-22155, 2020.