ES1.2
Creating value through Open Data in the cloud

Convener: Hella Riede | Co-conveners: Renate Hagedorn, Roope Tervo, Björn Reetz, Håvard Futsæter
Lightning talks | Tue, 07 Sep, 09:00–10:30 (CEST)
Public information:

10:00 - 10:30 Breakout rooms for this session

Room 1: chaired by Björn Reetz and Håvard Futsæter

  1. From open data to global digital public good  |  Håvard Futsæter
  2. FAIR principles for climate services information systems  |  Nils Hempelmann
  3. Agroclimatic atlas – prototype  |  Pavel Hájek
  4. Interactive access to climate data from Germany  |  Frank Kratzenstein
  5. DWD Geoportal – Converging open data, metadata and documentation in a user-friendly way  |  Björn Reetz

Room 2: chaired by Roope Tervo and Hella Riede

  1. Serving Open Data from mixed on-premise and cloud environment at the Finnish Meteorological Institute  |  Mikko Visa
  2. Making ECMWF Open Data more easily accessible via cloud-based services  |  Julia Wagemann
  3. NWP Data availability notifications for meteorological workflows across HPC and Cloud data centres  |  Claudio Iacopino
  4. Towards a modernized Copernicus Climate and Atmosphere Data Stores  |  Angel Alos

Lightning talks: Tue, 7 Sep

Chairpersons: Hella Riede, Håvard Futsæter, Roope Tervo
09:00–09:05
09:05–09:10 | EMS2021-200
From open data to global digital public good
Håvard Futsæter

MET Norway has had an open data policy for many years. A permissive open data license and a freely accessible service for obtaining the data are the first step. However, the data are not useful until they are understood and used in decision-making.

MET Norway serves many user groups, many of which have very different needs for open meteorological data. To cater for these different needs, MET Norway provides multiple distribution services. One of our most important open data services is the MET Norway Weather API, a global location-based time series forecast service (https://developer.yr.no/featured-products/forecast/).
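As a rough illustration of what a request to a location-based time series forecast service can look like, the sketch below builds a request for a single point and extracts air temperature from a response. The endpoint path, the JSON field names, and the contact string are assumptions that are not stated in the abstract and should be checked against the service documentation; the sample response is made up for the example.

```python
import json
from typing import List, Tuple
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for the location-based forecast; not given in the abstract.
BASE_URL = "https://api.met.no/weatherapi/locationforecast/2.0/compact"


def build_request(lat: float, lon: float, contact: str) -> Request:
    """Build (but do not send) a forecast request for a single point."""
    # Rounding to four decimals is an illustrative choice to limit URL variety.
    query = urlencode({"lat": round(lat, 4), "lon": round(lon, 4)})
    # An identifying User-Agent string lets the operator reach heavy users.
    return Request(f"{BASE_URL}?{query}", headers={"User-Agent": contact})


def temperatures(payload: dict) -> List[Tuple[str, float]]:
    """Return (time, air temperature) pairs from a response in the assumed layout."""
    return [
        (step["time"], step["data"]["instant"]["details"]["air_temperature"])
        for step in payload["properties"]["timeseries"]
    ]


# Made-up sample response that follows the assumed layout.
SAMPLE = json.loads("""
{"properties": {"timeseries": [
  {"time": "2021-09-07T09:00:00Z",
   "data": {"instant": {"details": {"air_temperature": 14.2}}}},
  {"time": "2021-09-07T10:00:00Z",
   "data": {"instant": {"details": {"air_temperature": 15.0}}}}
]}}
""")

request = build_request(59.9139, 10.7522, "example-app/0.1 contact@example.org")
print(request.full_url)
print(temperatures(SAMPLE))
```

The request is only constructed here, not sent, so the sketch runs without network access; sending it would be a matter of passing the `Request` object to `urllib.request.urlopen`.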

MET Norway has recently joined the Digital Public Goods Alliance to help reach the Sustainable Development Goals (SDGs) (https://sdgs.un.org/goals) by leveraging our MET Norway Weather API service as a digital public good.

“The Digital Public Goods Alliance is a multi-stakeholder initiative with a mission to accelerate the attainment of the sustainable development goals in low- and middle-income countries by facilitating the discovery, development, use of, and investment in digital public goods.” (https://digitalpublicgoods.net/about/)

Moving from open data to a digital public good has meant taking a more active part in identifying, exploring and understanding the needs of low- and middle-income countries. The needs considered are both end-user needs and gaps in tools and competencies across the value chain. We are trying to find ways in which our data and services can help meet those needs in an operationally sustainable way by co-creating applications built on top of our services.

In this presentation we will first describe our experience with serving open and free weather forecast data. We will then describe the challenges in moving from open data to working with our data as an SDG.

The presentation will be focused both on user needs and on technical challenges connected to running a global freely available open data service.

How to cite: Futsæter, H.: From open data to global digital public good, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-200, https://doi.org/10.5194/ems2021-200, 2021.

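09:10–09:15 | EMS2021-448
Serving Open Data from mixed on-premise and cloud environment at the Finnish Meteorological Institute
Mikko Visa and Roope Tervo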

The Finnish Meteorological Institute has a long history of open data. Partly as a result of the INSPIRE directive, almost all important data were opened back in 2013. We therefore have a long record of data usage as well as experience with technical solutions and user needs. The presentation will outline the current status and future development, keeping in mind the upcoming WMO WIS2 development as well as the Open Data Directive with its High Value Dataset proposal, which will very likely feature meteorological datasets.

Data are provided via machine-readable interfaces as well as human-usable web interfaces. We use on-premise storage and interfaces and also offer cloud-based distribution, such as through the Amazon Public Dataset program. The current operational interfaces are based on WFS 2.0 and WMS. The most recently added datasets include weather and flood warnings in Common Alerting Protocol (CAP) format, black carbon measurements, and a radar data archive on Amazon S3 in GeoTIFF and HDF5 formats. Development is starting on providing data via even more developer-friendly interfaces such as OGC API - Features. New data are also being added continuously based on our own and user needs.
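To give a feel for the machine-readable WFS 2.0 interface mentioned above, the following sketch assembles a key-value GetFeature request that uses a stored query. The endpoint, the stored-query identifier, and the `place`, `starttime` and `endtime` parameter names are hypothetical placeholders; real services publish their own stored queries and parameters.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and stored-query identifier, for illustration only.
WFS_ENDPOINT = "https://example.org/wfs"
STORED_QUERY = "example::observations::weather"


def wfs_getfeature_url(place: str, start: str, end: str) -> str:
    """Assemble a WFS 2.0 GetFeature URL that invokes a stored query."""
    params = {
        # Standard WFS 2.0 key-value parameters.
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "storedquery_id": STORED_QUERY,
        # Stored-query arguments; these names are assumptions.
        "place": place,
        "starttime": start,
        "endtime": end,
    }
    return f"{WFS_ENDPOINT}?{urlencode(params)}"


print(wfs_getfeature_url("helsinki", "2021-09-07T00:00:00Z", "2021-09-07T06:00:00Z"))
```

A stored query keeps the client side simple: the server defines the filter logic once, and users only supply a handful of named arguments.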

An impact study was also conducted for the year 2018; it reveals which data are used and how they affect users and their potential businesses. Valuable information on users' future needs was also gathered, and the most important findings of the study will be presented during the session.

How to cite: Visa, M. and Tervo, R.: Serving Open Data from mixed on-premise and cloud environment at the Finnish Meteorological Institute, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-448, https://doi.org/10.5194/ems2021-448, 2021.

09:15–09:20 | EMS2021-488
FAIR principles for climate services information systems
Nils Hempelmann, Ingo Simonis, Carsten Ehbrecht, and David Huard

Ongoing climate change is increasingly affecting ecosystems and living conditions. To understand climate change effects at all scales, from regional to global, and to develop appropriate response strategies, reliable and easily accessible location-specific climate information is crucial. The United Nations framework of climate change policy emphasizes the role of open data as an essential component in implementing appropriate climate change strategies efficiently. Data offered at the various portals and climate services need to be Findable, Accessible, Interoperable, and Reusable (FAIR). This is particularly important when several communities need to work together to develop the most effective response strategies. These communities include not only climate scientists and meteorologists, but also climate impact analysts, hydrologists, agronomists, urban planners, ecologists, and many more. A search of the web for available data shows that there is no shortage of portal solutions built upon climate data archives. In the past, however, portal solutions have often been targeted at a specific, and sometimes rather small, group of users within a single community. Cross-community integration, and thus enhanced reusability and interoperability, was not in focus. Owing to recent and ongoing international cross-domain efforts, FAIR principles are increasingly respected in the portal architectures of the information systems themselves. For example, the Open Geospatial Consortium (OGC) develops standards and best practices that enable FAIR principles across communities.

FAIR principles across communities require a set of essential ingredients to work effectively. These ingredients include metadata models that allow discovery (Findable), interfaces to access the data (Accessible), data models that are well documented (Interoperable) and can be efficiently consumed by others (Reusable). Because data volumes are continuously growing and therefore require new approaches for efficient data processing, OGC has extended the ‘Reusable’ component in FAIR. ‘Reusable’ now includes mechanisms for executing applications close to the physical location of the data. What was previously a data provisioning system now needs to be extended to support processing capacities up to the level where user-defined applications can be deployed and executed. In a sense, for data to be FAIR, it needs to be accompanied by equally FAIR services. 

This presentation shows current realisations of leading climate services information systems that implement the extended FAIR principle. The presentation will sort out the roles and capabilities of standardized web APIs that can be assembled in line with data and processing environments to provide interoperable climate data across communities in the most efficient way. Once paired with OGC’s new “Applications-to-the-Data” architecture and strong metadata models, the web APIs enable effective integration of climate data with data from other disciplines within state-of-the-art cloud environments that feature reusability not only of data, but also of applications, data processes, and scientific workflows.

How to cite: Hempelmann, N., Simonis, I., Ehbrecht, C., and Huard, D.: FAIR principles for climate services information systems, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-488, https://doi.org/10.5194/ems2021-488, 2021.

09:20–09:25 | EMS2021-468
Making ECMWF Open Data more easily accessible via cloud-based services
Julia Wagemann, Umberto Modigliani, Stephan Siemen, Vasileios Baousis, and Florian Pappenberger

The European Centre for Medium-Range Weather Forecasts (ECMWF) is moving gradually towards an open data licence, aiming to make real-time forecast data available under a full, free and open data licence by 2025. The introduction of open data policies generally leads to an increase in data requests and a broader user base. A much larger community of diverse users will therefore be interested in accessing, understanding and using ECMWF Open Data (real-time). While an open data licence is an important prerequisite, it does not automatically lead to an increased uptake of open data. To increase the uptake of (open) data, Wilkinson et al. (2016) defined the FAIR principles, which emphasize the need to make data more ‘findable’, ‘accessible’, ‘interoperable’ and ‘reusable’.

In 2019, we conducted a web-based survey among users of big Earth data to obtain a better understanding of users’ needs in terms of the data they are interested in, the applications they need the data for, the way they access and process data and the challenges they face. The results show that users are particularly interested in meteorological and climate forecast data, but face challenges related to growing data volumes, data heterogeneity and limited processing capacities. At the same time, survey respondents showed an interest in using cloud-based services in the near future, but expressed the need for easier data discovery and for interoperability of data systems. Moreover, an ECMWF-supported activity that made a subset of ERA5 climate reanalysis data available to the user community of the Google Earth Engine platform revealed that interoperability of data systems is a growing bottleneck.

Conclusions from both activities are helping ECMWF to define the way forward in making ECMWF Open Data (real-time) more accessible via cloud-based services. In this presentation we would like to share and discuss lessons learned in making open data more easily ‘accessible’ and ‘interoperable’ and the role cloud-based services play in doing so. We will also cover our future plans.

How to cite: Wagemann, J., Modigliani, U., Siemen, S., Baousis, V., and Pappenberger, F.: Making ECMWF Open Data more easily accessible via cloud-based services, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-468, https://doi.org/10.5194/ems2021-468, 2021.

09:25–09:30 | EMS2021-475
Agroclimatic atlas – prototype
Karel Jedlička, Pavel Hájek, Tomáš Andrš, Otakar Čerba, Jiří Valeš, and František Zadražil

Our contribution presents a prototype of the Agroclimatic atlas, a web map application presenting the following agroclimatic factors: frost-free period, water balance, total precipitation, total solar radiation, last date with soil temperature above 10 °C for nitrogen application, number of days with growing temperatures for a crop, number of days with optimal growing temperatures for a crop, and heat stress units (HSU) for a crop. The factors are calculated based on algorithms described in Calculation of Agro-Climatic Factors from Global Climatic Data (Jedlička et al. 2021, doi: 10.3390/app11031245).
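To make two of these factors concrete, the sketch below derives a frost-free period and a count of growing days from daily temperature series. It assumes deliberately simple definitions (the longest run of consecutive days with minimum temperature above 0 °C, and a count of days whose mean temperature lies within a crop-specific range); these are not the algorithms of Jedlička et al. (2021), and the function names, thresholds and sample values are illustrative only.

```python
from typing import Sequence


def frost_free_period(tmin_c: Sequence[float], threshold_c: float = 0.0) -> int:
    """Length in days of the longest run with daily minimum above the threshold."""
    longest = current = 0
    for t in tmin_c:
        # Extend the current run on a frost-free day, otherwise reset it.
        current = current + 1 if t > threshold_c else 0
        longest = max(longest, current)
    return longest


def growing_days(tmean_c: Sequence[float], low_c: float, high_c: float) -> int:
    """Count days whose mean temperature lies within [low_c, high_c]."""
    return sum(low_c <= t <= high_c for t in tmean_c)


# Made-up daily values in degrees Celsius.
daily_tmin = [-2.0, 1.0, 3.0, 0.0, 2.0, 4.0, 5.0, -1.0]
daily_tmean = [4.0, 10.0, 18.0, 31.0, 25.0]

print(frost_free_period(daily_tmin))
print(growing_days(daily_tmean, low_c=10.0, high_c=30.0))
```

In practice the inputs would come from a gridded dataset, with the same calculation applied per grid cell and per year.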

The agroclimatic atlas application aims to provide a comprehensive retrospective overview of the agriculture-related climatic characteristics of an area of interest. The application can be used either by an individual farmer or by a precision farming expert exploring a wider area.

The principal source of climatic variables (such as temperature, rainfall, evaporation, runoff, and solar radiation) used in the atlas is the ERA5-Land dataset (available from the Copernicus Climate Change Service (C3S) through its Climate Data Store).

The current version of the Agroclimatic Atlas application is accessible from https://www.mdpi.com/2076-3417/11/3/1245#. This version is in Czech only and portrays data for Czechia for the past 10 years. However, the application is under ongoing development driven by the H2020 projects Stargate, Sieusoil, and Smartagrihubs. A newer version will therefore be presented at the conference. The first design concepts can be seen in Figure 1.

Figure 1. Mockup of the Agroclimatic atlas application, accessible from https://xd.adobe.com/view/65199b72-db2f-420a-aee2-bc90dc83aaea-304a/

How to cite: Jedlička, K., Hájek, P., Andrš, T., Čerba, O., Valeš, J., and Zadražil, F.: Agroclimatic atlas - prototype, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-475, https://doi.org/10.5194/ems2021-475, 2021.

09:30–09:35 | EMS2021-496
Interactive access to climate data from Germany
Frank Kratzenstein and Frank Kaspar

In recent years, the DWD has significantly expanded free access to its climate observations. A first step was a simple FTP site with the possibility to download archives with different data categories, e.g. national and international station-based meteorological data, derived parameters, gridded products, and special categories like phenological data. The data are based on the DWD's observation systems for Germany as well as on the DWD's international activities.

Based on the consistent implementation of OGC standards, interactive and user-friendly access to the data has been created with the development of the DWD climate portal.

In addition to browsing, previewing, running basic analyses and downloading the data, the available OGC services enable users to set up their own services on top of the DWD data. Along with the free and extended access to the data and services, users' demands on the availability, quality, and detail of the metadata have also increased significantly. Maintaining metadata and linking them to the open data and services remains a challenge. However, INSPIRE and WIGOS are paving the way to a unified solution and to overcoming these problems.

Another challenging requirement was to give users interactive access to long time series from gridded products. To accomplish this, we have moved away from the previous file-based approach and now store the raster data as a georaster in an Oracle database. This design allows combined analysis of raster and station data not only in the climate data portal but also in the central climate database.

The presentation will provide a technical and functional overview of the DWD climate data portal.

How to cite: Kratzenstein, F. and Kaspar, F.: Interactive access to climate data from Germany, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-496, https://doi.org/10.5194/ems2021-496, 2021.

09:35–09:40 | EMS2021-423
DWD Geoportal – Converging open data, metadata and documentation in a user-friendly way
Björn Reetz, Hella Riede, Dirk Fuchs, and Renate Hagedorn

Since 2017, Open Data has been a part of the DWD data distribution strategy. Starting with a small selection of meteorological products, the number of available datasets has grown continuously over the last years. Since the start, users have been able to access datasets anonymously via the website https://opendata.dwd.de to download file-based meteorological products. Free access and the variety of products have been welcomed by the general public as well as private met service providers. The more datasets are provided in a directory structure, however, the more tedious it is to find and select among all available data. Also, metadata and documentation were available, but on separate public websites. This turned out to be an issue, especially for new users of DWD's open data.

To help users explore the available datasets and quickly decide on their suitability for a certain use case, the Open Data team at DWD is developing a geoportal. It enables free-text search and combined access to data, metadata, and descriptions, together with interactive previews via OGC WMS.
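A map preview over OGC WMS comes down to a GetMap request. The sketch below assembles one using the standard WMS 1.3.0 parameters; the endpoint and layer name are hypothetical placeholders rather than actual DWD services, and the bounding box is only a rough outline of Germany.

```python
from typing import Sequence
from urllib.parse import urlencode


def wms_getmap_url(
    endpoint: str,
    layer: str,
    bbox: Sequence[float],
    width: int = 512,
    height: int = 512,
    crs: str = "EPSG:4326",
) -> str:
    """Assemble a WMS 1.3.0 GetMap URL that returns a PNG preview."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,
        # In WMS 1.3.0 with EPSG:4326 the axis order is latitude, longitude.
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer; bbox is (min lat, min lon, max lat, max lon).
print(wms_getmap_url("https://example.org/wms", "example:temperature",
                     (47.0, 5.5, 55.5, 15.5)))
```

Because the preview is just an image URL, a portal can embed it directly next to the dataset description without transferring the underlying data.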

Cloud technology is a suitable way forward for hosting the geoportal along with the data in its operational state. Benefits are expected for the easy integration of rich APIs with the geoportal, and for the flexible and fast deployment and scaling of optional or prototypical services such as WMS-based previews. Flexibility is also mandatory to respond to user demand that fluctuates with the time of day and with critical weather situations; this is supported by containerization. The growing overall volume of meteorological data at DWD may make it necessary to let customers bring their code to the data – for on-demand processing including slicing and interpolation – instead of transferring files to every customer. Shared cloud instances are the ideal interface for this purpose.

The contribution will outline a prototype version of the new geoportal and discuss further steps for launching it to the public.

How to cite: Reetz, B., Riede, H., Fuchs, D., and Hagedorn, R.: DWD Geoportal – Converging open data, metadata and documentation in a user-friendly way, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-423, https://doi.org/10.5194/ems2021-423, 2021.

09:40–09:55 | EMS2021-17 | solicited
NWP Data availability notifications for meteorological workflows across HPC and Cloud data centres
Claudio Iacopino, James Hawkes, Tiago Quintino, and Baudouin Raoult

Recent adoption of Open Data policies and investments towards Cloud-based platforms have attracted a growing number of consumers of ECMWF data. An example of these initiatives is the European Weather Cloud (EWCloud), where users wish to run automated, real-time tasks or workflows closer to the latest data produced by the model run in the ECMWF HPC facility, thus avoiding costly data transfers out of the data centre. This trend is likely to increase together with the exponential growth of weather forecast data volume. It is expected that in the next few years, taking into account resolution upgrades and more complex model physics, the raw forecast data will exceed a petabyte per day. From an operational perspective, this convergence in the use of HPC and cloud infrastructures is dependent on timely synchronisation with the forecast schedule. A mechanism is needed to notify the consumers of specific data availability in a scalable manner and provide the capability to automatically trigger their workflows based on this data.

To accomplish this, we are developing a system named "Aviso"¹, designed to notify users of the availability of real-time forecast data or derived products, and to trigger user-defined workflows automatically. End-users can build their workflows based on events, using a When <this>... Do <that> logic directly linked to ECMWF metadata semantics. The system is composed of a server application based on a persistent key-value store, leveraging modern technologies such as etcd, to provide consistency, transactionality, reliability and scalability to the end-users. The client side is a lightweight Python application providing a CLI as well as a Python API for easy integration into users’ workflows. Finally, the notifications can be exchanged using CloudEvents messages, allowing workflows that span multiple data centres and cloud-based infrastructures.
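The When <this>... Do <that> pattern can be sketched in a few lines of plain Python. The code below is not Aviso's API: the class names, the in-memory list standing in for the key-value store, the metadata keys, and the event `type` and `source` values are all hypothetical. Only the four envelope attributes `specversion`, `id`, `source` and `type` follow the CloudEvents specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class Listener:
    when: Dict[str, str]         # metadata values that must all match
    do: Callable[[Event], None]  # action to run when they do


@dataclass
class Notifier:
    """In-memory stand-in for a notification server (hypothetical)."""
    listeners: List[Listener] = field(default_factory=list)

    def register(self, when: Dict[str, str], do: Callable[[Event], None]) -> None:
        self.listeners.append(Listener(when, do))

    def publish(self, event: Event) -> int:
        """Run every listener whose conditions match; return how many fired."""
        data = event["data"]
        fired = 0
        for listener in self.listeners:
            if all(data.get(key) == value for key, value in listener.when.items()):
                listener.do(event)
                fired += 1
        return fired


def make_event(event_id: str, data: Dict[str, str]) -> Event:
    """Wrap metadata in a CloudEvents-style envelope (type and source made up)."""
    return {
        "specversion": "1.0",
        "id": event_id,
        "source": "/example/forecast",
        "type": "org.example.data.available",
        "data": data,
    }


triggered: List[str] = []
notifier = Notifier()
# When the 24-hour step of the operational stream is available, do: record it.
notifier.register({"stream": "oper", "step": "24"}, lambda e: triggered.append(e["id"]))
notifier.publish(make_event("evt-1", {"stream": "oper", "step": "24", "date": "20210907"}))
notifier.publish(make_event("evt-2", {"stream": "oper", "step": "48", "date": "20210907"}))
print(triggered)
```

Matching on metadata keys rather than file names is what lets a workflow subscribe to "this product, this step" without knowing where or how the data are stored.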

This presentation will show how to leverage Aviso for scheduling weather-related workflows in the context of the European Horizon 2020 projects LEXIS and HiDALGO. The LEXIS project focuses on how HPC and cloud systems interact to enable complex workflows, and is demonstrating this concept through three large-scale socio-economic pilots, targeting aeronautics, weather & climate, and catastrophe alert systems. The HiDALGO project focuses on improving data-centric, on-demand computational modelling workflows for accurate policy-making in the domain of Global Challenges, such as human migration, urban air pollution, the COVID-19 pandemic and malicious information in social media. Aviso is also a component of ECMWF's Scalability Programme, and is being introduced in pre-operational status to ensure scalable data availability notification to data consumers.


This work has received funding from the European Union’s H2020 research and innovation programme under grant agreements number 825532 and 824115.

Footnotes:

1) "Aviso" means 'notification' in multiple Latin-based languages.

How to cite: Iacopino, C., Hawkes, J., Quintino, T., and Raoult, B.: NWP Data availability notifications for meteorological workflows across HPC and Cloud data centres, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-17, https://doi.org/10.5194/ems2021-17, 2021.

09:55–10:00 | EMS2021-322
Towards a modernized Copernicus Climate and Atmosphere Data Stores
Angel Alos, Baudouin Raoult, James Varndell, Edward Comyn-Platt, and Chiara Cagnazzo

The Climate (CDS) and Atmosphere (ADS) Data Stores are instances of a common underlying infrastructure historically referred to as the CDS. The Data Stores support the implementation of the Climate Change (C3S) and Atmosphere Monitoring (CAMS) Services under the auspices of Copernicus, the European Union's Earth Observation Programme, whose implementation is entrusted to the European Centre for Medium-Range Weather Forecasts (ECMWF). Both are highly visible web-based services with a vibrant community of users, including policymakers, businesses and scientists, whom they help to investigate and tackle climate change and atmosphere monitoring challenges.

The CDS infrastructure is designed as a distributed system and an open framework which provides web-based and API-based retrieval facilities for a wide and expanding catalogue of datasets, applications and other digital information. It also provides a development platform (Toolbox) which allows the creation of web-based applications operating on the datasets and products available in the catalogue. These applications are subsequently made available to end-users. The infrastructure is hosted in a dedicated in-house Cloud environment.
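An API-based retrieval might look roughly like the sketch below, which uses the third-party `cdsapi` Python client. The dataset name, variable name and request keys are assumptions based on how the client has commonly been used, and they may differ between catalogue entries and Data Store versions; the helper function names are made up for the example. The client is imported lazily so that building a request works without it installed.

```python
from typing import Dict, Iterable, List, Union

RequestDict = Dict[str, Union[str, List[str]]]


def build_request(variable: str, year: int, month: int,
                  days: Iterable[int], hours: Iterable[int]) -> RequestDict:
    """Build a request dictionary (key names are assumptions)."""
    return {
        "variable": variable,
        "year": str(year),
        "month": f"{month:02d}",
        "day": [f"{d:02d}" for d in days],
        "time": [f"{h:02d}:00" for h in hours],
        "format": "netcdf",
    }


def retrieve(dataset: str, request: RequestDict, target: str) -> None:
    """Submit the request; needs the cdsapi package and configured credentials."""
    import cdsapi  # imported here so build_request works without the client

    cdsapi.Client().retrieve(dataset, request, target)


request = build_request("2m_temperature", 2020, 7, days=[1, 2], hours=[0, 12])
print(request)
# Assumed dataset name; uncomment to run with a configured account:
# retrieve("reanalysis-era5-land", request, "era5_land_t2m.nc")
```

Separating request construction from submission makes the request easy to inspect, log, or reuse from a web form and from a script alike.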

Having grown at a steady rate in terms of users, functionality, workload and available content since its official opening, the infrastructure is now set to be further improved in the coming phase of Copernicus, driven by the following objectives:

  • capitalize on operational experience, user feedback, lessons learned and know-how from the current Data Stores to move to a modern, more reliable and interoperable platform;
  • take up modernised technologies and standards which have evolved since the initial implementation of the current infrastructure;
  • evolve the system architecture so as to take full advantage of cloud computing technologies and the underlying cloud infrastructure, such as containerization;
  • embrace open source scientific software and ensure compatibility with state-of-the-art solutions such as machine learning, data cubes and interactive notebooks;
  • strengthen synergies with the DIAS WEkEO platform and improve the capacity, efficiency, interoperability and reliability of shared interfaces and resources;
  • provide improved and flexible access to data and toolbox capabilities from multiple development platforms.

One of the components at the core of this reengineering exercise will be the Toolbox. The foundation of the future toolbox implementation will be a suite of quality-assured, open source Python libraries for performing scientific analysis and visualisation, ensuring compatibility with a broader range of Python tools already familiar to the scientific community. The implementation will support two modes of operation: on the one hand, a toolbox integrated within the Data Store web portal, providing fast and efficient access to catalogued data by taking full advantage of the computation resources and functionalities provided by the in-house Cloud infrastructure; on the other, a standalone version which will allow users to install and run the toolbox software locally.

The platforms mentioned above can be accessed here: Climate Data Store (http://cds.climate.copernicus.eu/), Atmosphere Data Store (http://ads.atmosphere.copernicus.eu/), DIAS WEkEO (https://www.wekeo.eu/).

How to cite: Alos, A., Raoult, B., Varndell, J., Comyn-Platt, E., and Cagnazzo, C.: Towards a modernized Copernicus Climate and Atmosphere Data Stores, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-322, https://doi.org/10.5194/ems2021-322, 2021.

10:00–10:30
