Displays

ESSI2.19

Management and integration of environmental observation data

Together with the rapid development of sensor technologies and the implementation of environmental observation networks (e.g. MOSES, TERENO, Digital Earth, eLTER, CUAHSI, ICOS, ENOHA, …), a large number of data infrastructures are being created to manage and provide access to observation data. However, significant advances in earth system understanding can only be achieved through better and easier integration of data from distributed infrastructures. In particular, methods for the automatic real-time processing and integration of observation data in models are required in many applications. The automatic, meaningful integration of these data sets is often hindered by semantic and structural differences between data and by poor metadata quality. Improvement in this field strongly depends on the capability to deal with fast-growing multi-parameter data and on efforts to employ data science methods, adapt new algorithms and develop digital workflows tailored to specific scientific needs. Automated quality assessment/control algorithms, data discovery and exploration tools, standardized interfaces and vocabularies, as well as data and processing exchange strategies and security concepts, are required to interconnect distributed data infrastructures. Besides the technical integration, the meaningful integration of data with different spatial and temporal support or measurement scales is also an important aspect. This session focuses on the specific requirements, techniques and solutions to process, provide and couple observation data from (distributed) infrastructures and to make observation data available for modelling and other scientific needs.

Public information:
16:15–16:25: Introduction
16:25–16:29: MOSAiC goes O2A - Arctic Expedition Data Flow from Observations to Archives
16:29–16:33: Implementing a new data acquisition system for the advanced integrated atmospheric observation system KITcube
16:33–16:37: Implementing FAIR principles for dissemination of data from the French OZCAR Critical Observatory network: the Theia/OZCAR information system
16:37–16:47: Discussion
16:47–16:51: Solutions for providing web-accessible, semi-standardised ecosystem research site information
16:51–16:55: Put your models in the web - less painful
16:55–16:59: Improving future optical Earth Observation products using transfer learning
16:59–17:03: Design and Development of Interoperable Cloud Sensor Services to Support Citizen Science Projects
17:03–17:13: Discussion
17:13–17:17: Providing a user-friendly outlier analysis service implemented as open REST API
17:17–17:21: Graph-based river network analysis for rapid discovery and analysis of linked hydrological data
17:21–17:25: SIMILE: An integrated monitoring system to understand, protect and manage sub-alpine lakes and their ecosystem
17:25–17:35: Discussion

Convener: Dorit Kerschke (ECS) | Co-conveners: Benedikt Gräler (ECS), Ralf Kunkel, Anusuriya Devaraju (ECS), Johannes Peterseil

Attendance: Tue, 05 May, 16:15–18:00 (CEST)

Chat time: Tuesday, 5 May 2020, 16:15–18:00

Chairperson: Dorit Kerschke, Ralf Kunkel, Anusuriya Devaraju, Benedikt Gräler, Johannes Peterseil

D868 |

EGU2020-17516

MOSAiC goes O2A - Arctic Expedition Data Flow from Observations to Archives

Antonia Immerz and Angela Schaefer and the AWI Data Centre MOSAiC Team

During the largest polar expedition in history, starting in September 2019, the German research icebreaker Polarstern spends a whole year drifting with the ice through the Arctic Ocean. The MOSAiC expedition takes the closest look ever at the Arctic, even throughout the polar winter, to gain fundamental insights and unique on-site data for a better understanding of global climate change. Hundreds of researchers from 20 countries are involved. Scientists all around the globe will use the data gathered in situ both immediately, in near-real-time mode, and long afterwards, taking climate research to a completely new level. Hence, proper data management, sampling strategies defined beforehand, and monitoring of the actual data flow, as well as processing, analysis and sharing of data during and long after the MOSAiC expedition, are the most essential tools for scientific gain and progress.

To prepare for that challenge, we adapted and integrated the research data management framework O2A “Data flow from Observations to Archives” to the needs of the MOSAiC expedition, both on board Polarstern and on land for data storage and access at the Alfred Wegener Institute Computing and Data Center in Bremerhaven, Germany. Our O2A framework assembles a modular research infrastructure comprising a collection of tools and services. These components allow researchers to register all necessary sensor metadata beforehand, linked to automated data ingestion, to ensure and monitor data flow, and to process, analyze, and publish data, turning the valuable and unique Arctic data into scientific outcomes. The framework further allows for the integration of data obtained with discrete sampling devices into the data flow.

These requirements have led us to adapt the generic and cost-effective framework O2A to enable, control, and access the flow of sensor observations to archives in a cloud-like infrastructure on board Polarstern and later on to land based repositories for international availability.

Major challenges for the MOSAiC-O2A data flow framework are (i) the increasing number and complexity of research platforms, devices, and sensors, (ii) the heterogeneous, interdisciplinary requirements regarding, e.g., satellite data, sensor monitoring, in situ sample collection, quality assessment and control, processing, analysis and visualization, and (iii) the demand for near-real-time analyses on board as well as on land with limited satellite bandwidth.

The key modules of O2A's digital research infrastructure established by AWI implement the FAIR principles:

SENSORWeb, to register sensor applications and sampling devices and capture controlled metadata before and alongside any measurements in the field
Data ingest, allowing researchers to feed data into storage systems and processing pipelines in a prepared and documented way, at best in controlled near real-time data streams
Dashboards allowing researchers to find and access data and share and collaborate among partners
Workspace enabling researchers to access and use data with research software utilizing a cloud-based virtualized infrastructure that allows researchers to analyze massive amounts of data on the spot
Archiving and publishing data via repositories and Digital Object Identifiers (DOI)
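
The coupling between sensor registration and data ingest listed above can be pictured with a small sketch. It is not the O2A implementation: the `SensorRegistry` class, the sensor identifiers and the record fields are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class SensorRegistry:
    """Hypothetical stand-in for a sensor metadata registry."""
    sensors: dict = field(default_factory=dict)

    def register(self, sensor_id: str, parameter: str, unit: str) -> None:
        # Metadata is captured before any measurement is ingested.
        self.sensors[sensor_id] = {"parameter": parameter, "unit": unit}


def ingest(registry: SensorRegistry, records: list) -> tuple:
    """Split records into accepted (known sensor) and rejected (unknown sensor)."""
    accepted, rejected = [], []
    for rec in records:
        meta = registry.sensors.get(rec["sensor_id"])
        if meta is None:
            rejected.append(rec)
        else:
            # Attach the registered metadata so the record is self-describing.
            accepted.append({**rec, **meta})
    return accepted, rejected


registry = SensorRegistry()
registry.register("vessel:ctd:01", parameter="water_temperature", unit="degC")
accepted, rejected = ingest(registry, [
    {"sensor_id": "vessel:ctd:01", "time": "2019-10-01T00:00:00Z", "value": -1.7},
    {"sensor_id": "unknown:99", "time": "2019-10-01T00:00:00Z", "value": 3.2},
])
```

Rejecting records from unregistered sensors is one possible design; a real system might instead queue them until the metadata has been supplied.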

How to cite: Immerz, A. and Schaefer, A. and the AWI Data Centre MOSAiC Team: MOSAiC goes O2A - Arctic Expedition Data Flow from Observations to Archives, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-17516, https://doi.org/10.5194/egusphere-egu2020-17516, 2020.

D869 |

EGU2020-10431

Implementing a new data acquisition system for the advanced integrated atmospheric observation system KITcube

Martin Kohler, Mahnaz Fekri, Andreas Wieser, and Jan Handwerker

KITcube (Kalthoff et al., 2013) is a mobile advanced integrated observation system for the measurement of meteorological processes within a volume of 10x10x10 km³. A large variety of instruments, from in-situ sensors to scanning remote sensing devices, is deployed during campaigns. The simultaneous operation and real-time instrument control needed for maximum instrument synergy require real-time data management designed to cover the various user needs: safe data acquisition, fast loading, compressed storage, easy data access, monitoring and data exchange. Large volumes of raw and semi-processed data of various types, from simple ASCII time series to high-frequency multi-dimensional binary data, provide abundant information but make the integration and efficient management of such data volumes a challenge.
Our data processing architecture is based on open source technologies and involves the following five sections: 1) Transferring: Data and metadata collected during a campaign are stored on a file server. 2) Populating the database: A relational database is used for time series data and a hybrid database model for very large, complex, unstructured data. 3) Quality control: Automated checks for data acceptance and data consistency. 4) Monitoring: Data visualization in a web-application. 5) Data exchange: Allows the exchange of observation data and metadata in specified data formats with external users.
The implemented data architecture and workflow are illustrated in this presentation using data from the MOSES project (http://moses.eskp.de/home).
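
As an illustration of the quality control step (automated checks for data acceptance and data consistency), the following minimal sketch flags a time series with a range check and a step check. The function name, thresholds and sample values are assumptions made for this example and are not taken from the KITcube system.

```python
def quality_flags(values, valid_min, valid_max, max_step):
    """Return one flag per value: 'ok', 'missing', 'range', or 'step'."""
    flags = []
    previous = None  # last value that passed all checks
    for v in values:
        if v is None:
            flags.append("missing")
            continue
        if not (valid_min <= v <= valid_max):
            flags.append("range")  # acceptance check: implausible value
            continue
        if previous is not None and abs(v - previous) > max_step:
            flags.append("step")   # consistency check: jump too large
            continue
        flags.append("ok")
        previous = v
    return flags


# Hypothetical air temperature series in degC
series = [12.1, 12.3, None, 55.0, 12.4, 19.9, 12.6]
flags = quality_flags(series, valid_min=-40.0, valid_max=50.0, max_step=5.0)
```

Comparing each value with the last accepted value, rather than with its direct predecessor, keeps a single spike from also flagging the sample that follows it.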

References:

Kalthoff, N., Adler, B., Wieser, A., Kohler, M., Träumner, K., Handwerker, J., Corsmeier, U., Khodayar, S., Lambert, D., Kopmann, A., Kunka, N., Dick, G., Ramatschi, M., Wickert, J., Kottmeier, C. (2013). KITcube - A mobile observation platform for convection studies deployed during HyMeX. Meteorologische Zeitschrift, 22(6), 633–647. doi:10.1127/0941-2948/2013/0542

How to cite: Kohler, M., Fekri, M., Wieser, A., and Handwerker, J.: Implementing a new data acquisition system for the advanced integrated atmospheric observation system KITcube, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-10431, https://doi.org/10.5194/egusphere-egu2020-10431, 2020.

D870 |

EGU2020-3708

Implementing FAIR principles for dissemination of data from the French OZCAR Critical Observatory network: the Theia/OZCAR information system

Isabelle Braud, Véronique Chaffard, Charly Coussot, Sylvie Galle, and Rémi Cailletaud

OZCAR-RI, the French Critical Zone Research Infrastructure, gathers 20 observatories that sample various compartments of the Critical Zone and have historically developed their own data management and distribution systems. However, these efforts have generally been conducted independently. This has led to a very heterogeneous situation, with different levels of development and maturity of the systems and a general lack of visibility of data from the entire OZCAR-RI community. To overcome this difficulty, a common Information System (Theia/OZCAR IS) was built to make these in situ observations FAIR (Findable, Accessible, Interoperable, Reusable). The IS will allow the data to be visible in the European Long Term Ecosystem Research Infrastructure (eLTER-RI), to which OZCAR-RI contributes.

The IS architecture was designed after consulting the users, data producers and IT teams involved in data management. A common data model including all the requested information and based on several metadata standards was defined to set up information flows between the observatories' IS and the Theia/OZCAR IS. Controlled vocabularies were defined to develop a data discovery web portal offering a faceted search with various criteria, including variable names and categories that were harmonized in a thesaurus published on the web. The presentation will describe the IS architecture, the pivot data model and the open source solutions used to implement the data portal that allows data discovery. It will also present future steps to implement data downloading and interoperability services that will allow a full implementation of the FAIR principles.
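
The idea of harmonizing observatory-specific variable names through a thesaurus and then searching by facet can be sketched as follows. The thesaurus entries, dataset records and function names are invented for this example; they do not reproduce the Theia/OZCAR data model.

```python
# Hypothetical mapping from observatory-specific names to harmonized terms.
THESAURUS = {
    "Tair": "air temperature",
    "T_air_2m": "air temperature",
    "Q": "river discharge",
    "debit": "river discharge",
}

# Hypothetical dataset descriptions as delivered by two observatories.
DATASETS = [
    {"id": "obsA-1", "observatory": "A", "variable": "Tair"},
    {"id": "obsA-2", "observatory": "A", "variable": "Q"},
    {"id": "obsB-1", "observatory": "B", "variable": "debit"},
]


def harmonize(dataset):
    """Add the harmonized variable name used by the common data model."""
    return {**dataset, "variable_harmonized": THESAURUS[dataset["variable"]]}


def faceted_search(datasets, **facets):
    """Keep datasets whose fields match every requested facet value."""
    return [d for d in datasets
            if all(d.get(key) == value for key, value in facets.items())]


catalogue = [harmonize(d) for d in DATASETS]
hits = faceted_search(catalogue, variable_harmonized="river discharge")
```

Here the search finds discharge data from both observatories even though each uses a different local variable name.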

How to cite: Braud, I., Chaffard, V., Coussot, C., Galle, S., and Cailletaud, R.: Implementing FAIR principles for dissemination of data from the French OZCAR Critical Observatory network: the Theia/OZCAR information system, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-3708, https://doi.org/10.5194/egusphere-egu2020-3708, 2020.

D871 |

EGU2020-5210

Solutions for providing web-accessible, semi-standardised ecosystem research site information

Christoph Wohner, Johannes Peterseil, Tomáš Kliment, and Doron Goldfarb

There are a number of systems dedicated to the storage of information about ecosystem research sites, often used for the management of such facilities within research networks or research infrastructures. If such systems provide interfaces for querying this information, these interfaces and especially their data formats may vary greatly with no established data format standard to follow.

DEIMS-SDR (Dynamic Ecological Information Management System - Site and Dataset Registry; https://deims.org) is one such service that allows registering and discovering long-term ecosystem research sites, along with the data gathered at those sites and networks associated with them. We present our approach to make the hosted information openly available via a REST-API. While this allows flexibility in the way information is structured, it also follows interoperability standards and specifications that provide clear rules on how to parse this information.

The REST-API follows the OpenAPI 3.0 specification, including the usage of JSON schemas for describing the exact structure of available records. In addition, DEIMS-SDR also issues persistent, unique and resolvable identifiers for sites independent of the affiliation with research infrastructures or networks.
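
To show what describing the exact structure of a record with a JSON schema buys a client, here is a minimal sketch that checks a record against a heavily reduced schema. The schema, the field names and the `validate` helper are assumptions for illustration, not the DEIMS-SDR schemas; the helper supports only the `type`, `required` and `properties` keywords, and a complete validator (for example the `jsonschema` package) would normally be used instead.

```python
import json

# Hypothetical, heavily reduced schema; real API schemas are richer.
SITE_SCHEMA = {
    "type": "object",
    "required": ["id", "title"],
    "properties": {
        "id": {"type": "string"},
        "title": {"type": "string"},
        "coordinates": {"type": "string"},
    },
}

JSON_TYPES = {"string": str, "object": dict, "number": (int, float), "array": list}


def validate(record, schema):
    """Return a list of problems; an empty list means the record conforms."""
    if not isinstance(record, JSON_TYPES[schema["type"]]):
        return [f"expected {schema['type']}"]
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing required property '{name}'")
    for name, rule in schema.get("properties", {}).items():
        if name in record and not isinstance(record[name], JSON_TYPES[rule["type"]]):
            problems.append(f"property '{name}' should be of type {rule['type']}")
    return problems


good = json.loads('{"id": "site-1", "title": "Example site"}')
bad = json.loads('{"id": 42}')
```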

The flexible design of the DEIMS-SDR data model and the underlying REST-API based approach provide a low threshold for incorporating information from other research domains within the platform itself as well as integrating its exposed metadata with third party information through external means.

How to cite: Wohner, C., Peterseil, J., Kliment, T., and Goldfarb, D.: Solutions for providing web-accessible, semi-standardised ecosystem research site information, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5210, https://doi.org/10.5194/egusphere-egu2020-5210, 2020.

D872 |

EGU2020-8671

Put your models in the web - less painful

Nils Brinckmann, Massimiliano Pittore, Matthias Rüster, Benjamin Proß, and Juan Camilo Gomez-Zapata

Today's Earth-related scientific questions are more complex and more interdisciplinary than ever, so much so that it is extremely challenging for single-domain experts to master all aspects of a problem at once. As a consequence, modular and distributed frameworks are increasingly gaining momentum, since they allow the collaborative development of complex, multidisciplinary processing solutions.

Technical implementations focus on the use of modern web technologies with their broad variety of standards, protocols and available development frameworks. RESTful services, one of the main drivers of the modern web, are often suboptimal for the implementation of complex scientific processing solutions. In fact, while they offer great flexibility, they also tend to be bound to very specific formats (and are often poorly documented).

With the introduction of the Web Processing Service (WPS) specifications, the Open Geospatial Consortium (OGC) proposed a standard for the implementation of a new generation of computing modules overcoming most of the drawbacks of the RESTful approach. WPS allows a flexible and reliable specification of input and output formats as well as the exploration of the services' capabilities with the GetCapabilities and DescribeProcess operations.
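
For orientation, the two discovery operations can be issued as key-value-pair GET requests. The sketch below only builds the request URLs; the endpoint and the process identifier are placeholders, and the `version` value assumes a WPS 1.0.0 server.

```python
from urllib.parse import urlencode


def wps_url(endpoint, request, **params):
    """Build a key-value-pair GET URL for a WPS operation."""
    query = {"service": "WPS", "request": request, **params}
    return f"{endpoint}?{urlencode(query)}"


ENDPOINT = "https://example.org/wps"  # placeholder endpoint

capabilities_url = wps_url(ENDPOINT, "GetCapabilities")
describe_url = wps_url(ENDPOINT, "DescribeProcess",
                       version="1.0.0", identifier="ExampleProcess")
```

A client would typically call GetCapabilities first to list the offered processes and then DescribeProcess to learn the inputs and outputs of one of them.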

The main drawback of the WPS approach with respect to RESTful services is that the latter can be easily implemented for any programming language, while the efficient integration of WPS is currently mostly relying on Java, C and Python implementations. In the framework of Earth Science Research we are often confronted with a plethora of programming languages and coding environments. Converting already existing complex scientific programs into a language suitable for WPS integration can be a daunting effort and may even result in additional errors being introduced due to conflicts and misunderstandings between the original code authors and the developers working on the WPS integration. Also the maintenance of these hybrid processing components is often very difficult since most scientists are not familiar with web programming technologies and conversely the web developers cannot (or do not have the time to) get adequately acquainted with the underlying science.

Facing these problems in the context of the RIESGOS project we developed a framework for a Java-based WPS server able to run any kind of scientific code scripts or command line programs. The proposed approach is based on the use of Docker containers encapsulating the running processes, and Docker images to manage all necessary dependencies.

A simple set of ASCII configuration files provides all information needed for WPS integration: how to call the program, how to give input parameters (including command line arguments and input files) and how to interpret the output of the program, both from stdout and from serialized files. A number of predefined format converters are available, and we also include mechanisms for extensions to allow maximum flexibility.
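
The configuration-driven approach can be sketched in a few lines. The configuration keys and the `run_process` helper below are hypothetical and do not reproduce the project's file format; the Docker layer is omitted, and a small inline Python program stands in for the scientific code.

```python
import subprocess
import sys

# Hypothetical configuration; field names are illustrative only.
CONFIG = {
    "command": [sys.executable, "-c",
                "import sys; print(float(sys.argv[1]) * 2)"],
    "inputs": [{"name": "magnitude", "use_as": "command_line_argument"}],
    "output": {"name": "result", "read_from": "stdout", "convert": float},
}


def run_process(config, inputs):
    """Call the configured program and convert its stdout to the declared type."""
    args = [str(inputs[spec["name"]]) for spec in config["inputs"]
            if spec["use_as"] == "command_line_argument"]
    completed = subprocess.run(config["command"] + args,
                               capture_output=True, text=True, check=True)
    convert = config["output"]["convert"]
    return {config["output"]["name"]: convert(completed.stdout.strip())}


result = run_process(CONFIG, {"magnitude": 3.5})
```

Because the wrapper only knows the command line and the output format, the wrapped program can be written in any language without being modified.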

The result is an encapsulated, modular, safe and extendable architecture that allows scientists to expose their scientific programs on the web with little effort, and to collaboratively create complex, multidisciplinary processing pipelines.

How to cite: Brinckmann, N., Pittore, M., Rüster, M., Proß, B., and Gomez-Zapata, J. C.: Put your models in the web - less painful, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8671, https://doi.org/10.5194/egusphere-egu2020-8671, 2020.

D873 |

EGU2020-9983

Improving future optical Earth Observation products using transfer learning

Peter Kettig, Eduardo Sanchez-Diaz, Simon Baillarin, Olivier Hagolle, Jean-Marc Delvit, Pierre Lassalle, and Romain Hugues

Pixels covered by clouds in optical Earth Observation images are not usable for most applications. For this reason, only images delivered with reliable cloud masks are eligible for automated or large-scale analysis. Current state-of-the-art cloud detection algorithms, both physical models and machine learning models, are specific to a mission or a mission type, with limited transferability. A new model has to be developed every time a new mission is launched. Machine learning may overcome this problem and, in turn, achieve state-of-the-art or even better performance by training the same algorithm on datasets from different missions. However, simulating products for upcoming missions is not always possible, and the actual products available are not sufficient to create a training dataset until well after launch. Furthermore, labelling data is time-consuming. Therefore, even once enough data is available, manually labelled data might not be available at all.

To solve this bottleneck, we propose a transfer-learning-based method using the available products of the current generation of satellites. These existing products are gathered in a database that is used to train a deep convolutional neural network (CNN) solely on those products. The trained model is applied to images from other, unseen sensors and the outputs are evaluated. We avoid manual labelling by automatically producing the ground-truth data with existing algorithms. Only a few semi-manually labelled images are used for qualifying the model, and even those need very little user input. This drastic reduction of user input limits subjectivity and reduces costs.

We provide an example of such a process by training a model to detect clouds in Sentinel-2 images, using as ground-truth the masks of existing state-of-the-art processors. Then, we apply the trained network to detect clouds in previously unseen imagery of other sensors such as the SPOT family or the High-Resolution (HR) Pleiades imaging system, which provide a different feature space.

The results demonstrate that the trained model is robust to variations within the individual bands resulting from different acquisition methods and spectral responses. Furthermore, the addition of geo-located auxiliary data that is independent from the platform, such as digital elevation models (DEMs), as well as simple synthetic bands such as the NDVI or NDSI, further improves the results.
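
The synthetic bands mentioned above are normalized differences of two spectral bands: NDVI is commonly computed as (NIR - Red) / (NIR + Red) and NDSI as (Green - SWIR) / (Green + SWIR). The sketch below applies this per pixel to hypothetical reflectance values; it uses plain lists for self-containment, whereas real imagery would be handled with array libraries.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b) per pixel; None where the sum is zero."""
    return [None if (a + b) == 0 else (a - b) / (a + b)
            for a, b in zip(band_a, band_b)]


# Hypothetical reflectance values for three pixels
red = [0.10, 0.30, 0.0]
nir = [0.50, 0.30, 0.0]
green = [0.40, 0.20, 0.0]
swir = [0.10, 0.20, 0.0]

ndvi = normalized_difference(nir, red)     # (NIR - Red) / (NIR + Red)
ndsi = normalized_difference(green, swir)  # (Green - SWIR) / (Green + SWIR)
```

Because the indices are ratios, they are less sensitive to overall brightness differences between sensors than the raw bands, which is one plausible reason they help a model generalize.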

In the future, this approach could be applied to new CNES missions, such as Microcarb or CO3D.

How to cite: Kettig, P., Sanchez-Diaz, E., Baillarin, S., Hagolle, O., Delvit, J.-M., Lassalle, P., and Hugues, R.: Improving future optical Earth Observation products using transfer learning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-9983, https://doi.org/10.5194/egusphere-egu2020-9983, 2020.

D874 |

EGU2020-13338

Design and Development of Interoperable Cloud Sensor Services to Support Citizen Science Projects

Henning Bredel, Simon Jirka, Joan Masó Pau, and Jaume Piera

Citizen Observatories are becoming a more and more popular source of input data in many scientific domains. This includes for example research on biodiversity (e.g. counts of specific species in an area of interest), air quality monitoring (e.g. low-cost sensor boxes), or traffic flow analysis (e.g. apps collecting floating car data).

For the collection of such data, different approaches exist. Besides frameworks providing re-usable software building blocks (e.g. wq framework, Open Data Kit), many projects rely on custom developments. However, these solutions are mainly focused on providing the necessary software components. Further work is necessary to set up the required IT infrastructure. In addition, aspects such as interoperability are usually given less consideration, which often leads to the creation of isolated information silos.

In our presentation, we will introduce selected activities of the European H2020 project COS4CLOUD (Co-designed citizen observatories for the EOS-Cloud). Among other objectives, COS4CLOUD aims at providing re-usable services for setting up Citizen Observatories based on the European Open Science (EOS) Cloud. We will especially discuss how it will make use of interoperability standards such as the Sensor Observation Service (SOS), SensorThings API as well as Observations and Measurements (O&M) of the Open Geospatial Consortium (OGC).

As a result, COS4CLOUD will not only facilitate the collection of Citizen Observatory data by reducing the work necessary to set up a corresponding IT infrastructure. It will also support the exchange and integration of Citizen Observatory data between different projects as well as the integration with other authoritative data sources. This shall increase the sustainability of data collection efforts as Citizen Science data may be used as input for many data analysis processes beyond the project that originally collected the data.
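
To give a flavour of what a standardized interface offers to data consumers, the sketch below builds a SensorThings API request for the most recent observations of one datastream, using the standard `$orderby`, `$top` and `$select` query options. The service root and the datastream identifier are placeholders and do not refer to a COS4CLOUD service.

```python
from urllib.parse import quote


def observations_url(base_url, datastream_id, top=10):
    """URL for the latest observations of one Datastream (SensorThings API)."""
    options = {
        "$orderby": "phenomenonTime desc",
        "$top": str(top),
        "$select": "phenomenonTime,result",
    }
    query = "&".join(f"{key}={quote(value, safe=',')}"
                     for key, value in options.items())
    return f"{base_url}/Datastreams({datastream_id})/Observations?{query}"


# Placeholder service root; a real deployment defines its own.
url = observations_url("https://example.org/sta/v1.1", 42, top=5)
```

The same request pattern works against any compliant server, which is what allows data from different citizen observatories to be combined without project-specific client code.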

How to cite: Bredel, H., Jirka, S., Masó Pau, J., and Piera, J.: Design and Development of Interoperable Cloud Sensor Services to Support Citizen Science Projects, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-13338, https://doi.org/10.5194/egusphere-egu2020-13338, 2020.

D875 |

EGU2020-14903

Providing a user-friendly outlier analysis service implemented as open REST API

Doron Goldfarb, Johannes Kobler, and Johannes Peterseil

As outliers in any data set may have detrimental effects on further scientific analysis, the measurement of any environmental parameter and the detection of outliers within these data are closely linked. However, outlier analysis is complicated, as the definition of an outlier is controversially discussed and thus - until now - vague. Nonetheless, multiple methods have been implemented to detect outliers in data sets. The application of these methods often requires some statistical know-how.

The present use case, developed as proof-of-concept implementation within the EOSC-Hub project, is dedicated to providing a user-friendly outlier analysis web-service via an open REST API processing environmental data either provided via Sensor Observation Service (SOS) or stored as data files in a cloud-based data repository. It is driven by an R-script performing the different operation steps consisting of data retrieval, outlier analysis and final data export. To cope with the vague definition of an outlier, the outlier analysis step applies numerous statistical methods implemented in various R-packages.
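
Applying several methods and comparing their verdicts is one way to cope with the vague definition of an outlier. The service described here relies on R packages; the Python sketch below is only an analogy that combines two common rules, Tukey's interquartile-range fences and a modified z-score based on the median absolute deviation. The thresholds and sample readings are assumptions for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Indices outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return {i for i, v in enumerate(values) if v < low or v > high}


def mad_outliers(values, threshold=3.5):
    """Indices whose modified z-score (based on the MAD) exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return set()
    return {i for i, v in enumerate(values)
            if abs(0.6745 * (v - median) / mad) > threshold}


# Hypothetical water temperature readings with one obvious spike
readings = [10.1, 10.3, 9.9, 10.0, 10.2, 25.0, 10.1, 9.8]

# Count how many methods flag each reading.
votes = {i: sum(i in method(readings) for method in (iqr_outliers, mad_outliers))
         for i in range(len(readings))}
```

Reporting the number of methods that agree, rather than a single yes/no verdict, leaves the final decision to the user.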

The web-service encapsulates the R-script behind a REST API which is described by a dedicated OpenAPI specification defining two distinct access methods (i.e. SOS- and file-based) and the required parameters to run the R-script. This formal specification is subsequently used to automatically generate a server stub based on the Python Flask framework which is customized to execute the R-script on the server whenever an appropriate web request arrives. The output is currently collected in a ZIP file which is returned after each successful web request. The service prototype is designed to be operated using generic resources provided by the European Open Science Cloud (EOSC) and the European Grid Initiative (EGI) in order to ensure sustainability and scalability.

Due to its user-friendliness and open availability, the presented web-service will facilitate access to standardized and scientifically-based outlier analysis methods not only for individual scientists but also for networks and research infrastructures like eLTER. It will thus contribute to the standardization of quality control procedures for data provision in distributed networks of data providers.

Keywords: quality assessment, outlier detection, web service, REST-API, eLTER, EOSC, EGI, EOSC-Hub

How to cite: Goldfarb, D., Kobler, J., and Peterseil, J.: Providing a user-friendly outlier analysis service implemented as open REST API, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-14903, https://doi.org/10.5194/egusphere-egu2020-14903, 2020.

D876 |

EGU2020-17318

Graph-based river network analysis for rapid discovery and analysis of linked hydrological data

Matt Fry and Jan Rosecký

Hydrological analyses generally require information from locations across a river system, and knowledge on how these locations are linked within that system. Hydrological monitoring data e.g. from sensors or samples of the status of river flow and water quality, and datasets on factors influencing this status e.g. sewage treatment input, riparian land use, lakes, abstractions, etc., are increasingly available as open datasets, sometimes via web-based APIs. However, retrieving information, for data discovery or for direct analysis, based on location within the river system is complex, and is therefore not a common feature of APIs for hydrological data.

We demonstrate an approach to extracting datasets based on river connectivity using a digital river network for the UK, converted to a directed graph, and the Python NetworkX package. This approach enables very rapid identification of upstream and downstream reaches and features for sites of interest, with speeds suitable for on-the-fly analysis. We describe how such an approach could be deployed within an API for data discovery and data retrieval, and demonstrate linking data availability information, capturing observed properties and time series metadata, from large sensor networks, in a JSON-LD format based on concepts drawn from SSN/SOSA and INSPIRE EMF. This approach has been applied to identify up- and downstream water quality monitoring sites for lakes within the UK Lakes Database for nutrient retention analysis, and to produce hierarchical datasets of river flow gauging stations to aid network understanding.
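
The core operation, finding everything upstream or downstream of a site in a directed graph, can be sketched with the standard library alone. The reach names and edges below are hypothetical; with NetworkX, the functions `networkx.ancestors` and `networkx.descendants` return the same sets for a `DiGraph`.

```python
from collections import defaultdict, deque

# Hypothetical reaches; each edge points downstream (from_reach, to_reach).
EDGES = [("A", "C"), ("B", "C"), ("C", "E"), ("D", "E"), ("E", "F")]

downstream_of = defaultdict(list)
upstream_of = defaultdict(list)
for source, target in EDGES:
    downstream_of[source].append(target)
    upstream_of[target].append(source)


def reachable(start, neighbours):
    """Breadth-first traversal returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


upstream = reachable("E", upstream_of)      # all reaches draining to E
downstream = reachable("C", downstream_of)  # all reaches below C
```

Each traversal visits every reach at most once, which is consistent with the on-the-fly speeds reported for such graph queries.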

How to cite: Fry, M. and Rosecký, J.: Graph-based river network analysis for rapid discovery and analysis of linked hydrological data, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-17318, https://doi.org/10.5194/egusphere-egu2020-17318, 2020.

D877 |

EGU2020-19393

SIMILE: An integrated monitoring system to understand, protect and manage sub-alpine lakes and their ecosystem

Daniele Strigaro, Massimiliano Cannata, Fabio Lepori, Camilla Capelli, Michela Rogora, and Maria Brovelli

Lakes are an invaluable natural and economic resource for the Insubric area, the geographical area between the Po River (Lombardy, Italy) and Monte Ceneri (Ticino, Switzerland). However, increasing human activity and the impacts of climate change are increasingly threatening the health of these resources. In this context, universities and local administrations of the two regions that share the trans-boundary lakes joined their efforts and started a project, named SIMILE, to develop a system for monitoring the status of the lakes, providing updated and continuous information to support their management. This project results from a multi-year collaboration between the two countries, Switzerland and Italy, formalized in the CIPAIS commission (www.cipais.org). The aim is to introduce an innovative information system based on the combination of an advanced automatic and continuous observation system, high-resolution remote sensing data processing, citizen science, and ecological and physical models. The project will capitalize on the knowledge and experience of the resource managers by creating a Business Intelligence platform based on several interoperable geospatial web services. The use of open software and data will facilitate its adoption and help keep costs limited. The project, which started a few months ago, is presented and discussed here.

How to cite: Strigaro, D., Cannata, M., Lepori, F., Capelli, C., Rogora, M., and Brovelli, M.: SIMILE: An integrated monitoring system to understand, protect and manage sub-alpine lakes and their ecosystem, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-19393, https://doi.org/10.5194/egusphere-egu2020-19393, 2020.

D878 |

EGU2020-19453

Accessing environmental time series data in R from Sensor Observation Services with ease

Daniel Nüst, Eike H. Jürrens, Benedikt Gräler, and Simon Jirka

Time series data of in-situ measurements is the key to many environmental studies. The first challenge in any analysis typically arises when the data needs to be imported into the analysis framework. Standardisation is one way to lower this burden. Unfortunately, relevant interoperability standards might be challenging for non-IT experts as long as they are not dealt with behind the scenes of a client application. One standard to provide access to environmental time series data is the Sensor Observation Service (SOS) specification published by the Open Geospatial Consortium (OGC). SOS instances are currently used in a broad range of applications such as hydrology, air quality monitoring, and ocean sciences. Data sets provided via an SOS interface can be found around the globe from Europe to New Zealand.

The R package sos4R (Nüst et al., 2011) is an extension package for the R environment for statistical computing and visualization, which has been demonstrated to be a powerful tool for conducting and communicating geospatial research (cf. Pebesma et al., 2012). sos4R comprises a client that can connect to an SOS server. The user can use it to query data from SOS instances using simple R function calls. It provides a convenience layer for R users to integrate observation data from data access servers compliant with the SOS standard without any knowledge about the underlying technical standards. To further improve the usability for non-SOS experts, a recent update to sos4R includes a set of wrapper functions, which remove complexity and technical language specific to OGC specifications. This update also features specific consideration of the OGC SOS 2.0 Hydrology Profile and thereby opens up a new scientific domain.

In our presentation we illustrate use cases and examples building upon sos4R that ease access to time series data in an R and Shiny context. We demonstrate how the abstraction provided in the client library makes sensor observation data more accessible, and further show how sos4R allows the seamless integration of distributed observation data, i.e., across organisational boundaries, into transparent and reproducible data analysis workflows.

References

Nüst D., Stasch C., Pebesma E. (2011) Connecting R to the Sensor Web. In: Geertman S., Reinhardt W., Toppen F. (eds) Advancing Geoinformation Science for a Changing World. Lecture Notes in Geoinformation and Cartography, Springer.

Pebesma, E., Nüst, D., & Bivand, R. (2012). The R software environment in reproducible geoscientific research. Eos, Transactions American Geophysical Union, 93(16), 163–163.

How to cite: Nüst, D., Jürrens, E. H., Gräler, B., and Jirka, S.: Accessing environmental time series data in R from Sensor Observation Services with ease, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-19453, https://doi.org/10.5194/egusphere-egu2020-19453, 2020.

D879 |

EGU2020-21575

Flood Monitoring using ACube - An Austrian Data Cube Solution

Claudio Navacchi, Bernhard Bauer-Marschallinger, and Wolfgang Wagner

Geospatial data come in various formats and originate from different sensors and data providers. This poses a challenge to users who want to combine them or access them simultaneously. To overcome these obstacles, an easy-to-use data cube solution was designed for the Austrian user community; it gathers various relevant and near real-time datasets. Here we show how such a system can be used for flood monitoring.

In 2018, a joint project between the Earth Observation Data Centre for Water Resource Monitoring (EODC), TU Wien and BOKU led to the emergence of the Austrian Data Cube (ACube). ACube implements the generic Python software from Open Data Cube, but further tailors it to the national needs of Austrian ministries, universities and smaller companies. With user-driven input coming from all these partners, datasets and metadata attributes have been defined to facilitate query operations and data analysis. A focus was put on high-resolution remote sensing data from the Copernicus programme. This includes C-band radar backscatter, various optical bands, Surface Soil Moisture (SSM), Normalized Difference Vegetation Index (NDVI), Leaf Area Index (LAI), Fraction of Absorbed Photosynthetically Active Radiation (fAPAR), and monthly composites, with pixel spacings varying between 10 and 500 m. Static data such as a digital elevation model (DEM), namely the EU-DEM, reside alongside the aforementioned dynamic datasets. Moreover, ACube offers different possibilities for data visualisation through QGIS or JupyterHub and, most importantly, enables access to a High Performance Computing (HPC) environment connected to petabyte-scale storage.

The ACube, as a centralised platform and interface to high-resolution datasets, prepares the ground for many applications, e.g., land cover classification, snow melt monitoring, grassland yield estimation, and landslide and flood detection. With a focus on the latter use case, first analyses based on Sentinel-1 radar backscatter data have already shown promising results. A near real-time fusion of radar, optical and ancillary data (DEM, land cover, etc.) through machine learning techniques could further improve the indication of flood events. Building a dedicated web service is foreseen as an upcoming action, relying on the latest data and the HPC environment in the background. Such an emergency service would offer considerable potential for authorities and users to assess damage and to determine vulnerability to progressing flooding.
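As a rough illustration of how radar backscatter can indicate flooding, the sketch below flags pixels whose backscatter has dropped sharply relative to a reference scene and is now very low, since open water typically appears dark in radar imagery. This is a generic change-detection sketch and not the method used in the study; the function name, both thresholds and the sample values are assumptions chosen for illustration.

```python
def flag_possible_flood(reference_db, current_db, drop_db=3.0, water_db=-18.0):
    """Flag pixels whose backscatter dropped strongly and is now very low.

    Both thresholds are illustrative assumptions, not calibrated values.
    """
    flags = []
    for ref_row, cur_row in zip(reference_db, current_db):
        flags.append([
            (ref - cur) >= drop_db and cur <= water_db
            for ref, cur in zip(ref_row, cur_row)
        ])
    return flags


# Tiny made-up 2x3 grids of backscatter in dB (not real Sentinel-1 data).
reference = [[-10.0, -11.0, -9.5], [-12.0, -10.5, -19.0]]
current = [[-10.5, -20.0, -9.0], [-21.0, -11.0, -19.5]]

mask = flag_possible_flood(reference, current)
print(mask)  # [[False, True, False], [True, False, False]]
```

The last pixel is already dark in the reference scene, so it is treated as permanent water rather than as a new flood. Ancillary data such as a DEM or land cover would be needed in practice to suppress false alarms.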

The study received funding from the cooperative R&D FFG ASAP 14 project 865999 "Austrian Data Cube".

How to cite: Navacchi, C., Bauer-Marschallinger, B., and Wagner, W.: Flood Monitoring using ACube - An Austrian Data Cube Solution, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-21575, https://doi.org/10.5194/egusphere-egu2020-21575, 2020.

D880 |

EGU2020-21714

Evolution of data infrastructure for effective integration and management of environmental and ecosystem data

Siddeswara Guru, Gerhard Weis, Wilma Karsdorp, Andrew Cleland, Jenny Mahuika, Edmond Chuc, Javier Sanchez Gonzalez, and Mosheh Eliyahu

The Terrestrial Ecosystem Research Network (TERN) is Australia's national research infrastructure to observe, monitor and support the study and forecasting of continental-scale ecological changes. TERN data are classified under two themes: Ecology and Biogeophysical.

The Ecology theme relates predominantly to plot-based ecological observations conducted as one-off or repeated surveys, and to sensor-based measurements. The Biogeophysical theme includes point-based time series of eddy-covariance micrometeorological measurements from flux towers, as well as continental- and regional-scale gridded data products related to remote sensing, soil and landscape ecology.

Integrating and querying data from different data sources is complicated. Furthermore, the advancement of technology has transformed the mode of data collection. For instance, mobile sensors (drones) of various sizes are increasingly used to sample the environment. The user-centric data handling mechanisms for different types of datasets are dissimilar, requiring heterogeneous data management practices alongside easy access to data for users, bundled with tools and platforms to interrogate, access and analyse data and to share analysis pipelines.

TERN is developing a data e-infrastructure to support holistic capabilities that not only store, curate and distribute data, but also enable processing based on user needs, linking of consistent data to various analysis tools and pipelines, and the acquisition of data skills. The infrastructure will allow collaboration with other national and international data infrastructures and will ingest data from partners, including state and federal government institutes, by adopting domain standards for metadata, data management and publication.

For effective data management of plot-based ecology data, we have developed an ontology based on O&M and the Semantic Sensor Network Ontology, with an extension to support basic concepts of ecological sites and sampling. In addition, controlled vocabularies for observed properties and observation procedures, and standard lists for taxa, geology, soils, etc., will supplement the ontology.
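The idea of pairing an observation model with controlled vocabularies can be sketched as follows. This is a minimal illustration and not TERN's ontology: the field names loosely follow SOSA/SSN terms (feature of interest, observed property, used procedure, result, result time), while the vocabulary entries, identifiers and values are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary of observed properties.
OBSERVED_PROPERTIES = {"canopy_cover", "soil_ph", "tree_height"}


@dataclass(frozen=True)
class Observation:
    feature_of_interest: str  # e.g. a site or plot identifier
    observed_property: str    # must come from the controlled vocabulary
    used_procedure: str       # how the value was obtained
    result: float
    unit: str
    result_time: str          # ISO 8601 timestamp

    def __post_init__(self):
        # Reject terms that are not part of the controlled vocabulary.
        if self.observed_property not in OBSERVED_PROPERTIES:
            raise ValueError(f"unknown observed property: {self.observed_property}")


# Made-up example record for illustration.
obs = Observation(
    feature_of_interest="plot-001",
    observed_property="soil_ph",
    used_procedure="hypothetical_field_ph_meter_protocol",
    result=6.4,
    unit="pH",
    result_time="2020-02-15T09:30:00Z",
)
print(obs.observed_property, obs.result)
```

Validating terms at the point of ingestion is one way to keep observations from different surveys comparable, because every record then refers to the same definitions of properties and procedures.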

The biogeophysical data are managed using domain standards for data and metadata management. Each data product is represented in a standard file format and hosted in OGC standard web services. All datasets are described and catalogued using ISO standards. An overarching discovery portal allows users to search, access and interact with data collections. Users can interact with data at the collection level, on a spatial map, and via web services and an Application Programming Interface (API).

TERN has also developed a cloud-based virtual desktop environment, CoESRA, accessible from a web browser to enable easy access to the computing platform with tools for the ecosystem science community. The advantage is that it allows access to all TERN data in a compute environment for performing analysis and synthesis activities from a single managed platform.

How to cite: Guru, S., Weis, G., Karsdorp, W., Cleland, A., Mahuika, J., Chuc, E., Sanchez Gonzalez, J., and Eliyahu, M.: Evolution of data infrastructure for effective integration and management of environmental and ecosystem data, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-21714, https://doi.org/10.5194/egusphere-egu2020-21714, 2020.