ESSI2.6

SMART Monitoring and Integrated Data Exploration of the Earth System

Earth Sciences depend on detailed multi-variate measurements and investigations to understand the physical, geological, chemical, biogeochemical and biological processes of the Earth. Making accurate prognoses and providing solutions for current questions related to climate change, water, energy and food security are important demands placed on the Earth Science community worldwide. In addition to these society-driven questions, Earth Sciences are still strongly driven by the eagerness of individuals to understand processes, interrelations and tele-connections within and between small sub-systems and the Earth System as a whole. Understanding and predicting temporal and spatial changes across the above-mentioned micro- to Earth-spanning scales is the key to understanding Earth ecosystems; to do so, we need to utilize high-resolution data across all scales in an integrative, holistic approach. Using Big Data, which are often distributed and highly inhomogeneous, has become standard practice in Earth Sciences, and digitalization in conjunction with Data Science promises new discoveries.
The understanding of the Earth System as a whole and its sub-systems depends on our ability to integrate data from different disciplines, between earth compartments, and across interfaces. The need to advance Data Science capabilities and to enable earth scientists to follow the best possible workflows, apply methods, and use computerized tools properly and in an accessible way has been identified worldwide as an important next step for advancing scientific understanding. This is particularly necessary to access knowledge that is contained in already acquired data but currently remains invisible because of limited possibilities for data integration and joint exploration. This session aims to bring together researchers from Data and Earth Sciences working on, but not limited to,
• SMART monitoring designs that advance monitoring strategies, e.g. to detect observational gaps and refine sensor layouts, allowing better and statistically robust extrapolation
• Data management and stewardship solutions compliant with FAIR principles, including the development and application of real-time capable data management and processing chains
• Data exploration frameworks providing qualified data from different sources and tailoring available computational and visual methods to explore and analyse multi-parameter data generated through monitoring efforts or model simulations

Co-organized by GI2
Convener: Jens Greinert | Co-conveners: Everardo González Ávalos (ECS), Daniela Henkel, Patrick Michaelis (ECS)
vPICO presentations | Tue, 27 Apr, 09:00–10:30 (CEST)

vPICO presentations: Tue, 27 Apr

Chairpersons: Everardo González Ávalos, Patrick Michaelis, Daniela Henkel
09:00–09:05
09:05–09:07 | EGU21-8361
Uta Koedel, Peter Dietrich, Philipp Fischer, and Claudia Schuetze

The term SMART Monitoring was also defined by the project Digital Earth (DE), a central joint project of eight Helmholtz centers in Earth and Environment. SMART Monitoring in the sense of DE means that measured environmental parameters and values need to be specific/scalable, measurable/modular, accepted/adaptive, relevant/robust, and trackable/transferable (SMART) for sustainable use as data and improved real data acquisition. SMART Monitoring can be defined as a reliable monitoring approach with procedures supported by machine learning and artificial intelligence (AI) for an “as automated as possible” data flow from individual sensors to databases. SMART Monitoring tools must include various standardized data flows within the entire data lifecycle, e.g., specific sensor solutions, novel approaches for sampling designs, and defined standardized metadata descriptions. One essential component of SMART Monitoring workflows is enhancing metadata with comprehensive information on data quality. At the same time, SMART Monitoring must be highly modular and adaptive so that it can be applied to different monitoring approaches and scientific disciplines.

In SMART monitoring, data quality is crucial, not only with respect to data FAIRness. It is essential to ensure data reliability and representativeness. Hence, comprehensively documented data quality is required to enable meaningful data selection for specific data blending, integration, and joint interpretation. Data integration from different sources is a prerequisite for the parameterization and validation of predictive tools or models. This data integration demonstrates the importance of implementing the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for sustainable data management (Wilkinson et al. 2016). So far, the FAIR data principles do not include a detailed description of data quality and do not cover content-related quality aspects. Even though data may be FAIR in terms of availability, they are not necessarily “good” in terms of accuracy and precision. Unfortunately, there is still considerable confusion in science about the definition of good or trustworthy data.

An assessment of data quality and data origin is essential to preclude inaccurate, incomplete, or otherwise unsatisfactory data analysis, e.g., when applying machine learning methods, and to avoid poorly derived, misleading, or incorrect conclusions. The terms trustworthiness and representativeness summarise all aspects related to these issues. The central pillars of trustworthiness/representativeness are validity, provenience/provenance, and reliability, which are fundamental features in assessing any data collection or processing step for transparent research. For all kinds of secondary data usage and analysis, a detailed description and assessment of reliability and validity involves an appraisal of the applied data collection methods.

The presentation will give examples that show the importance of evaluating and describing data trustworthiness and representativeness, allowing scientists to find appropriate tools and methods for FAIR data handling and more accurate data interpretation.

How to cite: Koedel, U., Dietrich, P., Fischer, P., and Schuetze, C.: Ensuring data trustworthiness within SMART Monitoring of environmental processes, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-8361, https://doi.org/10.5194/egusphere-egu21-8361, 2021.

09:07–09:09 | EGU21-7747 | ECS
James Thornton, Elisa Palazzi, Nicholas Pepin, Paolo Cristofanelli, Richard Essery, Sven Kotlarski, Gregory Giuliani, Yaniss Guigoz, Aino Kulonen, Xiaofeng Li, David Pritchard, Hayley Fowler, Christophe Randin, Maria Shahgedanova, Martin Steinbacher, Marc Zebisch, and Carolina Adler

Numerous applications, including generating future predictions via numerical modelling, establishing appropriate policy instruments, and effectively tracking progress against them, require the multitude of complex processes and interactions operating in rapidly changing mountainous environmental systems to be well monitored and understood. At present, however, not only are available environmental data pertaining to mountains often severely limited, but interdisciplinary consensus regarding which variables should be considered absolute observation priorities remains lacking. In this context, the concept of so-called Essential Mountain Climate Variables (EMCVs) is introduced as a potential means to identify critical observation priorities and thereby ameliorate the situation. Following a brief overview of the most critical aspects of ongoing and expected future climate-driven change in various key mountain system components (i.e. the atmosphere, cryosphere, biosphere and hydrosphere), a preliminary list of corresponding potential EMCVs – ranked according to perceived importance – is proposed. Interestingly, several of these variables do not currently feature amongst the globally relevant Essential Climate Variables (ECVs) curated by GCOS, suggesting this mountain-specific approach is indeed well justified. Thereafter, both established and emerging possibilities to measure, generate, and apply EMCVs are summarised. Finally, future activities that must be undertaken if the concept is eventually to be formalized and widely applied are recommended.

How to cite: Thornton, J., Palazzi, E., Pepin, N., Cristofanelli, P., Essery, R., Kotlarski, S., Giuliani, G., Guigoz, Y., Kulonen, A., Li, X., Pritchard, D., Fowler, H., Randin, C., Shahgedanova, M., Steinbacher, M., Zebisch, M., and Adler, C.: Towards a definition of Essential Mountain Climate Variables, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-7747, https://doi.org/10.5194/egusphere-egu21-7747, 2021.

09:09–09:11 | EGU21-3951 | ECS
Erik Nixdorf, Daniel Eggert, Peter Morstein, Thomas Kalbacher, and Doris Dransch

A deeper understanding of the Earth system as a whole and its interacting sub-systems depends, perhaps more than ever, not only on accurate mathematical approximations of the physical processes but also on the availability of environmental data across temporal and spatial scales. Even though advanced numerical simulations and satellite-based remote sensing in conjunction with sophisticated algorithms such as machine learning tools can provide 4D environmental datasets, local and mesoscale measurements continue to be the backbone in many disciplines such as hydrology. Considering the limitations of human and technical resources, monitoring strategies for these types of measurements should be well designed to increase the information gain provided. One helpful set of tools to address these tasks is visual-analytical data exploration frameworks, which integrate qualified multi-parameter data from different sources and tailor well-established computational and visual methods to explore and analyze them. In this context, we developed a smart monitoring workflow to determine the most suitable time and location for event-driven, ad-hoc monitoring in hydrology using soil moisture measurements as our target variable.

The Smart Monitoring workflow consists of three main steps. The first is the identification of the region of interest, either via user selection or via a recommendation based on spatial environmental parameters provided by the user. Statistical filters and different color schemes can be applied to highlight potentially relevant regions. During the second step, time-dependent environmental parameters (e.g., rainfall and soil moisture estimates of the recent past, weather predictions from numerical weather models and swath forecasts from Earth observation satellites) for those relevant regions can be evaluated to identify suitable time frames for the planned monitoring campaign. Lastly, a detailed assessment of the region of interest is conducted by applying filter and weight functions in combination with multiple linear regressions on selected input parameters. Depending on the measurement objective (e.g., highest/lowest values, highest/lowest change), the most suitable areas for monitoring are subsequently highlighted visually. Based on the common road network, an efficient route for a corresponding monitoring campaign can be derived for the identified regions of interest and directly visualized in the visual-analytical environment.
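
The filter-and-weight idea in the third step can be illustrated with a minimal sketch. It is not the authors' implementation: the multiple linear regression is left out and replaced by fixed, hand-chosen weights, and every name here (`Cell`, `rank_cells`, the field names, the sample values) is a hypothetical chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One candidate grid cell; all field names are hypothetical."""
    cell_id: str
    soil_moisture: float      # estimated volumetric soil moisture (fraction)
    rainfall_mm: float        # recent rainfall in millimetres
    road_distance_km: float   # distance to the nearest road


def min_max(values):
    """Scale values linearly to [0, 1]; constant input maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_cells(cells, max_road_distance_km, weights):
    """Filter unreachable cells, then rank the rest by a weighted score."""
    # Filter step: drop cells that are too far from the road network.
    reachable = [c for c in cells if c.road_distance_km <= max_road_distance_km]
    if not reachable:
        return []
    # Weight step: combine the normalised parameters into a single score.
    moisture = min_max([c.soil_moisture for c in reachable])
    rain = min_max([c.rainfall_mm for c in reachable])
    scores = [
        weights["soil_moisture"] * m + weights["rainfall_mm"] * r
        for m, r in zip(moisture, rain)
    ]
    ranked = sorted(zip(reachable, scores), key=lambda pair: pair[1], reverse=True)
    return [(cell.cell_id, round(score, 3)) for cell, score in ranked]


# Made-up sample cells; objective here is "highest values".
cells = [
    Cell("A", 0.35, 12.0, 1.0),
    Cell("B", 0.20, 30.0, 2.5),
    Cell("C", 0.40, 5.0, 9.0),
    Cell("D", 0.10, 0.0, 0.5),
]
ranking = rank_cells(
    cells,
    max_road_distance_km=5.0,
    weights={"soil_moisture": 0.6, "rainfall_mm": 0.4},
)
print(ranking)
```

With these sample values, cell C is removed by the distance filter and the remaining cells are ordered A, B, D. A different measurement objective (for example lowest values) would only require reversing the sort order or changing the weights.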

How to cite: Nixdorf, E., Eggert, D., Morstein, P., Kalbacher, T., and Dransch, D.: A web-based visual-analytics tool for ad-hoc campaign planning in terrestrial hydrology, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-3951, https://doi.org/10.5194/egusphere-egu21-3951, 2021.

09:11–09:13 | EGU21-15623 | ECS | Highlight
Valentin Buck, Flemming Stäbler, Everardo Gonzalez, and Jens Greinert

The study of the earth’s systems depends on a large number of observations from heterogeneous sources, which are usually scattered across time and space and are tightly intercorrelated. The understanding of said systems depends on the ability to access diverse data types and contextualize them in a global setting suitable for their exploration. While the collection of environmental data has seen an enormous increase over the last couple of decades, the development of software solutions necessary to integrate observations across disciplines seems to be lagging behind. To deal with this issue, we developed the Digital Earth Viewer: a new program to access, combine, and display geospatial data from multiple sources over time.

Choosing a new approach, the software displays space in true 3D and treats time and time ranges as true dimensions. This allows users to navigate observations across spatio-temporal scales and combine data sources with each other as well as with meta-properties such as quality flags. In this way, the Digital Earth Viewer supports the generation of insight from data and the identification of observational gaps across compartments.

Developed as a hybrid application, it may be used both in-situ as a local installation to explore and contextualize new data and in a hosted context to present curated data to a wider audience.

In this work, we present this software to the community, show its strengths and weaknesses, give insight into the development process and talk about extending and adapting the software to custom use cases.

How to cite: Buck, V., Stäbler, F., Gonzalez, E., and Greinert, J.: The Digital Earth Viewer: A new visualization approach for geospatial time series data, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-15623, https://doi.org/10.5194/egusphere-egu21-15623, 2021.

09:13–09:15 | EGU21-14621
Louise Darroch, Thomas Gardner, Margaret Yelland, Christopher Cardwell, Emma Slater, Elizabeth Bradshaw, Justin Buck, Robert Jennings, Andrew Hale, and Jennifer Brown

In the UK, £150bn of assets and 4 million people are at risk from coastal flooding. With reductions in public funding, rising sea levels and changing storm conditions, cost-effective and accurate early warning flood forecasting systems are required. However, numerical tools currently used to estimate wave overtopping are based on tank experiments and very limited previous field measurements of total overtopping volumes only. Furthermore, the setting of tolerable hazard thresholds in flood forecasting models requires site-specific information of wave overtopping during storms of varying severity. 

The National Oceanography Centre (NOC) are currently developing a new nowcast wave overtopping alert system that can be deployed in site-specific coastal settings to detect potentially dangerous flood conditions in near real-time (NRT) while validating operational forecasting services. At its core, it utilises a prototype overtopping sensor and an instance of the National Oceanic and Atmospheric Administration’s ERDDAP data server in a self-monitoring and alerting control system. In-situ detection will be performed by WireWall, a novel capacitance wire sensor that measures at the high frequencies (400 Hz) required to obtain the distribution of overtopping volume and horizontal velocity on a wave-by-wave basis. The sensor includes on-board data processing and 2-way telemetry to enable automation and control. The telemetry posts regular health summaries and high-resolution (1 sec) hazard data (produced by the on-board processing) over the standard internet protocol (HTTPS) to an open ERDDAP server, so that data are freely available via an application programming interface (API) alongside other NRT and delayed-mode global coastal ocean and weather information for further data exploration. ERDDAP allows NRT hazard data to be accessed by statistical algorithms and visual applications, and allows alerts to be received that are also fed to message queue points (RabbitMQ) that can be monitored by external systems. Combined, this will enable automated health monitoring and sensor operation as well as offer the potential for downstream hazard management tools (such as navigation systems and transport management systems) to ingest the nowcast wave overtopping hazard data. To integrate data with wider systems and different disciplines, ERDDAP data sets will be enriched with common and well-structured metadata. Data provenance, controlled vocabularies, Quality Control and attribution information embedded in the data workflow are fundamental to ensuring user trust in the data and any products generated, while supporting FAIR data principles.
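
As an illustration of the API access described above, the following minimal sketch builds a request URL in the general form used by ERDDAP's tabledap service (`/tabledap/<datasetID>.<fileType>?<variables>&<constraints>`). The server address, dataset ID and variable names are hypothetical placeholders and do not refer to the actual WireWall deployment; no request is sent.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, variables, constraints, file_type="csv"):
    """Build an ERDDAP tabledap request URL from its parts.

    All arguments are supplied by the caller; the values used in the
    example call are placeholders, not a real dataset.
    """
    # Requested variables are joined with commas.
    query = ",".join(variables)
    # Each constraint is appended with '&'; characters such as '>' and ':'
    # are percent-encoded, while '=' is left as is.
    for constraint in constraints:
        query += "&" + quote(constraint, safe="=")
    return f"{server.rstrip('/')}/tabledap/{dataset_id}.{file_type}?{query}"


url = tabledap_url(
    "https://example.org/erddap",       # hypothetical server
    "overtopping_hazard",               # hypothetical dataset ID
    ["time", "overtopping_volume"],     # hypothetical variable names
    ["time>=2021-01-01T00:00:00Z"],     # only data from 2021 onwards
)
print(url)
```

Because the whole request is encoded in one URL, the same call can be issued by a script, a dashboard or a downstream hazard management tool, which is what makes a single access point of this kind convenient for machine access.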

The new nowcast wave overtopping alert system will be tested in 2021 during field deployments of multiple WireWall systems at two high energy coastal sites in the UK. Such data are crucial for validating operational flood forecast services as well as protecting local communities and minimising transport service disruptions. The addition of SMART monitoring optimises sensor maintenance and operation, reducing the costs associated with teams travelling to the site. Using ERDDAP embedded with well-structured metadata enables machines to access multiple flood parameters through a single point that abstracts users from the complexities associated with the source data, offering the potential for further data exploration through modelling or techniques such as machine learning. 

How to cite: Darroch, L., Thomas, G., Margaret, Y., Christopher, C., Emma, S., Elizabeth, B., Justin, B., Robert, J., Andrew, H., and Jennifer, B.: The use of ERDDAP in a self-monitoring and nowcast hazard alerting coastal flood system, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-14621, https://doi.org/10.5194/egusphere-egu21-14621, 2021.

09:15–09:17 | EGU21-16308
Johannes Boog and Thomas Kalbacher

Soil moisture is a crucial variable in the Earth's critical zone. It depends on multiple factors such as climate, topographic conditions and soil characteristics, and it affects energy and water fluxes across the land-atmosphere interface. It is therefore highly important for terrestrial ecosystems, ecosystem management and agriculture. The accurate mapping of soil moisture across time and space is challenging but highly desirable.

One option is to deploy ground-based moisture sensors at the point scale and to interpolate and/or map the measurements into space. We have developed a data-driven approach to map the soil moisture in a reference area from point measurements at specific time points and the covariates location, topographic conditions and soil characteristics. We tested the mapping capacity of two different machine-learning algorithms (Random Forest and Neural Networks) and compared them with Ordinary Kriging as the standard method. Our questions were: 1) How accurate are the machine-learning methods for soil moisture mapping? 2) Which covariates are most important? 3) How does mapping accuracy vary with data density and temporal resolution?

We used soil moisture data from the TERENO experimental sites Wüstebach and Rollesbroich located in western Germany. These small catchments are equipped with a dense network of soil moisture sensors using time domain reflectometry (TDR) that has been operated since 2010 (Bogena et al., 2010; Zacharias et al., 2011). From this, we created 2700 point-based soil moisture data sets at specific time points, specific depths and for various numbers of sensor locations. We then merged these data sets with sampled data on soil texture and chemical composition (Qu et al., 2016; Gottselig et al., 2017) as well as remotely sensed terrain data. These time-stamp-specific point-based soil moisture measurements were mapped using Ordinary Kriging (OK), Random Forest (RF) and Neural Networks (ANN), with combinations of the soil and terrain attributes as well as geometric distances between sensor locations as covariates. Each model was trained (80% subset) and tested (20% subset) on the point-based data sets.
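
The data preparation described above can be sketched in a few lines, assuming Euclidean distances to the sensor locations as the only covariates and a simple random 80/20 split. The coordinates and moisture values are made up, the helper names are hypothetical, and the actual study may have split or weighted the data differently.

```python
import math
import random


def distance_covariates(point, sensor_locations):
    """Euclidean distances from one point to every sensor location."""
    return [math.dist(point, sensor) for sensor in sensor_locations]


def train_test_split(records, train_fraction=0.8, seed=0):
    """Shuffle the records reproducibly and cut them into two subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sensor coordinates (metres) and soil moisture readings.
sensors = [(0.0, 0.0), (30.0, 40.0), (60.0, 0.0), (0.0, 80.0), (90.0, 120.0)]
moisture = [0.31, 0.28, 0.35, 0.22, 0.27]

# One record per sensor: distance-based features plus the measured target.
records = [
    {"features": distance_covariates(xy, sensors), "target": value}
    for xy, value in zip(sensors, moisture)
]

train, test = train_test_split(records)
print(len(train), len(test))
```

Each record then carries one feature per sensor location, so a regression model trained on the 80% subset can be evaluated on the held-out 20%.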

In general, average model accuracy across the methods and individual data set types (depth, number of sensor locations, temporal averaging) was relatively low, with R² values of approximately 0.2-0.5. This originated in the high variability of soil moisture. Surprisingly, models using the spatial structure of the domain (using distances between sensors as covariates) already yielded an R² of approximately 0.45. Adding further covariates such as soil and terrain attributes did not substantially improve the accuracy of these models. In comparison, using only terrain attributes as covariates for RF and ANN yielded an R² of 0.25-0.27. The trained models were then used to map soil moisture onto the entire study area. This resulted in maps with interesting patterns that differed between the individual methods, even when using the same covariate types.
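
For reference, the accuracy measure reported above can be computed as follows. This is a minimal sketch of the usual coefficient of determination, R² = 1 - SS_res / SS_tot; whether the study used exactly this variant is an assumption, and the sample values are invented.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviations from the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


observed = [0.20, 0.30, 0.40, 0.50]             # invented test-set values
perfect = r_squared(observed, observed)          # predictions match exactly
mean_only = r_squared(observed, [0.35] * 4)      # always predict the mean
print(perfect, mean_only)
```

A model that always predicts the observed mean scores about 0 and a perfect model scores 1, so values of 0.2-0.5 indicate that the models explain only a modest share of the soil moisture variance.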

Finally, it can be concluded that for spatial interpolation of soil moisture the Random Forest algorithm using distance between sensor locations as covariates is a promising alternative to Ordinary Kriging from the point of accuracy and simplicity.

How to cite: Boog, J. and Kalbacher, T.: Point to Space: Data-driven Soil Moisture Spatial Mapping using Machine-Learning for Small Catchments in Western Germany, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-16308, https://doi.org/10.5194/egusphere-egu21-16308, 2021.

09:17–09:19 | EGU21-9290 | ECS | Highlight
Christian Scharun, Roland Ruhnke, Michael Weimer, and Peter Braesicke

Methane (CH4) is the second most important greenhouse gas after CO2 affecting global warming. Various sources (e.g. fossil fuel production, agriculture and waste, biomass burning and natural wetlands) and sinks (the main sink, the reaction with the OH radical, contributes to tropospheric ozone production) determine the methane budget. Due to its long lifetime in the atmosphere, methane can be transported over long distances.

Disused and active offshore platforms can emit methane in amounts that are difficult to quantify. In addition, explorations of the sea floor in the North Sea showed a release of methane near the boreholes of both oil- and gas-producing platforms. The basis of this study is the established emission database EDGAR (Emission Database for Global Atmospheric Research), an inventory that includes methane emission fluxes in the North Sea region. While the methane emission fluxes in the EDGAR inventory match the platform locations for most of the oil platforms, almost all of the gas platform sources are missing from the database. We develop a method for estimating the missing emission sources based on the EDGAR inventory and the known locations of gas platforms, which will be inserted into the model as additional point sources.

In this study, the global model ICON-ART (ICOsahedral Nonhydrostatic model - Aerosols and Reactive Trace gases) is used. ART is an online-coupled model extension for ICON that includes chemical gases and aerosols. One aim of the model is the simulation of interactions between the trace substances and the state of the atmosphere by coupling the spatiotemporal evolution of tracers with atmospheric processes. ICON-ART sensitivity simulations are performed with inserted and adjusted sources to assess their influence on the methane and OH-radical distribution on regional (North Sea) and global scales.

How to cite: Scharun, C., Ruhnke, R., Weimer, M., and Braesicke, P.: Modeling methane from the North Sea region with ICON-ART, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9290, https://doi.org/10.5194/egusphere-egu21-9290, 2021.

09:19–10:30