ESSI2 – Infrastructures across the Earth and Space Sciences
Metadata, Data Models, Semantics, and Collaboration
Earth systems science is fundamentally cross-disciplinary, and increasingly this requires sharing and exchange of geoscientific information across discipline boundaries. This information can be both rich and complex, and content is not always readily interpretable by either humans or machines. Difficulties arise through differing exchange formats, lack of common semantics, divergent access mechanisms, etc.
Recent developments in distributed, service-oriented information systems using web-based (W3C, ISO, OGC) standards are leading to advances in data interoperability. At the same time, work is underway to understand how meaning may be represented using ontologies and other semantic mechanisms, and how this can be shared with other scientists.
This session aims to explore developments in interoperable data sharing, and the representation of semantic meaning to enable interpretation of geoscientific information. Topics may include, but are not limited to:
- standards-based information modelling
- interoperable data sharing
- use of metadata
- knowledge representation
- use of semantics in an interoperability context
- application of semantics to discovery and analysis
- metadata and collaboration
Please note: abstracts chosen for presentation during the ESSI 2.1 session will be considered for publication in a Special Issue of the International Journal of Geo-Information (IJGI), https://www.mdpi.com/journal/ijgi, titled "On Denotation and Connotation in Web Semantics, Collaboration and Metadata". More information is available at this link: https://www.mdpi.com/journal/ijgi/special_issues/denotation_connotation
There are many ways in which machine learning promises to provide insight into the Earth System, and this area of research is developing at a breathtaking pace. Whether unsupervised, supervised, and reinforcement learning can fulfil this promise remains an open question, particularly for predictions. Machine learning could help extract information from numerous Earth System data, such as satellite observations, as well as improve model fidelity through novel parameterisations or speed-ups. This session invites submissions spanning modelling and observational approaches towards providing an overview of the state of the art in the application of these novel methods.
This session aims to bring together researchers working with big data sets generated from monitoring networks, extensive observational campaigns and detailed modeling efforts across various fields of geosciences. Topics of this session will include the identification and handling of specific problems arising from the need to analyze such large-scale data sets, together with methodological approaches towards semi- or fully automated inference of relevant patterns in time and space aided by computer science-inspired techniques. Among others, this session shall address approaches from the following fields:
• Dimensionality and complexity of big data sets
• Data mining in Earth sciences
• Machine learning, deep learning and Artificial Intelligence applications in geosciences
• Visualization and visual analytics of big and high-dimensional data
• Informatics and data science
• Emerging big data paradigms, such as datacubes
Attendance: Thu, 07 May, 08:30–12:30 (CEST)
Attendance: Thu, 07 May, 14:00–15:45 (CEST)
Data Science and Machine Learning for Natural Hazards and Seismology
Smart monitoring and observation systems for natural hazards, including satellites, seismometers, global networks, unmanned vehicles (e.g., UAV), and other linked devices, have become increasingly abundant. With these data, we observe the restless nature of our Earth and work towards improving our understanding of natural hazard processes such as landslides, debris flows, earthquakes, floods, storms, and tsunamis. The abundance of diverse measurements that we have now accumulated presents an opportunity for earth scientists to employ statistically driven approaches that speed up data processing, improve model forecasts, and give insights into the underlying physical processes. Such big-data approaches are supported by the wider scientific, computational, and statistical research communities who are constantly developing data science and machine learning techniques and software. Hence, data science and machine learning methods are rapidly impacting the fields of natural hazards and seismology. In this session, we will see research from natural hazards and seismology for processes over a broad range of time and spatial scales.
Dr. Pui Anantrasirichai of the University of Bristol, UK will give the invited presentation:
Application of Deep Learning to Detect Ground Deformation in InSAR Data
Machine learning (ML) is now widely used across the Earth Sciences, and its subfield deep learning (DL) in particular has recently enjoyed increased attention in the context of Hydrology. The goal of this session is to highlight the continued integration of ML, and DL in particular, into traditional and emerging Hydrology-related workflows. Abstracts are solicited related to novel theory development, novel methodology, or practical applications of ML and DL in Hydrology. This might include, but is not limited to, the following:
(1) Identifying novel ways for DL in hydrological modelling.
(2) Testing and examining the usability of DL based approaches in hydrology.
(3) Improving understanding of the (internal) states/representations of DL models.
(4) Integrating DL with traditional hydrological models.
(5) Creating an improved understanding of the conditions for which DL provides reliable simulations, including quantifying uncertainty in DL models.
(6) Clustering and/or classifying hydrologic systems, events and regimes.
(7) Using DL for detecting, quantifying or coping with nonstationarity in hydrological systems and modeling.
(8) Deriving scaling relationships or process-related insights directly from DL.
(9) Using DL to model or anticipate human behavior or human impacts on hydrological systems.
(10) DL based hazard analysis, detection/mitigation, event detection, etc.
(11) Natural Language Processing to analyze, interpret, or condense hydrologically-relevant peer-reviewed literature or social media data or to assess trends within the discipline.
Data Integration: Enabling the Acceleration of Science Through Connectivity, Collaboration, and Convergent Science
Earth, space, and environmental scientists are pushing the boundaries of human understanding of our complex world. They seek and use larger and more varied data in their research with a growing need for data integration and synthesis across and among scientific domains. Tools, services, and data skills are critical resources to the research ecosystem in order to enable the harmonization and integration of data with different temporal and spatial ranges. This session explores the challenges, successes, and best practices the data community has for using data from multiple sources and scientific domains with unfamiliar formats, vocabularies, quality, and uncertainty, or in providing support and services for accessing these data. We seek submissions from the community of data producers, enablers, researchers, and users on methods for identifying and communicating best practices, challenges in this diverse data environment, and for building critical data skills related to data integration and data management.
Earth/Environmental Science Applications on HPC and Cloud Infrastructures
This session aims to highlight Earth Science research concerned with state-of-the-art computational and data infrastructures such as HPC (supercomputers, clusters), Clouds and accelerator-based systems (GPGPU, FPGA).
We will focus on data-intensive workflows (scientific workflows), ranging from workflows between infrastructures (e.g. European data and compute infrastructures) down to complex analysis workflows on an HPC system (e.g. in situ coupling frameworks).
The session presents an opportunity for everyone to present and learn from results achieved, success stories and experience gathered during the process of study, adaptation and exploitation of these systems.
Further contributions are welcome that showcase middleware and tools developed to support Earth Science applications on HPC systems and Cloud infrastructures, e.g. to increase effectivity, robustness or ease of use.
Topics of interest include:
- Data intensive Earth Science applications and how they have been adapted to different HPC infrastructures
- Data mining software stacks in use for large environmental data
- HPC simulation and High Performance Data Analytics e.g. code coupling, in-situ workflows
- Experience with Earth Science applications in Cloud environments e.g. solutions on Amazon EC2, Microsoft Azure, and Earth Science simulation codes in private and European Cloud infrastructures (Open Science Cloud)
- Tools and services for Earth Science data management, workflow execution, web services and portals to ease access to compute resources.
- Tools and middleware for Earth Science applications on Grid, Cloud and on High Performance Computing infrastructures.
MATLAB-based programs, applications and technical resources for Geoscience Research
This session provides a multi-disciplinary overview of Geoscience research and applied case studies involving MATLAB, and it further discusses technical resources and new capabilities available to researchers and educators. MATLAB is a multi-paradigm numerical computing environment and programming language developed by MathWorks, which is supported by a large community of skilled toolbox developers and active users. It allows matrix manipulations, data plotting, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other programming languages. These characteristics of MATLAB functions and tools have attracted various projects in geoscientific fields of academia and industry, particularly in data analysis, 2D/3D visualization and program development. Many scientific articles, including MATLAB-based applications, have been published in international journals. This session encourages studies introducing/applying MATLAB-based programs and applications. Contributions from all related fields of Earth Science are welcome.
In this session, we would like to provide an overview of the MATLAB ecosystem for geoscientists and engineers and to discuss recent technological developments. Useful techniques will be introduced to manage large distributed files and leverage cluster solutions for geoscientific computations. We will focus on the visualization of results for scientific publication and present state-of-the-art capabilities to visualize geo-referenced data.
Applications of data, methods and models in geosciences
The aim of this session is to present the latest research and case studies related to various data analysis and improvement methods and modeling techniques, and to demonstrate their applications in various fields of the earth sciences, from hydrology, geology and paleogeomorphology to geophysics, seismology, and environmental and climate change.
Complex geoscientific time series: linear, nonlinear, and computer science perspectives
This interdisciplinary session welcomes contributions on novel conceptual approaches and methods for the analysis of observational as well as model time series from all geoscientific disciplines.
Methods to be discussed include, but are not limited to:
- linear and nonlinear methods of time series analysis
- time-frequency methods
- predictive approaches
- statistical inference for nonlinear time series
- nonlinear statistical decomposition and related techniques for multivariate and spatio-temporal data
- nonlinear correlation analysis and synchronisation
- surrogate data techniques
- filtering approaches and nonlinear methods of noise reduction
- artificial intelligence and machine learning based analysis and prediction for univariate and multivariate time series
Contributions on methodological developments and applications to problems across all geoscientific disciplines are equally encouraged.
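As an illustration of one of the listed methods, surrogate data techniques, the following is a minimal sketch of a phase-randomised (Fourier) surrogate in Python with NumPy. It is not taken from any contribution to this session; the function name `phase_randomized_surrogate` and the synthetic test series are assumptions made for the example. The surrogate keeps the amplitude spectrum (and therefore the linear autocorrelation) of the input series while scrambling its phases, which is the usual basis for testing a null hypothesis of a linear stochastic process.

```python
import numpy as np


def phase_randomized_surrogate(x, rng):
    """Return a surrogate series with the same amplitude spectrum as x.

    Hypothetical helper for illustration: each Fourier coefficient is
    rotated by a random phase, so amplitudes are preserved exactly.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0  # leave the zero-frequency term (the mean) untouched
    if n % 2 == 0:
        phases[-1] = 0.0  # the Nyquist term must stay real for even n
    rotated = spectrum * np.exp(1j * phases)
    return np.fft.irfft(rotated, n=n)


# Synthetic example series (an assumption, not real observations).
rng = np.random.default_rng(0)
t = np.arange(256)
series = np.sin(2.0 * np.pi * t / 32.0) + 0.5 * rng.standard_normal(t.size)
surrogate = phase_randomized_surrogate(series, rng)
```

In practice, a statistic of interest is computed on the original series and on an ensemble of such surrogates, and the original value is compared with the surrogate distribution.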
Spatio-temporal data science: theoretical advances and applications in computational geosciences
Most of the processes studied by geoscientists are characterized by variations in both space and time. These spatio-temporal phenomena have been traditionally investigated using linear statistical approaches, as in the case of physically-based models and geostatistical models. Additionally, the rising attention toward machine learning, as well as the rapid growth of computational resources, opens new horizons in understanding, modelling and forecasting complex spatio-temporal systems through the use of stochastic non-linear models.
This session aims at exploring the new challenges and opportunities opened by the spread of data-driven statistical learning approaches in Earth and Soil Sciences. We invite cutting-edge contributions related to methods of spatio-temporal geostatistics or data mining on topics that include, but are not limited to:
- advances in spatio-temporal modeling using geostatistics and machine learning;
- uncertainty quantification and representation;
- innovative techniques of knowledge extraction based on clustering, pattern recognition and, more generally, data mining.
The main applications will be closely related to the research in environmental sciences and quantitative geography. A non-exhaustive list of possible applications includes:
- natural and anthropogenic hazards (e.g. floods; landslides; earthquakes; wildfires; soil, water, and air pollution);
- interaction between geosphere and anthroposphere (e.g. land degradation; urban sprawl);
- socio-economic sciences, characterized by the spatial and temporal dimension of the data (e.g. census data; transport; commuter traffic).
Learning from spatial data: unveiling the geo-environment through quantitative approaches
The interactions between geo-environmental and anthropic processes are increasing due to the ever-growing population and its related side effects (e.g., urban sprawl, land degradation, natural resource and energy consumption, etc.). Natural hazards, land degradation and environmental pollution are three of the possible “interactions” between geosphere and anthroposphere. In this context, spatial and spatiotemporal data are of crucial importance for the identification, analysis and modelling of the processes of interest in Earth and Soil Sciences. The information content of such geo-environmental data requires advanced mathematical, statistical and geomorphometric methodologies in order to be fully exploited.
The session aims to explore the challenges and potentialities of quantitative spatial data analysis and modelling in the context of Earth and Soil Sciences, with a special focus on geo-environmental challenges. Studies implementing intuitive and applied mathematical/numerical approaches and highlighting their key potentialities and limitations are particularly sought after. Special attention is paid to spatial uncertainty evaluation and its possible reduction, and to alternative techniques of representation of spatial data (e.g., visualization, sonification, haptic devices, etc.).
In the session, two main topics will be covered (although the session is not limited to them!):
1) Analysis of sparse (fragmentary) spatial data for mapping purposes with evaluation of spatial uncertainty: geostatistics, machine learning, statistical learning, etc.
2) Analysis and representation of exhaustive spatial data at different scales and resolutions: geomorphometry, image analysis, machine learning, pattern recognition, etc.
Management and integration of environmental observation data
Together with the rapid development of sensor technologies and the implementation of environmental observation networks (e.g. MOSES, TERENO, Digital Earth, eLTER, CUAHSI, ICOS, ENOHA,…), a large number of data infrastructures are being created to manage and provide access to observation data. However, significant advances in earth system understanding can only be achieved through better and easier integration of data from distributed infrastructures. In particular, the development of methods for the automatic real-time processing and integration of observation data in models is required in many applications. The automatic, meaningful integration of these data sets is often hindered by semantic and structural differences between data and by poor metadata quality. Improvement in this field strongly depends on the capability to deal with fast-growing multi-parameter data and on efforts to employ data science methods, adapt new algorithms and develop digital workflows tailored to specific scientific needs. Automated quality assessment/control algorithms, data discovery and exploration tools, standardized interfaces and vocabularies, as well as data and processing exchange strategies and security concepts, are required to interconnect distributed data infrastructures. Besides the technical integration, the meaningful integration of data with different spatial and temporal support or measurement scales is also an important aspect. This session focuses on the specific requirements, techniques and solutions to process, provide and couple observation data from (distributed) infrastructures and to make observation data available for modelling and other scientific needs.
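To make the idea of an automated quality control step concrete, the following is a minimal Python sketch that flags suspect sensor readings with a robust modified z-score based on the median absolute deviation (MAD). It does not describe any of the infrastructures named above; the function name `flag_outliers_mad`, the threshold of 3.5, the scaling constant 0.6745 (a commonly used rule of thumb) and the sample readings are all assumptions chosen for illustration.

```python
import numpy as np


def flag_outliers_mad(values, threshold=3.5):
    """Return a boolean mask marking readings that look like outliers.

    Hypothetical helper for illustration: uses the modified z-score
    0.6745 * (x - median) / MAD, ignoring missing values (NaN).
    """
    values = np.asarray(values, dtype=float)
    median = np.nanmedian(values)
    mad = np.nanmedian(np.abs(values - median))
    if mad == 0:
        # No spread in the data: nothing can be flagged by this rule.
        return np.zeros(values.shape, dtype=bool)
    modified_z = 0.6745 * (values - median) / mad
    # Comparisons with NaN evaluate to False, so gaps are never flagged.
    return np.abs(modified_z) > threshold


# Made-up temperature readings with one spike and one gap (assumption).
readings = [12.1, 12.3, 12.2, 45.0, 12.4, 12.0, np.nan]
flags = flag_outliers_mad(readings)
```

A median-based rule is used here because a single large spike would inflate a mean and standard deviation and could mask itself; real observation networks typically combine several such checks with sensor-specific range and consistency tests.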
16:25–16:29: MOSAiC goes O2A - Arctic Expedition Data Flow from Observations to Archives
16:29–16:33: Implementing a new data acquisition system for the advanced integrated atmospheric observation system KITcube
16:33–16:37: Implementing FAIR principles for dissemination of data from the French OZCAR Critical Observatory network: the Theia/OZCAR information system
16:47–16:51: Solutions for providing web-accessible, semi-standardised ecosystem research site information
16:51–16:55: Put your models in the web - less painful
16:55–16:59: Improving future optical Earth Observation products using transfer learning
16:59–17:03: Design and Development of Interoperable Cloud Sensor Services to Support Citizen Science Projects
17:13–17:17: Providing a user-friendly outlier analysis service implemented as open REST API
17:17–17:21: Graph-based river network analysis for rapid discovery and analysis of linked hydrological data
17:21–17:25: SIMILE: An integrated monitoring system to understand, protect and manage sub-alpine lakes and their ecosystem
Advances in geomorphometry and landform mapping: possibilities, challenges and perspectives
Geomorphometry and geomorphological mapping are important tools used for understanding landscape processes and dynamics on Earth and other planetary bodies. Recent rapid growth of technology and advances in data collection methods have made available vast quantities of geospatial data for such morphometric analysis and mapping, with the geospatial data offering unprecedented spatio-temporal range, density, and resolution. This explosion in the availability of geospatial data opens up considerable possibilities for morphometric analysis and mapping (e.g. for recognising new landforms and processes), but it also presents new challenges in terms of data processing and analysis.
This inter-disciplinary session on geomorphometry and landform mapping aims to bridge the gap between process-focused research fields and the technical domain where geospatial products and analytical methods are developed. The increasing availability of a wide range of geospatial datasets requires the continued development of new tools and analytical approaches as well as landform/landscape classifications. However, a potential lack of communication across disciplines results in efforts being mainly focused on problems within individual fields. We aim to foster collaboration and the sharing of ideas across subject boundaries, between technique developers and users, enabling us as a community to fully exploit the wealth of geospatial data that is now available.
We welcome perspectives on geomorphometry and landform mapping from ANY discipline (e.g. geomorphology, planetary science, natural hazard assessment, computer science, remote sensing). This session aims to showcase both technical and applied studies, and we welcome contributions that present (a) new techniques for collecting or deriving geospatial data products, (b) novel tools for analysing geospatial data and extracting innovative geomorphometric variables, (c) mapping and/or morphometric analysis of specific landforms as well as whole landscapes, and (d) mapping and/or morphometric analysis of newly available geospatial datasets. Contributions that demonstrate multi-method or inter-disciplinary approaches are particularly encouraged. We also actively encourage contributors to present tools/methods that are “in development”.
Deep Learning for Geosciences with MATLAB made easy
This short course will focus on modern, data driven analytical methods in the field of Deep Learning with MATLAB. Deep Learning represents powerful artificial intelligence tools used to solve complex modeling problems in earth and ocean sciences, planetary and atmospheric sciences, and related math and geoscience fields. The MATLAB based Deep Learning platform provides algorithms and tools for creating and training deep neural networks. These networks are used to simulate processes of past, present and future environmental events in this wide range of disciplines.
Participants will be able to adopt concepts of Deep Learning for their areas of research such as dynamics, preconditions, and trends related to the surface, subsurface and the atmosphere of the planets. The content level will be 80% beginner, 10% intermediate, and 10% advanced. Scientists from all disciplines are invited to participate in this course. Any previous experience with Deep Learning and distributed computing will be beneficial but not necessary for participation.
The maximum number of participants is 65, in order to guarantee direct supervision for the hands-on part of the session.
The seminar will take place on Wed, 13 May, 10:30-12:00 CEST. Register at: