Our global societies are facing many complex and interlinked challenges such as climate change, sea-level rise, water and food security, the uncontrolled spread of infectious diseases, or finding tools for the sustainable development of our dwindling mineral and petroleum resources. Environmental and Earth system sciences have a significant role to play in these challenges, but unlocking their potential to contribute will require the integration of scientific data, software and tools from multiple, globally distributed resources. The preconditions for interdisciplinary research are set by existing national- and continental-scale research infrastructures and e-infrastructures (e.g., EOSC, ENVRI, EPOS, EarthCube, IRIS, UNAVCO, AuScope). We now need to foster their convergence and develop innovative and FAIR data and software, as well as integrated services, to enhance the efficiency and productivity of researchers as we scale up to the more complex challenges ahead. Some of these problems will require new solutions such as next-generation computing at exascale.
This session solicits papers from contributors across the Environmental and Earth system domain (research and e-infrastructures, repositories and data hubs, interdisciplinary data users, global initiatives, etc.) who are working to tackle the existing and upcoming challenges. We also invite papers from those working towards the next generation of infrastructures who can point out the practical challenges, perspectives, and potential solutions related to creating an open and collaborative ecosystem of research and e-infrastructures that will support the next phase of Environmental and Earth system science research at exascale.

Public information:
(solicited presenter: Alice-Agnes Gabriel, gabriel@geophysik.uni-muenchen.de)

Co-sponsored by AGU
Convener: Daniela Franz (ECS) | Co-conveners: Ari Asmi, Helen Glaves, Lesley Wyborn
Attendance: Tue, 05 May, 10:45–12:30 (CEST)

Chat time: Tuesday, 5 May 2020, 10:45–12:30

Chairperson: Lesley Wyborn, Ari Asmi
D901 | solicited | Highlight
Arnau Folch, Josep de la Puente, Laura Sandri, Benedikt Halldórsson, Andreas Fichtner, Jose Gracia, Piero Lanucara, Michael Bader, Alice-Agnes Gabriel, Jorge Macías, Finn Lovholt, Alexandre Fournier, Vadim Monteiller, and Soline Laforet

The Center of Excellence for Exascale in Solid Earth (ChEESE; https://cheese-coe.eu) is promoting the use of upcoming Exascale and extreme performance computing capabilities in the area of Solid Earth by bringing together institutions in charge of operational monitoring networks, tier-0 supercomputing centers, academia, hardware developers and third parties from SMEs, industry and public governance. The scientifically challenging ambition is to prepare 10 European open-source flagship codes to solve Exascale problems in computational seismology, magnetohydrodynamics, physical volcanology, tsunamis, and data analysis. Preparation for Exascale also considers inter-kernel aspects of simulation workflows, such as data management and sharing following the FAIR principles, I/O, post-processing and visualization. The project is articulated around 12 Pilot Demonstrators (PDs) in which flagship codes are used for near real-time seismic simulations and full-wave inversion, ensemble-based volcanic ash dispersal forecasts, faster-than-real-time tsunami simulations and physics-based hazard assessments for earthquakes, volcanoes and tsunamis. This is a first step towards enabling operational e-services that require extreme HPC for urgent computing, early warning forecasts of geohazards, hazard assessment and data analytics. Additionally, in collaboration with the European Plate Observing System (EPOS), ChEESE will promote and facilitate the integration of HPC services to widen access to codes and foster the transfer of know-how to Solid Earth user communities. In this regard, the project aims to act as a hub that fosters HPC across the Solid Earth community and related stakeholders, and to provide specialized training on services and capacity-building measures.

How to cite: Folch, A., de la Puente, J., Sandri, L., Halldórsson, B., Fichtner, A., Gracia, J., Lanucara, P., Bader, M., Gabriel, A.-A., Macías, J., Lovholt, F., Fournier, A., Monteiller, V., and Laforet, S.: e-infrastructures and natural hazards. The Center of Excellence for Exascale in Solid Earth (ChEESE), EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-13497, https://doi.org/10.5194/egusphere-egu2020-13497, 2020.

D902 |
Dietmar Backes, Norman Teferle, and Guy Schumann

In remote sensing, benchmark and CalVal datasets are routinely provided by learned societies and professional organisations such as the Committee on Earth Observation Satellites (CEOS), European Spatial Data Research (EuroSDR) and the International Society for Photogrammetry and Remote Sensing (ISPRS). Initiatives are often created to serve specific research needs. Many valuable datasets disappear after the conclusion of such projects, even though the original data or the results of these investigations might have significant value to other scientific communities that might not have been aware of the projects. Initiatives such as FAIR data (Findable, Accessible, Interoperable, Re-usable) or the European Open Science Cloud (EOSC) aim to overcome this situation and preserve scientific data sets for wider scientific communities.

Motivated by increased public interest following the emerging effects of climate change on local weather and rainfall patterns, the field of urban flood hazard modelling has developed rapidly in recent years. New sensors and platforms are able to provide high-resolution topographic data, from highly agile Earth Observation (EO) satellites to small low-altitude drones or terrestrial mobile mapping systems. The question arises as to which type of topographic information is most suitable for realistic and accurate urban flood modelling, and whether current methodologies are able to exploit the increased level of detail contained in such data.

In the presented project, we aim to assemble a topographic research data repository to provide multimodal 3D datasets to optimise and benchmark urban flood modelling. The test site chosen is located in the south of Luxembourg in the municipality of Dudelange, which provides a typical European landscape with rolling hills and urban, agricultural and renaturalised areas over a local stream catchment. The region has been affected by flash flooding following heavy rainfall events in the past.

The assembled datasets were derived from LiDAR and photogrammetric methodologies and consist of topographic surface representations ranging from medium-resolution DEMs with 10 m GSD to highly dense point clouds derived from drone photogrammetry. The data were collected from spaceborne, traditional airborne, low-altitude drone as well as terrestrial platforms. The datasets are well documented with adequate meta information to describe their origin, currency, quality and accuracy. Raw data is provided where intellectual property rights permit the dissemination. Terrain models and point clouds are generally cleaned for blunders using standard methods and manual inspection. However, elaborate cleaning and filtering should be done by the investigators to allow the optimisation towards the requirements of their methodologies. Additional value-added terrain representations, e.g. generated through data fusion approaches, are also provided.

It is the intention of the project team to create a ‘living data set’ following the FAIR data principles. The expensive and comprehensive data set collected for flood hazard mapping could also be valuable to other scientific communities. Results of ongoing work should be integrated, and newly collected data layers will keep the research repository relevant and up to date. Sharing this well-maintained dataset amongst any interested research community will maximize its value.

How to cite: Backes, D., Teferle, N., and Schumann, G.: Building a Multimodal topographic dataset for flood hazard modelling and other geoscience applications, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18279, https://doi.org/10.5194/egusphere-egu2020-18279, 2020.

D903 |
James Riley, Charles Meertens, David Mencin, Kathleen Hodgkinson, Douglas Ertz, David Maggert, Dan Reiner, Christopher Crosby, and Scott Baker

The U.S. National Science Foundation’s Geodesy Advancing Geosciences (GAGE) Facility, operated by UNAVCO, is building systems and adopting practices in support of more comprehensive data discovery, search and access capabilities across our various geodetic data holdings. As a World Data Center, the GAGE Facility recognizes the need for interoperability of its Earth data holdings in its archives, as represented by the FAIR Data Principles. To this end, web services, both as back-end and front-end resources, are being developed to provide new and enhanced capabilities. 

UNAVCO is exploring international standards that have been in development for decades, such as the ISO Geographic information Metadata standards and the Open Geospatial Consortium’s (OGC) web services, to help facilitate interoperability. Through various collaborations, UNAVCO seeks to develop and promote infrastructure, metadata, and interoperability standards for the community. We are participating in the development of the next version of GeodesyML, led by Geoscience Australia, which will leverage standards and help codify metadata practices for the geodetic community. New web technologies such as Linked Data are arising to augment these standards and provide greater connectivity and interoperability of structured data; UNAVCO has implemented Schema.org metadata for its datasets and partnered with EarthCube’s Project 418/419 and Google Dataset Search. Persistent identifiers are being adopted, with DOIs for datasets, while RORs for organizational affiliation and ORCID iDs for identity and access management and usage metrics are being explored. These technologies and practices remain in various states of acceptance and implementation as UNAVCO investigates them, and we share our experiences to date.
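As an illustration of how Schema.org dataset metadata and persistent identifiers can fit together, the following minimal sketch builds a JSON-LD description of a dataset in Python. It is not UNAVCO's actual metadata profile: the function name `dataset_jsonld`, the choice of properties, and every value (DOI, ORCID iD, ROR ID, names) are hypothetical placeholders chosen for illustration only.

```python
import json


def dataset_jsonld(name, description, doi, creator_name, orcid, publisher_name, ror):
    """Build a Schema.org Dataset description as a JSON-LD dictionary.

    All identifiers are turned into resolvable URLs. The selection of
    properties is an assumption, not a published metadata profile.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # DOI identifies the dataset itself
        "identifier": f"https://doi.org/{doi}",
        "creator": {
            "@type": "Person",
            "name": creator_name,
            # ORCID iD identifies the person
            "identifier": f"https://orcid.org/{orcid}",
        },
        "publisher": {
            "@type": "Organization",
            "name": publisher_name,
            # ROR ID identifies the organization
            "identifier": f"https://ror.org/{ror}",
        },
    }


# Hypothetical placeholder values, not real identifiers.
record = dataset_jsonld(
    name="Example GNSS station time series",
    description="Hypothetical daily position solutions for one station.",
    doi="10.0000/example-doi",
    creator_name="Jane Example",
    orcid="0000-0000-0000-0000",
    publisher_name="Example Geodetic Archive",
    ror="00example0",
)
print(json.dumps(record, indent=2))
```

Embedding such a block in a dataset landing page is one common way for harvesters and dataset search engines to pick up the metadata.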

How to cite: Riley, J., Meertens, C., Mencin, D., Hodgkinson, K., Ertz, D., Maggert, D., Reiner, D., Crosby, C., and Baker, S.: GAGE Facility Geodetic Data Archive: Discoverability, Accessibility, Interoperability & Attribution, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-12417, https://doi.org/10.5194/egusphere-egu2020-12417, 2020.

D904 |
Putting Data to Work: ESIP and EarthCube working together to transform geoscience
Erin Robinson
D905 |
Jerry A Carter, Charles Meertens, Chad Trabant, and James Riley

One of the fundamental tenets of the Incorporated Research Institutions for Seismology’s (IRIS’s) mission is to “Promote exchange of seismic and other geophysical data … through pursuing policies of free and unrestricted data access.”  UNAVCO also adheres to a data policy that promotes free and unrestricted use of data.  A major outcome of these policies has been to reduce the time that researchers spend finding, obtaining, and reformatting data.  While rapid, easy access to large archives of data has been successfully achieved in seismology, geodesy and many other distinct disciplines, integrating different data types in a converged data center that promotes interdisciplinary research remains a challenge.  This challenge will be addressed in an integrated seismological and geodetic data services facility that is being mandated by the National Science Foundation (NSF).  NSF’s Seismological Facility for the Advancement of Geoscience (SAGE), which is managed by IRIS, will be integrated with NSF’s Geodetic Facility for the Advancement of Geoscience (GAGE), which is managed by UNAVCO.  The combined data services portion of the facility, for which a prototype will be developed over the next two to three years, will host a number of different data types including seismic, GNSS, magnetotelluric, SAR, infrasonic, hydroacoustic, and many others.  Although IRIS and UNAVCO have worked closely for many years on mutually beneficial projects and have shared their experience with each other, combining the seismic and geodetic data services presents challenges to the well-functioning SAGE and GAGE data facilities that have served their respective scientific communities for more than 30 years. This presentation describes some preliminary thoughts and guiding principles to ensure that we build upon the demonstrated success of both facilities and how an integrated GAGE and SAGE data services facility might address the challenges of fostering interdisciplinary research. 

How to cite: Carter, J. A., Meertens, C., Trabant, C., and Riley, J.: Converging Seismic and Geodetic Data Services, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-12718, https://doi.org/10.5194/egusphere-egu2020-12718, 2020.

D906 |
Tim Rawling

AuScope is the national provider of research infrastructure to the earth and geospatial sciences communities in Australia. Funded through the NCRIS scheme since 2006, we have invested heavily in a diverse suite of infrastructures in that time, from VLBI telescopes to geochronology laboratories, and from national geophysical data acquisitions to the development of numerical simulation and inversion codes.

Each of these programs, and the communities they support, has different requirements relating to data structures, data storage, compute and access. As a result, there has been a tendency in the past to build bespoke, discipline-specific data systems.  This approach limits the opportunities for cross-domain research activity and investigation.

AuScope recently released our plans to build an Australian Downward Looking Telescope (or DLT).  This will be a distributed observational, characterisation and computational infrastructure providing the capability for Australian geoscientists to image and understand the composition of the Australian Plate with unprecedented fidelity.

The recent development of an investment plan for the construction of this National Research Infrastructure has allowed our community to reassess existing data delivery strategies and architectures to bring them in line with current international best practice.

Here we present the proposed e-infrastructure that will underpin the DLT.  This FAIR data platform will facilitate open and convergent research across the geosciences and will underpin efforts currently underway to connect international research infrastructures, including EPOS, AuScope, IRIS and others, to create a global research infrastructure network for earth science.

How to cite: Rawling, T.: Geoscience data interoperability through a new lens: how designing a telescope that looks down changed our view of data., EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-12673, https://doi.org/10.5194/egusphere-egu2020-12673, 2020.

D907 |
Philip Kershaw, Ghaleb Abdulla, Sasha Ames, Ben Evans, Tom Landry, Michael Lautenschlager, Venkatramani Balaji, and Guillaume Levavasseur

The Earth System Grid Federation (ESGF) is a globally distributed e-infrastructure for the hosting and dissemination of climate-related data.  ESGF was originally developed to support the community in the analysis of CMIP5 (5th Coupled Model Intercomparison Project) data in support of the 5th Assessment report made by the IPCC (Intergovernmental Panel on Climate Change).  Recognising the challenge of the large volumes of data concerned and the international nature of the work, a federated system was developed linking together a network of collaborating data providers around the world. This enables users to discover, download and access data through a single unified system such that they can seamlessly pull data from these multiple hosting centres via a common set of APIs.  ESGF has grown to support over 16000 registered users and besides the CMIPs, supports a range of other projects such as the Energy Exascale Earth System Model, Obs4MIPS, CORDEX and the European Space Agency’s Climate Change Initiative Open Data Portal.
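The idea of a federated search across collaborating data nodes can be sketched in a few lines of Python. This is a simplified, in-memory illustration and not the ESGF search service or its API: the classes `Record` and `Node`, the function `federated_search`, and the node and dataset names are all hypothetical. It only shows the principle that one query fans out to every node and that replicas of the same dataset are grouped under a single identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    dataset_id: str  # identifier shared by all replicas of a dataset
    node: str        # data node holding this copy
    project: str     # project the dataset belongs to


class Node:
    """Stand-in for the local search index of one data node (hypothetical)."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def search(self, project):
        """Return the local records that belong to the given project."""
        return [r for r in self._records if r.project == project]


def federated_search(nodes, project):
    """Query every node and group replicas under one dataset identifier."""
    merged = {}
    for node in nodes:
        for rec in node.search(project):
            merged.setdefault(rec.dataset_id, []).append(rec.node)
    return merged


# Two hypothetical nodes; ds-001 is replicated on both.
nodes = [
    Node("node-a", [Record("ds-001", "node-a", "CMIP5"),
                    Record("ds-002", "node-a", "CORDEX")]),
    Node("node-b", [Record("ds-001", "node-b", "CMIP5"),
                    Record("ds-003", "node-b", "CMIP5")]),
]

hits = federated_search(nodes, "CMIP5")
for dataset_id, holders in sorted(hits.items()):
    print(dataset_id, "available from", ", ".join(holders))
```

A production system would additionally need to handle unreachable nodes, paging and access control, which this sketch deliberately leaves out.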

Over the course of its evolution, ESGF has pioneered technologies and operational practice for distributed systems, including solutions for federated search, metadata modelling and capture, identity management and large-scale replication of data.  Now in its tenth year of operation, a major review of the system architecture is underway. For this next-generation system, we will be drawing on our experience and lessons learnt running an operational e-infrastructure, but also considering other similar systems and initiatives.  These include, for example, ESA’s Earth Observation Exploitation Platform Common Architecture, outputs from recent OGC Testbeds and Pangeo (https://pangeo.io/), a community and software stack for the geosciences.   Drawing on our own recent pilot work, we look at the role of cloud computing with its impact on deployment practice and hosting architecture, but also at new paradigms for massively parallel data storage and access, such as object store. The cloud also offers a potential point of entry for scientists without access to large-scale computing, analysis, and network resources.  As trusted international repositories, the major national computing centres that host and replicate large corpora of ESGF data have increasingly been supporting a broader range of domains and communities in the Earth sciences. We explore the critical role of standards for connecting data and the application of FAIR data principles to ensure free and open access and interoperability with other similar systems in the Earth Sciences.

How to cite: Kershaw, P., Abdulla, G., Ames, S., Evans, B., Landry, T., Lautenschlager, M., Balaji, V., and Levavasseur, G.: Evolution and Future Architecture for the Earth System Grid Federation, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18310, https://doi.org/10.5194/egusphere-egu2020-18310, 2020.

D908 |
Mohan Ramamurthy

The geoscience disciplines are either gathering or generating data in ever-increasing volumes. To ensure that the science community and society reap the utmost benefits in research and societal applications from such rich and diverse data resources, there is a growing interest in broad-scale, open data sharing to foster myriad scientific endeavors. However, open access to data is not sufficient; research outputs must be reusable and reproducible to accelerate scientific discovery and catalyze innovation.

As part of its mission, Unidata, a geoscience cyberinfrastructure facility, has been developing and deploying data infrastructure and data-proximate scientific workflows and analysis tools using cloud computing technologies for accessing, analyzing, and visualizing geoscience data.

Specifically, Unidata has developed techniques that combine robust access to well-documented datasets with easy-to-use tools, using workflow technologies. In addition to fostering the adoption of technologies like pre-configured virtual machines through Docker containers and Jupyter notebooks, other computational and analytic methods are enabled via “Software as a Service” and “Data as a Service” techniques with the deployment of the Cloud IDV, AWIPS Servers, and the THREDDS Data Server in the cloud. The collective impact of these services and tools is to enable scientists to use the Unidata Science Gateway capabilities to not only conduct their research but also share and collaborate with other researchers and advance the intertwined goals of Reproducibility of Science and Open Science, and in the process, truly enabling “Science as a Service”.

Unidata has implemented the aforementioned services on the Unidata Science Gateway (http://science-gateway.unidata.ucar.edu), which is hosted on the Jetstream cloud, a cloud-computing facility that is funded by the U.S. National Science Foundation. The aim is to give geoscientists an ecosystem that includes data, tools, models, workflows, and workspaces for collaboration and sharing of resources.

In this presentation, we will discuss our work to date in developing the Unidata Science Gateway and the hosted services therein, as well as our future directions in response to increasing expectations from funders and scientific communities that such services will be Open and FAIR (Findable, Accessible, Interoperable, Reusable). In particular, we will discuss how Unidata is advancing data and software transparency, open science, and reproducible research. We will share our experiences in how the geoscience and information science communities are using the data, tools and services provided through the Unidata Science Gateway to advance research and education in the geosciences.

How to cite: Ramamurthy, M.: A Cloud-based Science Gateway for Enabling Science as a Service to Facilitate Open Science and Reproducible Research, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-10761, https://doi.org/10.5194/egusphere-egu2020-10761, 2020.

D909 |
XiaoFeng Liao, Doron Goldfarb, Barbara Magagna, Markus Stocker, Peter Thijsse, Dick Schaap, and Zhiming Zhao

The Horizon 2020 ENVRI-FAIR project brings together 14 European environmental research infrastructures (ENVRI) to develop solutions to improve the FAIRness of their data and services, and eventually to connect the ENVRI community with the European Open Science Cloud (EOSC). It is thus essential to share the reusable solutions while RIs are tackling common challenges in improving their FAIRness, and to continually assess the FAIRness of ENVRI (meta)data services as they are developed. 
The FAIRness assessment is, however, far from trivial. On the one hand, the task relies on gathering the required information from RIs, e.g. information about the metadata and data repositories operated by RIs, the kind of metadata standards repositories implement, the use of persistent identifier systems. Such information is gathered using questionnaires whose processing can be time-consuming. On the other hand, to enable efficient querying, processing and analysis, the information needs to be machine-actionable and curated in a knowledge base.
Besides acting as a general resource to learn about RIs, the ENVRI knowledge base (KB) supports RI managers in identifying current gaps in their RI’s implementation of the FAIR Data Principles. For instance, an RI manager can interrogate the KB to discover whether a data repository of the RI uses a persistent identifier service or whether the repository is certified according to some scheme. Once a gap has been identified, the KB can support the RI manager in exploring the solutions implemented by other RIs.
By linking questionnaire information to training resources, the KB also supports the discovery of materials that provide hands-on demonstrations of how state-of-the-art technologies can be used and implemented to address FAIR requirements. For instance, if an RI manager discovers that the metadata of one of the RI’s repositories does not include machine-readable provenance, the ENVRI KB can inform the manager about available training material demonstrating how the PROV Ontology can be used to implement machine-readable provenance in systems. Such demonstrators can be highly actionable as they can be implemented in Jupyter and executed with services such as mybinder. Thus, the KB can seamlessly integrate the state of FAIR implementation in RIs with actionable training material and is therefore a resource that is expected to contribute substantially to improving ENVRI FAIRness.
The ENVRI KB is implemented using the W3C Recommendations developed within the Semantic Web Activity, specifically RDF, OWL, and SPARQL. To effectively expose its content to RI communities, ranging from scientists to managers, and other stakeholders, the ENVRI-FAIR KB will need a customisable user interface for context-aware information discovery, visualisation, and content update. The current prototype can be accessed at kb.oil-e.net.
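The kind of gap analysis described above can be sketched with a toy triple store. The example below is a plain-Python stand-in for the RDF/SPARQL stack, written so that it runs without extra libraries. It does not reflect the actual ENVRI KB schema: the prefixes `ex:`, `repo:`, `ri:` and `pid:`, the predicate names, and the data are hypothetical. In SPARQL, the first query would typically be expressed with a `FILTER NOT EXISTS` clause.

```python
# Triples of (subject, predicate, object). All names are hypothetical and
# chosen for illustration; they are not the vocabulary of the ENVRI KB.
TRIPLES = {
    ("repo:A", "rdf:type", "ex:Repository"),
    ("repo:A", "ex:belongsTo", "ri:Alpha"),
    ("repo:A", "ex:usesPidService", "pid:DOI"),
    ("repo:B", "rdf:type", "ex:Repository"),
    ("repo:B", "ex:belongsTo", "ri:Alpha"),
    ("repo:C", "rdf:type", "ex:Repository"),
    ("repo:C", "ex:belongsTo", "ri:Beta"),
    ("repo:C", "ex:usesPidService", "pid:Handle"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def repositories_without_pid(ri):
    """Repositories of one RI that declare no persistent identifier service."""
    repos = {s for s, _, _ in match(p="ex:belongsTo", o=ri)}
    return sorted(r for r in repos if not match(s=r, p="ex:usesPidService"))


def solutions_elsewhere(ri):
    """Persistent identifier services used by repositories of other RIs."""
    others = {s for s, _, o in match(p="ex:belongsTo") if o != ri}
    return sorted(
        {o for r in others for _, _, o in match(s=r, p="ex:usesPidService")}
    )


gaps = repositories_without_pid("ri:Alpha")
options = solutions_elsewhere("ri:Alpha")
print("Repositories lacking a PID service:", gaps)
print("PID services used by other RIs:", options)
```

The two functions mirror the two steps in the text: first identify the gap within one RI, then look up what other RIs have already implemented.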

How to cite: Liao, X., Goldfarb, D., Magagna, B., Stocker, M., Thijsse, P., Schaap, D., and Zhao, Z.: ENVRI knowledge base: A community knowledge base for research, innovation and society, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-20708, https://doi.org/10.5194/egusphere-egu2020-20708, 2020.

D910 |
Carmela Freda, Rossana Paciello, Jan Michalek, Kuvvet Atakan, Daniele Bailo, Keith Jeffery, Matt Harrison, Massimo Cocco, and Epos Team

The European Plate Observing System (EPOS) addresses the problem of homogeneous access to heterogeneous digital assets in geoscience of the European tectonic plate. Such access opens new research opportunities. Previous attempts have been limited in scope and required much human intervention. EPOS adopts an advanced Information and Communication Technologies (ICT) architecture driven by a catalog of rich metadata. The architecture together with challenges and solutions adopted are presented. The EPOS ICS Data Portal is introducing a new way for cross-disciplinary research. The multidisciplinary research is raising new possibilities for both students and teachers. The EPOS portal can be used either to explore the available datasets or to facilitate the research itself. It can be very instructive in teaching as well, for example by demonstrating scientific use cases. 

EPOS is a European project about building a pan-European infrastructure for accessing solid Earth science data. The finished EPOS-IP project includes 47 partners plus 6 associate partners from 25 countries from all over Europe and several international organizations. However, the community contributing to the EPOS integration plan is larger than the official partnership of EPOS-IP project, because more countries are represented by the international organizations and because there are several research institutions involved within each country.

The recently developed EPOS ICS Data Portal provides access to data and data products from ten different geoscientific areas: Seismology, Near Fault Observatories, GNSS Data and Products, Volcano Observations, Satellite Data, Geomagnetic Observations, Anthropogenic Hazards, Geological Information and Modelling, Multi-scale laboratories and Geo-Energy Test Beds for Low Carbon Energy.

The presentation focusses on the EPOS ICS Data Portal, which provides information about the datasets available from the Thematic Core Services (TCS) and access to them. We demonstrate not only the features of the graphical user interface but also the underlying architecture of the whole system.

How to cite: Freda, C., Paciello, R., Michalek, J., Atakan, K., Bailo, D., Jeffery, K., Harrison, M., Cocco, M., and Team, E.: EPOS ICS Data Portal, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-19050, https://doi.org/10.5194/egusphere-egu2020-19050, 2020.

D911 |
Jan Michalek, Kuvvet Atakan, Christian Rønnevik, Tor Langeland, Ove Daae Lampe, Gro Fonnes, Svein Mykkeltveit, Jon Magnus Christensen, Ulf Baadshaug, Halfdan Pascal Kierulf, Bjørn-Ove Grøtan, and Odleiv Olesen

The European Plate Observing System (EPOS) is a European project about building a pan-European infrastructure for accessing solid Earth science data, now governed by EPOS ERIC (European Research Infrastructure Consortium). The EPOS-Norway project (EPOS-N; RCN-Infrastructure Programme - Project no. 245763) is a Norwegian project funded by the National Research Council. The EPOS-N project is divided into four work packages, one of which is about integrating Norwegian geoscientific data into an e-infrastructure. The other three work packages are: management of the project, improving geoscientific monitoring in the Arctic, and establishing a Solid Earth Science Forum to communicate progress within the geoscientific community and to provide feedback to the development group of the e-infrastructure.

Among the six EPOS-N project partners, five institutions are actively participating and providing data in the EPOS-N project: University of Bergen (UIB), University of Oslo (UIO), Norwegian Mapping Authority (NMA), Geological Survey of Norway (NGU) and NORSAR. The data to be integrated are divided into categories according to thematic field: seismology, geodesy, geological maps and geophysical data. Before the data can be integrated into the e-infrastructure, their formats need to follow the international standards already developed by communities of geoscientists around the world. Metadata are stored in the Granularity Database tool and are easily accessible by other tools via a dedicated API. For now, there are 33 Data, Data Products, Software and Services (DDSS) described in the EPOS-N list.

We present the Norwegian approach to integrating geoscientific data into the e-infrastructure, closely following the EPOS ERIC development. The sixth partner in the project, NORCE Norwegian Research Centre AS, specializes in data visualization and is developing the EPOS-N Portal. It is a web-based graphical user interface adopting the Enlighten-web software, which allows users to visualize and analyze cross-disciplinary data. Expert users can launch the visualization software through a web-based programming interface (Jupyter Notebook) for processing of the data. The seismological waveform data (provided by UIB and NORSAR) will be available through an EIDA system; seismological data products (receiver functions, earthquake catalogues and macroseismic observations) as individual datasets or through a web service; GNSS data products (provided by NMA) as standalone files; and geological and geophysical (magnetic, gravity anomaly) maps (provided by NGU) as WMS web services or standalone files. Integration of some specific geophysical data is still under discussion, such as georeferenced cross-sections, which are of particular interest for visualization with other geoscientific data.
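For readers unfamiliar with WMS, the sketch below shows how a client might assemble an OGC WMS 1.3.0 GetMap request URL in Python. It is illustrative only: the endpoint `https://example.org/wms` and the layer name `geology` are hypothetical and do not refer to any EPOS-N or NGU service. Note that in WMS 1.3.0 the bounding box for EPSG:4326 is given in latitude, longitude order.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def wms_getmap_url(endpoint, layer, bbox, width, height,
                   crs="EPSG:4326", fmt="image/png"):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lat, min_lon, max_lat, max_lon) when crs is EPSG:4326,
    because WMS 1.3.0 follows the axis order defined by the CRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer; the bounding box roughly covers Norway.
url = wms_getmap_url("https://example.org/wms", "geology",
                     (57.0, 4.0, 72.0, 32.0), 800, 600)
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(url)
print(query["BBOX"][0])
```

The layer names and supported coordinate reference systems of a real service would normally be read from its GetCapabilities response first.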

Constant user feedback is achieved through dedicated workshops. Various use cases are defined by users and have been tested in these workshops. Collected feedback is being used for further development and improvements of the EPOS-N Portal.

How to cite: Michalek, J., Atakan, K., Rønnevik, C., Langeland, T., Lampe, O. D., Fonnes, G., Mykkeltveit, S., Magnus Christensen, J., Baadshaug, U., Kierulf, H. P., Grøtan, B.-O., and Olesen, O.: EPOS-Norway – Integration of Norwegian geoscientific data into a common e-infrastucture, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18842, https://doi.org/10.5194/egusphere-egu2020-18842, 2020.

D912 |
Nikolay Miloshev, Petya Trifonova, Ivan Georgiev, Tania Marinova, Nikolay Dobrev, Violeta Slabakova, Velichka Milusheva, and Todor Gurov

The National Geo-Information Center (NGIC) is a distributed research infrastructure funded by the National road map for scientific infrastructure (2017-2023) of Bulgaria. It operates in a variety of disciplines such as geophysics, geology, seismology, geodesy, oceanology, climatology, soil science, etc., providing data products and services. Created as a partnership between four institutes working in the field of Earth observation: the National Institute of Geophysics, Geodesy and Geography (NIGGG), the National Institute of Meteorology and Hydrology (NIMH), the Institute of Oceanology (IO), the Geological Institute (GI), and two institutes competent in ICT: the Institute of Mathematics and Informatics (IMI) and the Institute of Information and Communication Technologies (IICT), the NGIC consortium serves as the primary community of data collectors for national geoscience research. Besides the science, NGIC aims to support decision makers during the process of prevention and protection of the population from natural and anthropogenic risks and disasters.

Individual NGIC partners originated independently and differ from one another in management and disciplinary scope. Thus, the conceptual model of the NGIC system architecture is based on a federated model structure in which the partners retain their independence and contribute to the development of the common infrastructure through the data and research they carry out. The basic conceptual model of the architecture uses both service and microservice concepts and may be altered according to the specifics of the organization environment and the development goals of the NGIC information system. It consists of three layers: a “Sources” layer containing the providers of Data, Data products, Services and Software (DDSS); an “Interoperability” layer regulating access, the automation of discovery and selection of DDSS, and data collection from the sources; and an “Integration” layer which produces integrated data products.
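The three-layer structure can be illustrated with a small Python sketch. It is a conceptual toy rather than the NGIC implementation: the class names simply mirror the layer names, while the partner names, the product names, and the use of a mean value as the "integrated product" are hypothetical assumptions made for the example.

```python
from statistics import mean


class Source:
    """'Sources' layer: an independent partner exposing its own DDSS."""

    def __init__(self, partner, catalogue):
        self.partner = partner
        self._catalogue = catalogue  # maps product name -> list of values

    def list_products(self):
        return sorted(self._catalogue)

    def fetch(self, product):
        return self._catalogue[product]


class Interoperability:
    """'Interoperability' layer: discovery, selection and data collection."""

    def __init__(self, sources):
        self._sources = sources

    def discover(self, product):
        """Find the sources that offer the requested product."""
        return [s for s in self._sources if product in s.list_products()]

    def collect(self, product):
        """Collect the product from every source that offers it."""
        return {s.partner: s.fetch(product) for s in self.discover(product)}


class Integration:
    """'Integration' layer: derives an integrated product from collected data."""

    def __init__(self, interoperability):
        self._interop = interoperability

    def integrated_mean(self, product):
        collected = self._interop.collect(product)
        values = [v for series in collected.values() for v in series]
        return {
            "product": product,
            "partners": sorted(collected),
            "mean": mean(values),
        }


# Hypothetical partners and values, for illustration only.
sources = [
    Source("institute-A", {"temperature": [10.0, 12.0]}),
    Source("institute-B", {"temperature": [14.0], "sea_level": [0.2]}),
]
result = Integration(Interoperability(sources)).integrated_mean("temperature")
print(result)
```

Each source keeps control of its own catalogue, which reflects the federated model in which partners retain their independence.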

The diversity of NGIC’s data, data products, and services is a major strength and of high value to its users like governmental institutions and agencies, research organizations and universities, private sector enterprises, media and the public. NGIC will pursue collaboration with initiatives, projects and research infrastructures for Earth observation to enhance access to an integrated global data resource.

How to cite: Miloshev, N., Trifonova, P., Georgiev, I., Marinova, T., Dobrev, N., Slabakova, V., Milusheva, V., and Gurov, T.: NGIC: turning concepts into reality, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-7627, https://doi.org/10.5194/egusphere-egu2020-7627, 2020.

D913 |
Chris Atherton, Peter Löwe, and Torsten Heinen

As a species, we face unprecedented environmental challenges that threaten our existing way of life. We are still learning to understand our planet, although we have a good idea of how it works. Research needs to accelerate so that decision makers have the information they need to respond to our societal challenges. To do this we need to leverage large datasets to speed up research, as proposed by Jim Gray in ‘The Fourth Paradigm’. In the world of research infrastructures we need to give scientists an easy and efficient way to access vast amounts of research data from multiple sources. EOSC is addressing this, but we are only scratching the surface when it comes to unleashing the full potential of the scientific community. Datacubes have recently emerged in the Environmental and Earth system domain as a technology for storing imagery data in a way that makes research easier and quicker. At the data volumes being considered, however, curating, hosting and funding this information in a single centralised centre poses many challenges. Our proposal seeks to leverage the existing National Research and Education Network (NREN) infrastructures to store national repositories of regional Environmental and Earth system data and to share them with scientists in an open, federated but secure way that conforms to the FAIR principles. This would provide redundancy, data sovereignty and scalability for hosting global environmental datasets in an exascale world.

How to cite: Atherton, C., Löwe, P., and Heinen, T.: Federated and intelligent datacubes, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-22592, https://doi.org/10.5194/egusphere-egu2020-22592, 2020.

D914 |
Peter Löwe, Tobias Gebel, and Hans Walter Steinhauer

As a result of the European INSPIRE directive, which establishes an infrastructure for spatial information in Europe, the number of national data sources in Europe that are open to the public, or at least to science, continues to grow.

However, challenges remain in giving society and science easy access to these previously unavailable data silos through standardized web services, as defined by the Open Geospatial Consortium (OGC). This is crucial to ensure sustainable data generation and reuse according to the FAIR principles (Findable, Accessible, Interoperable, Reusable).

We report on an interdisciplinary application, using spatial data to improve longitudinal surveys in the social sciences, involving building plans encoded in CityGML, PostGIS, MapServer and R.

The Socio-economic Panel (SOEP), part of the German Institute for Economic Research (DIW Berlin), provides longitudinal data on persons living in private households across Germany. Recently the SOEP sampled households in certain neighborhoods within cities, areas of the so-called „Soziale Stadt“ (social town). Because these areas are spatially restricted, spatially referenced data were used. Information at the level of census tiles provided by the Federal Statistical Office was used to form regional clusters.

Within these clusters, addresses spatially referenced by the German Federal Agency for Cartography and Geodesy (BKG) were sampled. This ensured that the addresses lie within the neighborhoods to be surveyed. Because this procedure reduced the organizational burden for the survey research institute and for the interviewers while still allowing random household samples to be generated, it is being considered for future use. However, addresses can belong to residential buildings as well as to cinemas or hotels.

To overcome this obstacle we evaluate the use of 3D Building Models provided by the German Federal Agency for Cartography and Geodesy (BKG). These data are distributed as compressed archives for the 16 states of Germany, each containing very large numbers of CityGML files with LoD1 data sets for buildings. The large storage footprint of these data sets makes their reuse by social scientists using standard statistical software (such as R or Stata) on desktop computers difficult at best. This is overcome by providing flexible access to Areas of Interest (AOI) through OGC web services (WMS/WFS) backed by a PostGIS database. The ingestion process is based on the new GMLAS driver of the OGR software project for Complex Features encoded in Geography Markup Language (GML) based on application schemas.
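How a desktop user might request only an Area of Interest from such a WFS can be sketched in Python. This is a minimal illustration under stated assumptions: the endpoint URL, the layer name `buildings_lod1` and the default coordinate reference system are hypothetical and not taken from the abstract; only the key-value parameters (`service`, `version`, `request`, `typeNames`, `bbox`, `count`) follow the OGC WFS 2.0 request encoding.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(endpoint, type_name, bbox,
                       crs="urn:ogc:def:crs:EPSG::25832", count=1000):
    """Build a WFS 2.0 GetFeature URL restricted to a bounding box.

    endpoint, type_name and the default crs are placeholders for illustration.
    bbox is (minx, miny, maxx, maxy) expressed in the given crs.
    """
    minx, miny, maxx, maxy = bbox
    if minx >= maxx or miny >= maxy:
        raise ValueError("bbox must be (minx, miny, maxx, maxy) with min < max")
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        # WFS 2.0 allows the CRS identifier as an optional fifth bbox value.
        "bbox": f"{minx},{miny},{maxx},{maxy},{crs}",
        # Cap the number of returned features to keep responses small.
        "count": str(count),
    }
    return f"{endpoint}?{urlencode(params)}"


# Example with a hypothetical server and layer name.
url = wfs_getfeature_url(
    "https://example.org/wfs",
    "buildings_lod1",
    (389000, 5819000, 390000, 5820000),
)
```

The resulting URL can be handed to any WFS-capable client, so that only the buildings inside the sampled neighborhood are transferred instead of a whole state archive. Axis order and accepted CRS identifiers depend on the server configuration and should be checked against its capabilities document.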

How to cite: Löwe, P., Gebel, T., and Steinhauer, H. W.: From Silos to FAIR Services: Interoperable application of geospatial data for longitudinal surveys in the Social Sciences., EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18956, https://doi.org/10.5194/egusphere-egu2020-18956, 2020.

D915 |
Martin Juckes, Anna Pirani, Charlotte Pascoe, Robin Matthews, Martina Stockhause, Bob Chen, and Xing Xiaoshi

The Assessment Reports of the Intergovernmental Panel on Climate Change (IPCC) have provided the scientific basis underpinning far-reaching policy decisions. The reports also have a huge influence on public debate about climate change. The IPCC is not responsible either for the evaluation of climate data and related emissions and socioeconomic data and scenarios or for the provision of advice on policy (reports must be “neutral, policy-relevant but not policy-prescriptive”). These omissions may appear unreasonable at first sight, but they are part of the well-tested structure which enables the creation of authoritative reports on the complex and sensitive subject of climate change. The responsibility for evaluation of climate data and related emissions and socioeconomic data and scenarios remains with the global scientific community. The IPCC has the task of undertaking an expert, objective assessment of the state of scientific knowledge as expressed in the scientific literature. The exclusion of responsibility for providing policy advice from the IPCC remit allows the IPCC to stay clear of discussions of political priorities.

These distinctions and limitations influence the way in which the findable, accessible, interoperable, and reusable (FAIR) data principles are applied to the work of the IPCC Assessment. There are hundreds of figures in the IPCC Assessment Reports, showing line graphs, global or regional maps, and many other displays of data and information. These figures are put together by the authors using data resources which are described in the scientific literature that is being assessed. The figures are there to illustrate or clarify points raised in the text of the assessment. Increasingly, the figures also provide quantitative information which is of critical importance for many individuals and organisations which are seeking to exploit IPCC knowledge. 

This presentation will discuss the process of implementing the FAIR data principles within the IPCC assessment process. It will also review both the value of the FAIR principles to the IPCC authors and the IPCC process and the value of the FAIR data products that the process is expected to generate.

How to cite: Juckes, M., Pirani, A., Pascoe, C., Matthews, R., Stockhause, M., Chen, B., and Xiaoshi, X.: Implementing FAIR Principles in the IPCC Assessment Process, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-10778, https://doi.org/10.5194/egusphere-egu2020-10778, 2020.

D916 |
Paolo Mazzetti, Stefano Nativi, and Changlin Wang

Last September, about 400 delegates gathered in Florence, Italy from all over the world, to attend the 11th International Symposium on Digital Earth (ISDE11). The Opening Plenary session (held in the historic Salone dei Cinquecento at Palazzo Vecchio) included a celebration ceremony for the 20th anniversary of the International Symposium on Digital Earth, which was initiated in Beijing, China in November 1999 by the Chinese Academy of Sciences (CAS).

In the framework of ISDE11, about 30 sessions illustrated the various challenges and opportunities in building a Digital Earth. They included five Grand Debates and Plenary sessions dealing with issues related to: “Trust and Ethics in Digital Earth”; “Digital Earth for United Nations Sustainable Development Goals (SDGs)”; “ISDE in a Transformed Society”; “Challenges and Opportunities of Digital Transformation”; and “New Knowledge Ecosystems.” Moreover, ISDE11 endorsed and approved a new Declaration by the International Society for the Digital Earth (i.e. the 2019 ISDE Florence Declaration) that, after 10 years, lays the path to a new definition of Digital Earth that will be finalized in the first months of 2020.

This presentation will discuss the main outcomes of ISDE11 as well as the future vision of Digital Earth in a Transformed Society.

How to cite: Mazzetti, P., Nativi, S., and Wang, C.: Digital Earth in a Transformed Society, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5756, https://doi.org/10.5194/egusphere-egu2020-5756, 2020.

D917 |
Stefano Nativi and Max Craglia

The European Commission (EC) has put forward a European approach to artificial intelligence and robotics. It addresses technological, ethical, legal and socio-economic aspects in order to boost the EU's research and industrial capacity and to put AI at the service of European citizens and the economy.

Artificial intelligence (AI) has become an area of strategic importance and a key driver of economic development. It can bring solutions to many societal challenges from treating diseases to minimising the environmental impact of farming. However, socio-economic, legal and ethical impacts have to be carefully addressed.

It is essential to join forces in the EU to stay at the forefront of this technological revolution, to ensure competitiveness and to shape the conditions for its development and use (ensuring respect of European values). In this framework, the EC and the Member States published a “Coordinated Plan on Artificial Intelligence”, COM(2018) 795, on the development of AI in the EU. The Coordinated Plan includes the recognition of common indicators to monitor AI uptake and development in the Union and the success rate of the strategies in place, with the support of the AI Watch instrument developed by the EC. AI Watch is therefore monitoring and assessing European AI landscapes from driving forces to technology developments, from research to market, from data ecosystems to applications.

The presentation will first introduce the main AI Watch methodology and tasks. It will then focus on AI Watch's interest in monitoring and understanding the impact of AI on Geosciences research and innovation (see, for example, climate change studies). Finally, a proposal to connect the EGU community (in particular the ESSI division) and AI Watch will be introduced.

How to cite: Nativi, S. and Craglia, M.: European Commission AI Watch initiative: Artificial Intelligence uptake and the European Geosciences Community, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5691, https://doi.org/10.5194/egusphere-egu2020-5691, 2020.

D918 |
Stefan Versick, Ole Kirner, Jörg Meyer, Holger Obermaier, and Mehmet Soysal

Earth System Models (ESMs) have become much more demanding in recent years. The modelled processes have grown more complex, and more and more processes are included in the models. In addition, model resolutions have increased to improve weather and climate forecasts. This requires faster high-performance computers (HPC) and better I/O performance.

Within our Pilot Lab Exascale Earth System Modelling (PL-EESM) we analyse the performance of the ESM EMAC when writing output to a standard Lustre file system and compare it with the performance when using a parallel ad-hoc overlay file system. We will show the impact for two scenarios: one with today's standard amount of output and one with artificially heavy output simulating future ESMs.

An ad-hoc file system is a private parallel file system which is created on demand for an HPC job using the node-local storage devices, in our case solid-state disks (SSDs). It exists only during the runtime of the job, so output data have to be moved to a permanent file system before the job finishes. Quasi in-situ data analysis and post-processing can improve performance because it may reduce the amount of data that has to be stored, saving both disk space and time during the transfer of data to permanent storage. We will show first tests of quasi in-situ post-processing.
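The stage-out step described above can be sketched in Python. This is a simplified illustration, not the workflow used with EMAC: plain directories stand in for the ad-hoc and permanent file systems, the `.raw` and `.reduced` file suffixes are hypothetical, and the "post-processing" is a stand-in that replaces each line of values by its mean.

```python
import shutil
from pathlib import Path


def reduce_output(raw_file: Path, reduced_file: Path) -> None:
    """Stand-in for quasi in-situ post-processing: keep one mean value per input line."""
    with raw_file.open() as src, reduced_file.open("w") as dst:
        for line in src:
            values = [float(v) for v in line.split()]
            if values:
                dst.write(f"{sum(values) / len(values):.6f}\n")


def stage_out(adhoc_dir: Path, permanent_dir: Path) -> int:
    """Reduce raw output on the ad-hoc file system, then copy it to permanent storage.

    Must run before the job ends, because the ad-hoc file system disappears with the job.
    Returns the number of bytes that did not need to be transferred.
    """
    permanent_dir.mkdir(parents=True, exist_ok=True)
    saved = 0
    for raw in sorted(adhoc_dir.glob("*.raw")):
        reduced = raw.with_suffix(".reduced")
        reduce_output(raw, reduced)  # runs against fast node-local storage
        shutil.copy2(reduced, permanent_dir / reduced.name)
        saved += raw.stat().st_size - reduced.stat().st_size
    return saved
```

Only the reduced files cross over to the permanent file system, which is where the saving in disk space and transfer time would come from when the reduction is substantial.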

How to cite: Versick, S., Kirner, O., Meyer, J., Obermaier, H., and Soysal, M.: Performance gains in an ESM using parallel ad-hoc file systems, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18121, https://doi.org/10.5194/egusphere-egu2020-18121, 2020.

D919 |
Julien Nosavan, Agathe Moreau, and Steven Hosford

The SPOT 1-to-5 satellites collected more than 30 million images all over the world during the 30 years from 1986 to 2015, which represents a remarkable historical dataset. The SPOT World Heritage (SWH) programme is a CNES initiative to preserve, open and generate positive impact from this SPOT 1-to-5 archive by providing new enhanced products to the general public.

Preservation has been supported for years by archiving raw data (GERALD format) in the CNES long-term archive service (STAF), while the commercial market was served by images provided by our commercial partner Airbus. SWH opens a new era by providing and sharing a new SPOT 1-to-5 archive at image level. The chosen image product is the well-known 1A SCENE product (DIMAP format), which has been one of the SPOT references for years. As a reminder, 1A SCENE is a square 60 km x 60 km GEOTIFF image including initial radiometric corrections for instrument distortions. Image resolution ranges from 20 m to 5 m depending on the SPOT satellite and instrument (2.5 m using the SPOT 5 THR on-ground processing mode).

This new SWH-1A archive currently comprises 17 million images, which were first extracted from STAF magnetic tapes over a period of 1 year and then processed to 1A level using the standard processing chain on the CNES High Processing Center (~432 processing cores). In parallel, additional images acquired by partner receiving stations are being retrieved to ensure that the archive is as exhaustive as possible.

The SPOT 1-to-5 1A archive will be accessible through a dedicated CNES SWH Web catalogue based on REGARDS software which is a CNES Open Source generic tool (GPLv3 license) used to manage data preservation and distribution in line with OAIS (Open Archival Information System) and FAIR (Findable, Accessible, Interoperable, Reusable) paradigms.

Once authenticated, and subject to the SWH license of use, users will be able to query the catalogue and download products, either manually or through APIs supporting OpenSearch requests.
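An OpenSearch client typically reads a URL template from the service's description document and fills in its parameters. The Python sketch below shows that substitution step only. The template, host name and parameter choices are hypothetical, since the abstract does not give the SWH catalogue's actual endpoint or search parameters; the `{name}` and `{name?}` placeholder syntax follows the OpenSearch 1.1 URL template convention.

```python
import re
from urllib.parse import quote

# Matches {name} (required) and {name?} (optional), including prefixed names such as geo:box.
_PARAM = re.compile(r"\{([A-Za-z][\w:]*)(\?)?\}")


def fill_template(template: str, values: dict) -> str:
    """Substitute OpenSearch template parameters with URL-encoded values."""

    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required OpenSearch parameter: {name}")

    return _PARAM.sub(substitute, template)


# Hypothetical template, as it might appear in a description document.
template = (
    "https://example.org/search?q={searchTerms}"
    "&box={geo:box?}&start={time:start?}&count={count?}"
)
url = fill_template(
    template,
    {"searchTerms": "SPOT 5", "geo:box": "1.0,43.0,2.0,44.0", "count": 10},
)
```

Reading the template from the description document rather than hard-coding query parameters keeps a client usable if the catalogue later changes its parameter names.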

This paper presents the architecture of the whole SPOT preservation process, from processing chains to data distribution with a first introduction to the SWH catalogue.

The final part of the presentation gives examples of use cases foreseen for this SPOT dataset.

How to cite: Nosavan, J., Moreau, A., and Hosford, S.: SPOT World Heritage catalogue: 30 years of SPOT 1-to-5 observation, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8275, https://doi.org/10.5194/egusphere-egu2020-8275, 2020.

D920 |
Lianchong Zhang, Guoqinng Li, Jing Zhao, and Jing Li

Carbon satellite data are an essential part of greenhouse gas observation and play a critical role in global climate change assessment. Existing e-science platforms for carbon data analysis are limited by restrictions in distributed resource management and by tightly coupled service interoperability. As a result, they offer no support for cross-disciplinary exploration and application, which has hindered the development of international cooperation. In 2018, the Cooperation on the Analysis of carbon SAtellites data (CASA), a new international scientific programme, was approved by the Chinese Academy of Sciences (CAS). So far, more than 9 research institutions have been integrated under this cooperation. The result is demonstrated in the global XCO2 dataset based on the TanSat satellite.

How to cite: Zhang, L., Li, G., Zhao, J., and Li, J.: An International Cooperation Practice on the Analysis of Carbon Satellites data, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-12650, https://doi.org/10.5194/egusphere-egu2020-12650, 2020.