EGU21-8458
https://doi.org/10.5194/egusphere-egu21-8458
EGU General Assembly 2021
© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.

The ICOS Carbon Portal as example of a FAIR community data repository supporting scientific workflows

Alex Vermeulen1,2, Margareta Hellström2, Oleg Mirzov2, Ute Karstens2, Claudio D'Onofrio2, and Harry Lankreijer2
  • 1ICOS ERIC, Carbon Portal, Lund, Sweden (alex.vermeulen@nateko.lu.se)
  • 2Department of Physical Geography and Ecosystem Sciences, Lund University, Lund, Sweden

The Integrated Carbon Observation System (ICOS) provides long-term, high-quality observations that follow (and cooperatively set) the global standards for the best possible data quality on the atmospheric composition of greenhouse gases (GHG), greenhouse gas exchange fluxes measured by eddy covariance, and CO2 partial pressure at water surfaces. The ICOS observational data feeds into a wide range of scientific fields, including plant physiology, agriculture, biology, ecology, energy and fuels, forestry, hydrology, (micro)meteorology, environmental science, oceanography, geochemistry, physical geography, remote sensing, and earth, climate and soil science, as well as combinations of these in multidisciplinary projects.
As ICOS is committed to providing all data and methods openly and transparently as free data, a dedicated system is needed to secure the long-term archiving and availability of the data, together with the descriptive metadata that is needed to find, identify, understand and properly use the data, also in the far future, following the FAIR data principles. An added requirement is that the full data lifecycle should be completely reproducible, to enable full trust in the observations and the derived data products.

In this presentation we introduce the ICOS operational data repository, the ICOS Carbon Portal, which is based on the linked open data approach. All metadata is modelled in an ontology coded in OWL and stored in an RDF triple store that is available through an open SPARQL endpoint. The repository supports versioning and collections, and models provenance through a simplified PROV-O ontology. All data objects are ingested under strict control for the identified data types, conditional on the provision of correct and sufficient (provenance) metadata, data format and data integrity. All data, including raw data, is stored in the long-term trusted repository B2SAFE with two replicates. On top of the triple store and SPARQL endpoint we have built a series of services, APIs and graphical interfaces that allow machine-to-machine and user interaction with the data and metadata. Examples are a fully faceted search with a connected data cart and download facility, previews of higher-level data products (time series of point observations and spatial data), and cloud computing services such as eddy covariance data processing and on-demand atmospheric footprint calculations, all connected to the observational data from ICOS. Another interesting development is the community support for scientific workflows using Jupyter notebook services that connect to our repository through a dedicated Python library for direct metadata and data access.
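
As an illustration of the machine-to-machine access described above, the following is a minimal sketch of how a client might query an open SPARQL endpoint and flatten the result. It uses only the Python standard library and not the dedicated Python library mentioned in the abstract, whose interface is not described here. The endpoint URL, the generic query and the function names are assumptions made for this example; the sketch further assumes that the endpoint follows the W3C SPARQL 1.1 Protocol and can return results in the standard SPARQL JSON format.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the abstract does not state the endpoint address; this URL is
# used for illustration only and may need to be replaced.
SPARQL_ENDPOINT = "https://meta.icos-cp.eu/sparql"


def build_query(limit=5):
    """Return a generic SPARQL query that lists a few typed resources.

    The query is deliberately ontology-agnostic: it does not rely on any
    class or property names from the repository's own ontology.
    """
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT ?subject ?type\n"
        "WHERE { ?subject a ?type }\n"
        f"LIMIT {limit}"
    )


def parse_bindings(response_text):
    """Flatten a SPARQL 1.1 JSON results document into a list of dicts.

    Variables that are unbound in a given row are returned as None.
    """
    doc = json.loads(response_text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        rows.append(
            {v: binding[v]["value"] if v in binding else None for v in variables}
        )
    return rows


def run_query(query, endpoint=SPARQL_ENDPOINT, timeout=30):
    """Send the query with HTTP GET (SPARQL 1.1 Protocol) and parse the reply."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

A call such as `run_query(build_query(limit=3))` would return a list of dictionaries keyed by the query variables, provided the assumed endpoint is reachable. In practice the generic query would be replaced by one written against the repository's own ontology.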

How to cite: Vermeulen, A., Hellström, M., Mirzov, O., Karstens, U., D'Onofrio, C., and Lankreijer, H.: The ICOS Carbon Portal as example of a FAIR community data repository supporting scientific workflows, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-8458, https://doi.org/10.5194/egusphere-egu21-8458, 2021.