CADS 2.0: A FAIRest Data Store infrastructure blooming in a landscape of Data Spaces.
- ECMWF, Copernicus, Reading, United Kingdom of Great Britain – England, Scotland, Wales (angel.alos@ecmwf.int)
First launched as the Climate Data Store (CDS) supporting the Climate Change Service (C3S) and later instantiated as the Atmosphere Data Store (ADS) for the Atmosphere Monitoring Service (CAMS), the shared underlying Climate & Atmosphere Data Store Infrastructure (CADS) represents the technical backbone for the implementation of the Copernicus services entrusted to ECMWF on behalf of the European Commission. The CDS also offers access to a selection of datasets from the Emergency Management Service (CEMS). As the flagship instance of the infrastructure, the CDS has more than 160k registered users and delivers a daily average of over 100 TB of data from a catalogue of 141 datasets.
The CADS software infrastructure is designed as a distributed system and open framework that facilitates improved access to a broad spectrum of data and information via a powerful service-oriented architecture offering seamless web-based and API-based search and retrieval capabilities. CADS also provides a generic software toolbox that allows users to work with the available datasets and a series of state-of-the-art data tools that can be combined into more elaborate processes, and to present results graphically in the form of interactive web applications. The CADS infrastructure is hosted in an on-premises Cloud physically located within the ECMWF Data Centre and implemented using a collection of virtual machines, networks and large data volumes. Fully customized instances of CADS, including dedicated Virtual Hardware Infrastructure, Software Application and Catalogued content, can be easily deployed thanks to automation and configuration software tools and a set of configuration files managed by a distributed version control system. Tailored scripts and templates make it easy to accommodate different standards and interoperate with external platforms.
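To make the API-based retrieval described above more concrete, the following is a minimal sketch of how a request for a catalogued dataset might be assembled on the client side. The helper `build_request` and its parameters are hypothetical and not part of CADS. The request keys (`variable`, `year`, `month`, `day`, `time`, `format`) and the dataset name are illustrative assumptions modelled on the style of requests accepted by the publicly available `cdsapi` Python client. No network call is made.

```python
from typing import Dict, List, Tuple, Union

RequestValue = Union[str, List[str]]


def build_request(
    dataset: str,
    variables: List[str],
    year: int,
    month: int,
    days: List[int],
    hours: List[int],
    data_format: str = "netcdf",
) -> Tuple[str, Dict[str, RequestValue]]:
    """Assemble a (dataset, request) pair for a catalogue retrieval.

    Hypothetical helper: it only validates and formats the inputs.
    """
    if not variables:
        raise ValueError("at least one variable is required")
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    if any(not 1 <= d <= 31 for d in days):
        raise ValueError("day out of range")
    if any(not 0 <= h <= 23 for h in hours):
        raise ValueError("hour out of range")

    request: Dict[str, RequestValue] = {
        "variable": list(variables),
        "year": f"{year:04d}",
        "month": f"{month:02d}",
        # Deduplicate and sort so equivalent requests are formatted identically.
        "day": [f"{d:02d}" for d in sorted(set(days))],
        "time": [f"{h:02d}:00" for h in sorted(set(hours))],
        "format": data_format,
    }
    return dataset, request


if __name__ == "__main__":
    # The dataset name below is an example only.
    dataset, request = build_request(
        "reanalysis-era5-single-levels", ["2m_temperature"], 2023, 1, [1, 2], [0, 12]
    )
    print(dataset, request)
    # With credentials configured, the pair could be submitted with cdsapi:
    #   import cdsapi
    #   cdsapi.Client().retrieve(dataset, request, "download.nc")
```

Keeping request construction separate from submission means a request can be checked locally before any heavy retrieval is queued on the data store.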
ECMWF, in partnership with EUMETSAT, ESA and EEA, also implements the Data and Information Access Services (DIAS) platform called WEkEO, a distributed cloud-computing infrastructure used to process the data generated by the Copernicus Services and make them accessible to users, together with derived products and all satellite data from the Copernicus Sentinels. Within the partnership, ECMWF is responsible for the procurement of the software to implement Data Access Services, Processing and Tools, whose specifications build on the same fundamentals as CADS. Adoption of FAIR principles has proven to be a cornerstone for maximizing synergies and interactions between CADS, WEkEO and other related platforms.
Driven by increasing demand and the evolving landscape of platforms and services, a major project for the modernization of the CADS infrastructure is currently underway. The coming CADS 2.0 aims to capitalize on the experience, feedback, lessons learned and know-how from the current CADS, embrace advanced technologies, engage with a broader user community, make the current platform more versatile and cloud oriented, improve workflows and methodologies, ensure compatibility with state-of-the-art solutions such as machine learning, data cubes and interactive notebooks, consolidate the adoption of FAIR principles and strengthen synergies with related platforms.
The two infrastructures are complementary: WEkEO will allow users to harness compute resources without the networking and storage costs associated with public Cloud offerings, and it is where CADS Toolbox 2.0 will be deployed and run, allowing heavy jobs (retrieval and reduction) to be submitted to the CADS 2.0 core infrastructure as services.
How to cite: Lopez Alos, A., Raoult, B., Comyn-Platt, E., and Varndell, J.: CADS 2.0: A FAIRest Data Store infrastructure blooming in a landscape of Data Spaces, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-3657, https://doi.org/10.5194/egusphere-egu23-3657, 2023.