- [1] DKRZ - German Climate Computing Center, Hamburg, Germany
- [2] TIB - Leibniz Information Centre for Science and Technology, Hannover
- [3] GEOMAR Helmholtz Centre for Ocean Research Kiel, Kiel, Germany
- [4] RWTH Aachen University, Aachen, Germany
- [5] Senckenberg - Leibniz Institution for Biodiversity and Earth System Research (SGN), Frankfurt/Main, Germany
While the goal for Earth System Sciences (ESS) is a seamless, machine-actionable and AI-ready data ecosystem, the reality is different. Current infrastructures are often fragmented into siloed “data islands” and large “data continents” (Figure 1), separated by inconsistent technical standards and metadata conventions, so that together they resemble a “data archipelago”. Despite major investments from consortia such as NFDI4Earth in Germany and Data Terra in France, these fragments are only loosely connected by semantic bridges, leaving the ESS community with a landscape of scattered repositories rather than a unified digital environment.
To address this growing fragmentation of the ESS data landscape, the BITS 2.0 project aims to establish a Semantic Fabric for ESS. Instead of merely linking individual repositories, this approach overlays heterogeneous data holdings with a shared, intelligent semantic layer. Building on the original BITS project, which successfully established a quality-controlled hub of ESS terminologies, BITS 2.0 will develop advanced, AI-powered data annotation services combined with a sustainable, community-driven governance model.
These tools will analyze and enrich diverse data assets - ranging from well-curated repositories and institutional data lakes to individual catalogues - with consistent, interoperable, and machine-actionable metadata. For researchers, this substantially lowers barriers to data discovery, integration, and reuse across sources, enabling more efficient workflows and robust cross-domain analyses. As an initial implementation, BITS 2.0 packages will be deployed on various types of data holdings. These include “data continents”, characterized by large, highly standardized data volumes serving multiple use cases (DKRZ), and “data islands”, consisting of smaller, project-specific datasets with heterogeneous or inconsistent standardization (GEOMAR). Based on these developments, BITS 2.0 will develop AI-empowered Blueprints 2.0 that provide a broadly transferable methodology for semantic integration across these scenarios (SGN, RWTH, TIB).
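To make the idea of consistent, machine-actionable annotation concrete, the following is a minimal sketch of how heterogeneous variable labels from different holdings could be mapped onto concepts from a shared terminology. It is a plain dictionary lookup, not the AI-powered services the project plans to build, and everything in it is an assumption for illustration: the `TERMINOLOGY` table, the `example.org` concept IRIs, and the names `Annotation`, `normalize`, and `annotate` are hypothetical and do not come from BITS.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Hypothetical shared terminology: concept IRI -> labels seen in the wild.
# The IRIs and label sets are illustrative placeholders, not BITS content.
TERMINOLOGY = {
    "https://example.org/ess/sea_surface_temperature": {
        "sea surface temperature", "sst",
    },
    "https://example.org/ess/air_temperature": {
        "air temperature", "tas", "t2m",
    },
}


@dataclass(frozen=True)
class Annotation:
    source_label: str            # label exactly as found in the data holding
    concept_iri: Optional[str]   # matched concept, or None if unmatched


def normalize(label: str) -> str:
    """Lower-case, turn underscores into spaces, collapse whitespace."""
    return " ".join(label.lower().replace("_", " ").split())


def annotate(labels: Iterable[str]) -> List[Annotation]:
    """Attach a shared concept IRI to each label where one is known."""
    index = {
        normalize(known): iri
        for iri, known_labels in TERMINOLOGY.items()
        for known in known_labels
    }
    return [Annotation(label, index.get(normalize(label))) for label in labels]


if __name__ == "__main__":
    # Labels as they might appear in a standardized archive and in a
    # project-specific dataset (both invented for this example).
    for annotation in annotate(["SST", "Sea_Surface_Temperature", "chl_a"]):
        print(annotation)
```

Labels that resolve to the same concept IRI become discoverable together regardless of their original spelling, while unmatched labels stay visible with `concept_iri=None` so they can be reviewed rather than silently dropped.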
BITS 2.0 is envisioned as a trusted semantic enabler for the emerging hybrid ESS data space, providing the essential “semantic glue” required for meaningful interoperability. By transforming a fragmented infrastructure landscape into a coherent, searchable knowledge space, BITS 2.0 will support the combined use of larger and more diverse datasets to address complex Earth System research questions.
Figure 1: The scattered landscape of ESS data infrastructures, with varying challenges for semantic integration and AI-readiness depending on architectural design, depicted here as “data islands”, “data continents” and “data archipelagos”.
How to cite: Lammert, A., Anders, I., Ganske, A., Geisler, S., Kraft, A., Martens, C., Mehrtens, H., Söding, E., Thiemann, H., Weiland, C., and Wolodkin, A.: Blueprints for Semantic Integration and AI-Readiness (BITS 2.0), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10237, https://doi.org/10.5194/egusphere-egu26-10237, 2026.