EGU26-11768, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-11768
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Friday, 08 May, 16:15–18:00 (CEST), Display time Friday, 08 May, 14:00–18:00 | Hall X4, X4.100
A Cloud-Native GNSS Data Lakehouse for Scalable Ingestion, Processing, and Analysis
Nils Brinckmann and Markus Bradke
  • GFZ Helmholtz Centre for Geosciences, Geoinformation, Potsdam, Germany (nils.brinckmann@gfz-potsdam.de)

The rapid growth of Global Navigation Satellite System (GNSS) observations, driven by dense station networks, high-rate data streams, and the modernisation of satellite constellations, places increasing demands on data centers in terms of scalability, reliability, and reproducibility. Traditional monolithic GNSS data management systems are often difficult to scale and to adapt to evolving processing and analysis workflows. To address these challenges, we are developing a cloud-native GNSS data center architecture based on container orchestration and streaming technologies.

Our system is built on Kubernetes to enable flexible deployment, horizontal scalability, and fault tolerance of GNSS services. Data ingestion is handled through Apache Kafka, which provides a robust, high-throughput messaging backbone for streaming GNSS observations from heterogeneous sources. This approach decouples data producers and consumers, allowing independent scaling of ingestion, processing, and downstream analytics.
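
The decoupling described above can be illustrated with a minimal sketch. It is an in-memory stand-in for a single partition of a Kafka-style topic, not the Kafka client API and not the implementation presented here; the class `TopicLog`, the consumer group names, and the record fields are hypothetical. It shows only the core idea: producers append to a log, and each consumer group advances its own offset independently.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TopicLog:
    """Hypothetical in-memory stand-in for one partition of a Kafka-style topic."""
    records: list = field(default_factory=list)
    # One read offset per consumer group (assumed names, not real Kafka state).
    offsets: dict = field(default_factory=lambda: defaultdict(int))

    def produce(self, record):
        # Producers only append; they know nothing about who reads the log.
        self.records.append(record)
        return len(self.records) - 1  # offset of the appended record

    def poll(self, group, max_records=10):
        # Each group reads from its own offset, so groups progress independently.
        start = self.offsets[group]
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for epoch in range(5):
    # Illustrative observation records; station ID and fields are assumptions.
    log.produce({"station": "STA1", "epoch": epoch})

archive_batch = log.poll("archiver", max_records=5)  # fast consumer
monitor_batch = log.poll("monitor", max_records=2)   # slower consumer
```

Because the slower group keeps its own position, it can catch up later without affecting the producer or the other group, which is the property that allows ingestion, processing, and analytics to scale separately.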

For long-term storage and analytical access, GNSS data are loaded via extract-transform-load (ETL) pipelines into an Apache Iceberg data lakehouse. Iceberg provides schema evolution, partition management, and ACID (Atomicity, Consistency, Isolation, and Durability) guarantees, enabling efficient access to large time-series GNSS datasets for both batch and interactive analysis.
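
As a rough illustration of why partition management matters for time-series access, the sketch below groups observation records by station and UTC day, which resembles combining an identity transform on a station column with a day transform on a timestamp column. The record schema (`station`, `epoch_s`, `snr_db`) and the choice of partition key are assumptions made for this example, not the table layout used in the system, and the code does not call any Iceberg API.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(record):
    """Return (station, UTC date) for a record; the schema is hypothetical."""
    ts = datetime.fromtimestamp(record["epoch_s"], tz=timezone.utc)
    return (record["station"], ts.date().isoformat())


def group_by_partition(records):
    """Group records so that a query for one station-day touches one bucket."""
    partitions = defaultdict(list)
    for rec in records:
        partitions[partition_key(rec)].append(rec)
    return dict(partitions)


# Illustrative records; values are made up for the example.
records = [
    {"station": "STA1", "epoch_s": 0, "snr_db": 45.0},
    {"station": "STA1", "epoch_s": 86399, "snr_db": 44.0},
    {"station": "STA1", "epoch_s": 86400, "snr_db": 46.0},
    {"station": "STA2", "epoch_s": 30, "snr_db": 41.0},
]
parts = group_by_partition(records)
```

A query restricted to one station and one day then only needs to read the matching bucket, which is the kind of pruning that a partitioned table format makes possible at scale.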

System performance, data flow, and service health are continuously monitored using Prometheus, with operational and scientific metrics visualized through Grafana dashboards. This monitoring framework facilitates operational stability, performance optimization, and transparent reporting of data latency and availability.
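
To make the latency reporting concrete, the following sketch computes per-station data latency and renders it in the Prometheus text exposition format, which a scrape target would serve over HTTP. The metric name `gnss_data_latency_seconds`, the `station` label, and the sample values are assumptions for illustration; the metrics actually exported by the system are not specified here.

```python
def render_latency_metrics(samples):
    """Render (station, observed_s, arrived_s) tuples as Prometheus text format.

    The metric name and label are hypothetical, chosen for this example.
    """
    lines = [
        "# HELP gnss_data_latency_seconds Delay between observation epoch and arrival.",
        "# TYPE gnss_data_latency_seconds gauge",
    ]
    for station, observed_s, arrived_s in samples:
        latency = arrived_s - observed_s  # seconds between observation and arrival
        lines.append(
            f'gnss_data_latency_seconds{{station="{station}"}} {latency:.3f}'
        )
    return "\n".join(lines) + "\n"


# Made-up sample values: observation time and arrival time in seconds.
text = render_latency_metrics([("STA1", 100.0, 101.25), ("STA2", 100.0, 103.5)])
```

In practice an official Prometheus client library would normally handle the rendering and the HTTP endpoint; the manual version is shown only to expose what a dashboard ultimately queries.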

We present the overall system design, implementation details, and initial performance results, and discuss how this architecture improves scalability, resilience, and reproducibility compared to conventional GNSS data centers. The proposed approach provides a flexible foundation for next-generation GNSS services and can be extended to other geodetic and Earth observation data streams.

How to cite: Brinckmann, N. and Bradke, M.: A Cloud-Native GNSS Data Lakehouse for Scalable Ingestion, Processing, and Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11768, https://doi.org/10.5194/egusphere-egu26-11768, 2026.