- (1) MARIS, Nootdorp, Netherlands (tjerk@maris.nl)
- (2) MARIS, Nootdorp, Netherlands (peter@maris.nl)
- (3) MARIS, Nootdorp, Netherlands (dick@maris.nl)
- (4) MARIS, Nootdorp, Netherlands (robin@maris.nl)
- (5) MARIS, Nootdorp, Netherlands (paul@maris.nl)
To give users fast and easy access to multidisciplinary data from large collections, MARIS has developed Beacon, a high-performance software system that extracts specific data on the fly in response to a user’s request. The software has been customised and deployed in the Blue-Cloud 2026 project and several other European projects. It is designed to return a single harmonised file as output, even when the input contains different data types. Beacon is fully open source (AGPLv3), so anyone can set up their own Beacon ‘node’ to improve access to their data, or use existing Beacon nodes from well-known data infrastructures such as Euro-Argo, ERA5 or the World Ocean Database for fast and easy access to harmonised data subsets. More technical details, example applications and general information on Beacon can be found at https://beacon.maris.nl/.
Within Blue-Cloud 2026, Beacon is deployed to give the WorkBenches (WBs) access to harmonised subsets from Blue Data Infrastructures. The WBs aim to generate harmonised and validated data collections of Essential Ocean Variables (EOVs). To this end, a set of monolithic Beacon nodes has been set up for relevant data collections such as the World Ocean Database (WOD), CMEMS Cora and Euro-Argo, among others. These nodes are made available on the D4Science e-infrastructure as part of the Blue-Cloud Virtual Research Environment (VRE), giving access to all registered Blue-Cloud users.
Going one step further, the output from multiple monolithic Beacon instances is combined into one merged Beacon node for each WB. Each merged node includes a structural mapping from every contributing monolithic Beacon instance to the target Common Metadata Profile defined by the WB teams. These mappings are used in the Beacon queries that retrieve contents ‘as-is’ from the monolithic Beacon instances and load them into the merged Beacon instance, which gives a common structure for variables, units, values, quality flags, and Common Metadata Profile fields. The structured metadata and data are supplemented by any additional metadata available for each of the monolithic Beacon instances.
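The idea of a structural mapping can be illustrated with a minimal Python sketch. It is not Beacon code and does not use the Beacon query API: the source names (`source_a`, `source_b`), the field names, the target units and the conversion functions are all hypothetical placeholders, and whether a given mapping converts values or only relabels them is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


def identity(value: float) -> float:
    """Default conversion: keep the value as-is."""
    return value


@dataclass(frozen=True)
class FieldMapping:
    """Maps one field of a source collection to a common-profile field."""
    source_field: str                               # name in the source collection
    target_field: str                               # name in the common profile
    target_unit: str                                # unit used in the common profile
    convert: Callable[[float], float] = identity    # optional unit conversion


# Hypothetical mappings: every name and unit below is an illustrative assumption.
MAPPINGS: Dict[str, List[FieldMapping]] = {
    "source_a": [
        FieldMapping("TEMP", "temperature", "degree_Celsius"),
        FieldMapping("PRES", "pressure", "dbar"),
    ],
    "source_b": [
        FieldMapping("temp_kelvin", "temperature", "degree_Celsius",
                     lambda k: k - 273.15),
        FieldMapping("pres_kpa", "pressure", "dbar",
                     lambda kpa: kpa / 10.0),        # 1 dbar = 10 kPa
    ],
}


def harmonise(source: str, record: Dict[str, float]) -> Dict[str, object]:
    """Rewrite one source record into the common structure."""
    out: Dict[str, object] = {"source": source}
    for mapping in MAPPINGS[source]:
        if mapping.source_field in record:
            out[mapping.target_field] = mapping.convert(record[mapping.source_field])
            out[mapping.target_field + "_unit"] = mapping.target_unit
    return out


def merge(batches: Dict[str, List[Dict[str, float]]]) -> List[Dict[str, object]]:
    """Combine records from several sources into one harmonised list."""
    return [harmonise(source, record)
            for source, records in batches.items()
            for record in records]


merged = merge({
    "source_a": [{"TEMP": 10.0, "PRES": 5.0}],
    "source_b": [{"temp_kelvin": 283.15, "pres_kpa": 50.0}],
})
```

After merging, every record carries the same field names and units regardless of its origin, and the `source` field keeps track of which collection it came from.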
This presentation will give an overview of the Blue-Cloud 2026 project and the development of the merged Beacon nodes, explaining how they can serve in practice as data lakes for many VRE applications and how the approach can be extended to other domains. Examples from the WBs will highlight the reduction in the time and effort researchers spend collecting data.
How to cite: Krijger, T., Thijsse, P., Schaap, D., Kooyman, R., and Weerheim, P.: Blue-Cloud2026 project - Deploying Beacon data lakes for harmonizing ocean data access for Virtual Research Environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10490, https://doi.org/10.5194/egusphere-egu26-10490, 2026.