EGU23-3269
https://doi.org/10.5194/egusphere-egu23-3269
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

HCDC datasearch portal: Replacing legacy solutions with a unified open-source portal

Linda Baldewein, Housam Dibeh, Philipp S. Sommer, and Ulrike Kleeberg
  • Helmholtz-Zentrum Hereon, Geesthacht, Germany (linda.baldewein@hereon.de)

In Earth System Sciences, seemingly every new project and research initiative develops its own data portal. But what happens to existing solutions that are in dire need of a software update? We introduce the HCDC datasearch portal (https://hcdc.hereon.de/datasearch/), an open-source software solution that combines data from a legacy database, file storage systems, OGC-compliant web services, and a World Data Center. The portal provides a common interface to all of our heterogeneous data sources, allowing users to select and download data products using filters on metadata and spatio-temporal information.

A scalable and easily extensible new portal replaces three legacy portal solutions at Helmholtz-Zentrum Hereon. It is built on an Elasticsearch cluster in the back-end and offers a user-friendly web interface as well as a machine-readable API in the front-end. To ensure that the software fits the users' workflows, a stakeholder group was involved from the early planning stages through to the release of the final product.
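To illustrate how metadata and spatio-temporal filters could be combined in a single request to an Elasticsearch back-end, the following is a minimal sketch in Python. It only builds a query body in the Elasticsearch query DSL (a `bool` query with `term`, `geo_shape`, and `range` filters). The function name `build_search_query`, the field names `platform`, `parameter`, `geometry`, `time_start`, and `time_end`, and the example values are assumptions made for illustration and are not taken from the portal's actual index mapping or API.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_search_query(
    platform: Optional[str] = None,
    parameter: Optional[str] = None,
    bbox: Optional[Tuple[float, float, float, float]] = None,
    time_range: Optional[Tuple[str, str]] = None,
) -> Dict[str, Any]:
    """Build an Elasticsearch query body from optional filters.

    All field names are hypothetical; a real index would define its own mapping.
    bbox is (min_lon, min_lat, max_lon, max_lat); time_range is (start, end)
    given as ISO 8601 date strings.
    """
    filters: List[Dict[str, Any]] = []

    # Exact-match filters on harmonized metadata fields.
    if platform is not None:
        filters.append({"term": {"platform": platform}})
    if parameter is not None:
        filters.append({"term": {"parameter": parameter}})

    # Spatial filter: datasets whose geometry intersects the bounding box.
    # An envelope is given as [[min_lon, max_lat], [max_lon, min_lat]].
    if bbox is not None:
        min_lon, min_lat, max_lon, max_lat = bbox
        filters.append({
            "geo_shape": {
                "geometry": {
                    "shape": {
                        "type": "envelope",
                        "coordinates": [[min_lon, max_lat], [max_lon, min_lat]],
                    },
                    "relation": "intersects",
                }
            }
        })

    # Temporal filter: dataset coverage overlaps the requested interval.
    if time_range is not None:
        start, end = time_range
        filters.append({"range": {"time_start": {"lte": end}}})
        filters.append({"range": {"time_end": {"gte": start}}})

    return {"query": {"bool": {"filter": filters}}}


# Example: one platform, a bounding box, and the year 2020.
query = build_search_query(
    platform="ferrybox",  # hypothetical vocabulary term
    bbox=(6.0, 53.0, 9.0, 55.0),
    time_range=("2020-01-01", "2020-12-31"),
)
```

Placing every condition in the `filter` clause means the conditions are combined with a logical AND and do not influence relevance scoring, which suits a faceted search in which a dataset either matches the selection or does not.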

Extensibility is ensured by storing only metadata within the portal. Data access and download are configured separately for each decentralized storage solution, e.g. a local database or a World Data Center. Harmonization of metadata is crucial for the user experience of the portal. We limited the searchable metadata to 14 fields in addition to geospatial and temporal metadata, including information such as the platform from which the data originate and the parameter that was measured. Whenever possible, controlled vocabularies were used. Because the data are heterogeneous, ranging from climate model results to long-tail biogeochemical campaign data, this harmonization is an ongoing process.
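The metadata-only design described above can be sketched as follows: a record holds harmonized metadata plus a pointer to where the data live, and download access is resolved per storage type. This is a minimal sketch under stated assumptions, not the portal's implementation. The class `MetadataRecord`, the vocabulary terms, the storage keys `legacy_db` and `world_data_center`, and the `example.org` URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical controlled vocabulary for the "platform" field.
PLATFORM_VOCABULARY = {"ferrybox", "research vessel", "climate model"}


@dataclass(frozen=True)
class MetadataRecord:
    """Metadata stored in the portal; the data themselves stay at the source."""
    identifier: str
    platform: str
    parameter: str
    storage: str   # key naming the decentralized storage solution
    locator: str   # identifier of the dataset within that storage


def validate_platform(record: MetadataRecord) -> None:
    """Raise ValueError if the platform is not a controlled vocabulary term."""
    if record.platform not in PLATFORM_VOCABULARY:
        raise ValueError(f"unknown platform term: {record.platform!r}")


# One URL builder per storage solution (placeholder URLs). Supporting a new
# data source then only requires registering another entry here.
ACCESS_BUILDERS: Dict[str, Callable[[str], str]] = {
    "legacy_db": lambda loc: f"https://example.org/db/export?id={loc}",
    "world_data_center": lambda loc: f"https://example.org/wdc/{loc}",
}


def resolve_download(record: MetadataRecord) -> str:
    """Return the download URL for a record based on its storage type."""
    try:
        builder = ACCESS_BUILDERS[record.storage]
    except KeyError:
        raise ValueError(f"no access configured for storage {record.storage!r}")
    return builder(record.locator)


record = MetadataRecord(
    identifier="rec-001",
    platform="ferrybox",
    parameter="salinity",
    storage="legacy_db",
    locator="42",
)
validate_platform(record)
url = resolve_download(record)
```

Because the record carries only a storage key and a locator, the search index stays independent of how each source serves its data, which is one way to read the extensibility claim made above.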

The HCDC datasearch portal provides an example of the challenges and opportunities of combining data from distributed data sources through a single entry-point based on state-of-the-art web technologies. It can be used to discuss the challenges of re-using legacy solutions in a continually progressing research data infrastructure world.

How to cite: Baldewein, L., Dibeh, H., Sommer, P. S., and Kleeberg, U.: HCDC datasearch portal: Replacing legacy solutions with a unified open-source portal, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-3269, https://doi.org/10.5194/egusphere-egu23-3269, 2023.