- ¹Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities, Garching b.M., Germany
- ²Technische Universität Dresden, Geoinformatics, Dresden, Germany
At the Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities, we have set up a portal and system to publish large datasets from simulations, assign DOIs, and present landing pages. This "LRZ FAIR Data Portal", currently in demonstrator status, is based on InvenioRDM, with the framework serving as a presentation layer. The login and data-upload functions typical of repositories are disabled in this setup. Instead, the portal presents metadata together with a dataset-specific link to the GLOBUS (www.globus.org) data-transfer service, to which LRZ is connected. With a free GLOBUS login, users can reliably copy the data to many other supercomputing centres or download them to their laptops. The system is intended to make FAIR (Findable, Accessible, Interoperable, Reusable) those datasets that are produced at LRZ and are too large to be moved to institutional, community, or general-purpose research-data repositories.
We are currently developing a mechanism to automatically ingest metadata from LRZ storage systems into InvenioRDM. Users who wish to publish their data store metadata in a DataCite-centric format, and the mechanism scans these users' volumes for metadata to be published. The datasets are thus presented in the portal automatically. The necessary workflows are harmonized and developed with the partners from the Gauss Centre for Supercomputing (HLRS and JSC, the two other largest German supercomputing centres) within the InHPC-DE project. The InvenioRDM instance also provides an OAI-PMH interface, which enables the harvesting of metadata. This makes datasets stored with us discoverable in other services, such as earth-data.de. A corresponding exporter that filters datasets according to the Dewey Decimal Classification has been developed as part of the NFDI4Earth project (see NFDI4Earth Knowledge Hub: knowledgehub.nfdi4earth.de).
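As a rough illustration of the scanning step, the following minimal Python sketch walks a storage volume, collects metadata files, and checks them for a few properties that DataCite treats as mandatory. It is not the LRZ implementation: the file name `datacite.json`, the function names, and the choice of checked keys are assumptions made for this example, and the DOI is left out of the check on the assumption that it is assigned at publication time.

```python
import json
from pathlib import Path

# Hypothetical convention: each dataset to be published contains this file.
METADATA_FILENAME = "datacite.json"

# Assumed subset of DataCite mandatory properties (DataCite JSON key names).
REQUIRED_KEYS = ("creators", "titles", "publisher", "publicationYear", "types")


def missing_keys(record):
    """Return the required keys that are absent or empty in a metadata record."""
    return [key for key in REQUIRED_KEYS if not record.get(key)]


def scan_volume(root):
    """Walk a volume and split metadata files into ready and rejected ones."""
    ready, rejected = [], []
    for path in sorted(Path(root).rglob(METADATA_FILENAME)):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError) as err:
            rejected.append((path, f"unreadable: {err}"))
            continue
        if not isinstance(record, dict):
            rejected.append((path, "not a JSON object"))
            continue
        missing = missing_keys(record)
        if missing:
            rejected.append((path, "missing: " + ", ".join(missing)))
        else:
            ready.append((path, record))
    return ready, rejected
```

Calling `scan_volume("/path/to/volume")` returns two lists: records that could be handed on to an ingest step, and files that would need attention from the data owner, each with a short reason.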
In the current demonstrator status of our LRZ FAIR Data Portal, the portal and a preliminary push mechanism are used for the publication of the first "friendly-user" datasets. In particular, an ERA5-based downscaled meteorological data suite (3 km resolution, 40-year timespan) has been published via the portal. We report on our experience publishing this dataset, as well as further ESS-related datasets from other projects we are working on. The advantages and limitations of the approach are discussed in relation to our concept of a general-purpose data publication mechanism for huge datasets produced at a supercomputing centre. We also shed some light on the differences from, and the potential of, domain-specific publication mechanisms (including the terrabyte satellite-data platform at LRZ), highlighting how shortcomings of publication approaches can be addressed and opportunities leveraged.
How to cite: Munke, J., Wellmann, A., Muralidharan, M., Henzen, C., and Hachinger, S.: ESS Data Publication at a "General-Purpose" Supercomputing Centre, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3315, https://doi.org/10.5194/egusphere-egu25-3315, 2025.