- 1Université Gustave Eiffel, Inria, COSYS Laboratory, I4S Team, F-44344, Bouguenais, France
- 2Université Gustave Eiffel, Inria, COSYS Laboratory, I4S Team, F-35700, Rennes, France
Abstract
Efficient and secure dataset management is a critical component of collaborative research projects, where diverse data types, sharing requirements, and compliance regulations converge. This work presents a dataset management tool named DAM2 (Data and Model Monitoring), developed within the European BRIGHTER project [1], which is funded by the Chips Joint Undertaking (Chips JU), to address these challenges. It provides a robust and adaptable solution for handling private and public ground-based measurement datasets throughout the project lifecycle. These datasets combine infrared images (e.g. multispectral ones) with visible images, local weather measurements, labeled data, etc.
The tool is designed to enforce rights management, enabling selective data sharing among authorized partners based on predefined permissions. It incorporates secure access controls to safeguard sensitive data and meets GDPR (General Data Protection Regulation) requirements to guarantee compliance with European privacy standards. For public datasets, the tool integrates with Zenodo, an open-access repository, to support long-term storage and accessibility, in line with the principles of open science. Key technical features include the use of an open-source, S3-compatible object storage server (MinIO [2]), which provides the scalability needed to manage large volumes of data. Additionally, the use of the Zarr [3] data format behind the scenes offers significant advantages for this cloud-based data management tool, including efficient storage of large datasets through chunking and compression, fast parallel read and write operations, and compatibility with a wide range of data analysis tools. The tool adheres to the FAIR (Findable, Accessible, Interoperable, Reusable) principles, storing metadata alongside datasets to enhance usability and interoperability.
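To make the chunking, compression, and metadata ideas above concrete, the following is a minimal conceptual sketch using only the Python standard library. It is not the DAM2 implementation and it does not use the Zarr or MinIO APIs: a plain dictionary stands in for an object storage bucket, and the names `write_chunked`, `read_rows`, `CHUNK_ROWS`, the key layout, and the sample metadata are all hypothetical choices made for illustration.

```python
import json
import zlib
from array import array

CHUNK_ROWS = 2  # hypothetical chunk height (image rows per chunk)


def write_chunked(store, name, rows, metadata):
    """Store a 2D image as independently compressed row chunks plus a metadata entry."""
    n_cols = len(rows[0])
    for start in range(0, len(rows), CHUNK_ROWS):
        block = rows[start:start + CHUNK_ROWS]
        # array("f") holds C floats (4 bytes on common platforms)
        flat = array("f", [value for row in block for value in row])
        store[f"{name}/{start // CHUNK_ROWS}"] = zlib.compress(flat.tobytes())
    # Metadata is kept next to the data, under the same key prefix
    description = {"shape": [len(rows), n_cols], "chunk_rows": CHUNK_ROWS, **metadata}
    store[f"{name}/.meta"] = json.dumps(description).encode()


def read_rows(store, name, first, last):
    """Return rows [first, last), decompressing only the chunks that overlap the range."""
    meta = json.loads(store[f"{name}/.meta"])
    n_cols = meta["shape"][1]
    step = meta["chunk_rows"]
    selected = []
    for index in range(first // step, (last - 1) // step + 1):
        flat = array("f")
        flat.frombytes(zlib.decompress(store[f"{name}/{index}"]))
        for r in range(len(flat) // n_cols):
            row_number = index * step + r
            if first <= row_number < last:
                selected.append(list(flat[r * n_cols:(r + 1) * n_cols]))
    return selected


# Usage: a dict stands in for the object store (assumption, not a real bucket)
store = {}
image = [[20.0 + r + 0.25 * c for c in range(4)] for r in range(5)]
write_chunked(store, "ir_frame", image, {"sensor": "hypothetical-ir-camera", "unit": "degC"})
subset = read_rows(store, "ir_frame", 2, 4)
```

Because each chunk is compressed and stored under its own key, a reader can fetch a sub-region without downloading the whole array, and separate chunks can be read or written in parallel. A real deployment would rely on an actual Zarr library and an S3 client rather than this hand-rolled layout.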
Developed as an open-source platform, the tool promotes transparency and collaboration while providing a complete and well-documented API for seamless integration with other systems. A user-friendly interface ensures accessibility for stakeholders with varying technical expertise, while the tool remains flexible enough to accommodate additional file formats as required. The development process incorporates insights from relevant COFREND (French Confederation for Non-Destructive Testing) working groups to ensure alignment with broader initiatives in data management, interoperability and durability.
This paper describes the design, the underlying study, and the platform developed. Initial operational functionalities are demonstrated on the first datasets from BRIGHTER and other research projects.
In conclusion, DAM2 is a comprehensive solution for managing diverse datasets in collaborative projects, balancing security, compliance, and accessibility. It provides a foundation for efficient, compliant, and interoperable data handling while supporting the principles of open science and FAIR data management.
Perspectives include expanding interoperability with additional repositories, incorporating advanced analytics and visualization features, and integrating AI-driven automation.
Acknowledgments
The authors would like to acknowledge the BRIGHTER HORIZON project. BRIGHTER has received funding from the Chips Joint Undertaking (JU) under grant agreement No 101096985. The JU receives support from the European Union’s Horizon Europe research and innovation program and from France, Belgium, Portugal, Spain, and Turkey.
References
[1] Brighter - Project-Brighter. https://project-brighter.eu/, accessed in January 2025.
[2] MinIO, Inc. MinIO S3 Compatible Storage for AI - Min.Io. https://min.io/, accessed in January 2025.
[3] Zarr - Zarr.dev. https://zarr.dev/, accessed in January 2025.
How to cite: Dumoulin, J., Toullier, T., Gey, N., and Malandain, M.: DAM2 — A Scalable and Compliant Solution for Managing enriched Infrared images as FAIR Research Data , EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-11937, https://doi.org/10.5194/egusphere-egu25-11937, 2025.