- Ifremer, IRSI, SeBiMER Service de Bioinformatique de l'Ifremer, F-29280 Plouzané, France
The study of marine environments increasingly relies on DNA sequencing, whether to study a single organism (assembly and annotation of its genome) and its biological functions (gene expression) or, more broadly, to understand biodiversity at the scale of an entire ecosystem (via environmental DNA). Modern sequencing techniques, including miniaturized sequencers that can be used in the field, are revolutionizing our understanding of marine ecosystems. For some years now, these techniques have offered relatively easy access to the DNA present in samples from the entire water column, from the surface of the Global Ocean to the abyss and its sediments. Once the samples are brought ashore to biology and bioinformatics laboratories, exploring the resulting data offers unique insight into the immense quantity of microorganisms and their interactions.
In this context, SeBiMER (Ifremer's marine bioinformatics department) has set up a comprehensive FAIR information system to ensure the traceability of environmental DNA sequencing data, from acquisition to analysis and publication. The system is structured around two key components: on the one hand, it follows the international standards of the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena) for collecting, describing, and distributing the Institute's DNA data; on the other hand, it integrates a cutting-edge Nextflow-based pipeline for data analysis. The athENA [1] data management system is based on three complementary software packages: EGIDE, for automated metadata collection and formatting according to ENA checklists; athENA-pipeline, for metadata validation and data upload to ENA; and athENA-Manager, a web-based platform that lets data managers monitor data projects. Together, these packages form the SeBiMER information system, which, used in conjunction with the Sextant portal (sextant.ifremer.fr), enables Ifremer's marine bioinformatics data to be managed according to the principles of open science. To date, this protocol has been applied to more than 190 reference datasets (genomes, transcriptomes, metabarcoding, metagenomics), enhancing the value and accessibility of marine data.
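As an illustration of what checklist-based metadata validation can look like, the minimal Python sketch below checks one sample record against a small set of mandatory fields. It is not taken from EGIDE or athENA-pipeline: the field names (`collection_date`, `latitude`, `longitude`, `depth_m`), the value ranges, and the function `validate_sample` are hypothetical and are not copied from any actual ENA checklist.

```python
from datetime import date


def is_iso_date(value):
    """Return True if value is a calendar date written as YYYY-MM-DD."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def is_in_range(low, high):
    """Build a checker that accepts numbers between low and high, inclusive."""
    def check(value):
        try:
            return low <= float(value) <= high
        except ValueError:
            return False
    return check


# Hypothetical checklist: mandatory field name -> validation function.
# Real ENA checklists define their own field names and rules.
CHECKLIST = {
    "collection_date": is_iso_date,
    "latitude": is_in_range(-90.0, 90.0),
    "longitude": is_in_range(-180.0, 180.0),
    "depth_m": is_in_range(0.0, 11000.0),
}


def validate_sample(sample):
    """Return a list of error messages; an empty list means the record passes."""
    errors = []
    for name, check in CHECKLIST.items():
        value = sample.get(name)
        if value is None or value == "":
            errors.append(f"{name}: missing")
        elif not check(str(value)):
            errors.append(f"{name}: invalid value {value!r}")
    return errors


if __name__ == "__main__":
    record = {"collection_date": "03/06/2024", "latitude": "120", "longitude": "-4.56"}
    for message in validate_sample(record):
        print(message)
```

Collecting every error for a record, instead of stopping at the first one, lets a data manager correct a whole submission in a single pass before upload.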
To enhance the standardization and reproducibility of environmental DNA (eDNA) analyses, SeBiMER has developed a FAIR-compliant workflow named SAMBA [2], built with the Nextflow workflow manager. SAMBA's modular and flexible design offers researchers a powerful, user-friendly solution that can be customized to suit specific research questions and marker genes. Additionally, its comprehensive statistical analysis capabilities support robust interpretation of biological data whatever their origin (sediments, water column, coastal samples) and sampling schedule, from a single event to time series that enable the study of biodiversity over time. By integrating advanced bioinformatics tools within an accessible framework, SAMBA provides a valuable alternative to complex command-line tool suites for eDNA metabarcoding analysis. To date, the athENA+SAMBA framework has been applied to many marine eDNA analysis projects, such as [3,4,5].
References.
[1] https://gitlab.ifremer.fr/bioinfo/workflows/athena
[2] https://gitlab.ifremer.fr/bioinfo/workflows/samba
[3] https://doi.org/10.1016/j.hal.2024.102627
[4] https://doi.org/10.3390/molecules29040774
[5] https://doi.org/10.1016/j.pocean.2023.102999
How to cite: Durand, P., Auffret, P., Goudenege, D., Noel, C., Leroi, L., and Cormier, A.: athENA and SAMBA: the Global Marine Biodiversity Data Management and Analysis Framework, One Ocean Science Congress 2025, Nice, France, 3–6 Jun 2025, OOS2025-1428, https://doi.org/10.5194/oos2025-1428, 2025.