Knowing the type and distribution of seafloor sediments is crucial for many purposes, including marine spatial planning and nature conservation. Seabed sediment maps are typically obtained by manually or automatically classifying data recorded by swath sonar systems such as multibeam echosounders (MBES), aided by ground-truth data.
While progress has been made to map the seafloor based on acoustic data in an automated way, such methods have not advanced enough to become operational for routine map production in geological surveys. Mapping seafloor sediments is therefore still a manual and partly subjective process, which may imply a lack of repeatability.
In recent years, deep learning using convolutional neural networks (CNNs) has shown great promise for image classification applied in domains such as satellite or biomedical image analysis, and there is an increasing interest in the use of CNNs for seabed image classification.
In this work, we evaluate the performance of semantic segmentation using a U-Net CNN for the purpose of classifying seafloor acoustic images into sediment types.
Our study site is an area of 576 km² in the Søre Sunnmøre region, where seafloor sediments have been manually mapped by the Geological Survey of Norway (NGU). For our initial investigation, we simplified the NGU map into two classes – soft sediment and hard substrate – and trained multiple U-Net networks to predict the sediment classes using an MBES bathymetry grid and a seabed backscatter image mosaic as source datasets. Our training reference was the expertly delineated sediment map, and the method thus seeks to mimic the human observer. In this initial analysis, the input features were the acoustic backscatter and bathymetry data themselves, together with slope and hillshade images derived from the bathymetry grid.
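As an illustration of how slope and hillshade layers can be derived from a bathymetry grid, the following is a minimal NumPy sketch. It is not the processing chain used in this work: the function name `slope_and_hillshade`, the default illumination angles (azimuth 315°, altitude 45°) and the aspect convention are assumptions chosen for the example, and GIS packages differ in these details.

```python
import numpy as np


def slope_and_hillshade(depth, cell_size=1.0, azimuth_deg=315.0, altitude_deg=45.0):
    """Return (slope in degrees, hillshade in 0..1) for a 2-D elevation grid.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(np.asarray(depth, dtype=float), cell_size)

    # Slope is the angle of steepest descent of the local surface.
    slope = np.arctan(np.hypot(dz_dx, dz_dy))

    # Aspect convention is an assumption; sign and axis conventions vary
    # between software packages.
    aspect = np.arctan2(-dz_dx, dz_dy)

    azimuth = np.deg2rad(azimuth_deg)
    zenith = np.deg2rad(90.0 - altitude_deg)

    # Standard Lambertian shading of the surface by a distant light source.
    shade = (np.cos(zenith) * np.cos(slope)
             + np.sin(zenith) * np.sin(slope) * np.cos(azimuth - aspect))
    return np.rad2deg(slope), np.clip(shade, 0.0, 1.0)
```

With a 1 m grid, `cell_size=1.0` matches the pixel size stated for the imagery; any other resolution would need to be passed explicitly.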
The MBES imagery was pre-processed and divided into patches of 256 m × 256 m (where 1 m = 1 image pixel). We evaluated four models that each used a single input layer (backscatter mosaic, bathymetry grid, hillshade, or slope) and three models that used two input layers (hillshade & depth, hillshade & backscatter, and slope & backscatter). Performance was evaluated using the Dice score (DS), a relative measure of overlap between the predicted and reference maps.
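The patching step can be sketched as follows, assuming co-registered 2-D rasters of equal shape. The helper `tile_patches` is hypothetical, and dropping incomplete patches at the raster border is an assumption; the pre-processing actually applied (for example padding or overlapping patches) is not described here.

```python
import numpy as np


def tile_patches(layers, patch=256):
    """Stack 2-D layers into (H, W, C) and cut non-overlapping square patches.

    Hypothetical helper for illustration. Returns an array of shape
    (n_patches, patch, patch, C), ordered row by row.
    """
    stack = np.stack(layers, axis=-1)
    h, w, c = stack.shape
    rows, cols = h // patch, w // patch

    # Drop the ragged border so that every patch is complete (an assumption).
    stack = stack[:rows * patch, :cols * patch]

    # Split both spatial axes into (block index, offset within block),
    # then bring the two block indices together.
    tiles = stack.reshape(rows, patch, cols, patch, c).swapaxes(1, 2)
    return tiles.reshape(rows * cols, patch, patch, c)
```

A single-layer model would pass a one-element list, for example `tile_patches([backscatter])`, and a dual-layer model two rasters, for example `tile_patches([slope, backscatter])`, where the variable names are placeholders.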
Interestingly, among the models using a single data source, the hillshade and slope models performed best, with a DS of approximately 0.85, followed by the backscatter model (DS = 0.8) and the depth model (DS = 0.7). Among the models using dual data sources, the backscatter/slope & depth model showed improved results (DS = 0.9), whereas the hillshade & depth model showed a lower DS (0.7).
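For reference, the Dice score of a predicted mask A against a reference mask B is 2|A ∩ B| / (|A| + |B|). A minimal sketch for one binary class is given below; the function name `dice_score` and the small smoothing term `eps` are assumptions for the example, and how scores were aggregated over classes or patches in this study is not specified here.

```python
import numpy as np


def dice_score(pred, ref, eps=1e-7):
    """Dice score between two binary masks (1.0 means perfect overlap).

    Hypothetical helper for illustration; eps avoids division by zero
    when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    intersection = np.logical_and(pred, ref).sum()
    return (2.0 * intersection + eps) / (pred.sum() + ref.sum() + eps)
```

In a two-class setting such as soft sediment versus hard substrate, the score could be computed for either class by passing the corresponding boolean masks.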
Our preliminary results demonstrate the potential of using a U-Net to classify seafloor sediments from MBES data, so far with two sediment classes. Assuming that the human observer has correctly annotated the seabed sediments, such an approach could help to automate seafloor mapping in future applications. Further work will provide an in-depth analysis of feature importance, improve the models by using additional input layers, and use data that include several relevant sediment classes.