ITS1.2/OS4.10 | Machine Learning for ocean science
EDI
Convener: Julien Brajard | Co-conveners: Aida Alvera-Azcárate, Rachel Furner, Redouane Lguensat (ECS)
Orals
| Fri, 19 Apr, 08:30–12:30 (CEST)
 
Room E2
Posters on site
| Attendance Thu, 18 Apr, 16:15–18:00 (CEST) | Display Thu, 18 Apr, 14:00–18:00
 
Hall X5
Posters virtual
| Attendance Thu, 18 Apr, 14:00–15:45 (CEST) | Display Thu, 18 Apr, 08:30–18:00
 
vHall X4
Machine learning (ML) methods have emerged as powerful tools to tackle various challenges in ocean science, encompassing physical oceanography, biogeochemistry, and sea ice research.
This session aims to explore the application of ML methods in ocean science, with a focus on advancing our understanding and addressing key challenges in the field. Our objective is to foster discussions, share recent advancements, and explore future directions in the field of ML methods for ocean science.
A wide range of machine learning techniques can be considered including supervised learning, unsupervised learning, interpretable techniques, and physics-informed and generative models. The applications to be addressed span both observational and modeling approaches.

Observational approaches include for example:
- Identifying patterns and features in oceanic fields
- Filling observational gaps of in-situ or satellite observations
- Inferring unobserved variables or unobserved scales
- Automating quality control of data

Modeling approaches can address (but are not restricted to):
- Designing new parameterization schemes in ocean models
- Emulating partially or completely ocean models
- Parameter tuning and model uncertainty

The session also welcomes submissions at the interface between modeling and observations, such as data assimilation, data-model fusion, or bias correction.


Orals: Fri, 19 Apr | Room E2

Chairpersons: Julien Brajard, Rachel Furner
DATA ASSIMILATION
08:30–08:40
|
EGU24-4587
|
On-site presentation
Huijie Xue, Zihao Wang, and Yuan Wang

The Indonesian Throughflow (ITF) plays a vital role in the global ocean circulation and climate system. The intricate labyrinth of passages around the Indonesian Seas poses a grand challenge in monitoring and understanding the throughflow in the region. In this study, we employ a deep-learning approach to examine to what degree known sea level variations can determine the main in- and outflows through the Indonesian Seas. The approach is first validated using the simulated environment from a regional circulation model. Our results show that Recurrent Neural Network (RNN) models can represent well the temporal variations of throughflows across the Indonesian Seas. Moreover, the skills can be significantly improved if aided by time series of transport from a small number of passages. We also apply the trained model to satellite-derived sea surface height to design more effective allocations of observation assets.

How to cite: Xue, H., Wang, Z., and Wang, Y.: Applying Deep-learning Models in Observation Simulation Experiments of Throughflows Across the Indonesian Seas, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4587, https://doi.org/10.5194/egusphere-egu24-4587, 2024.

08:40–08:50
|
EGU24-17731
|
On-site presentation
Matteo Broccoli, Andrea Cipollone, and Simona Masina

Global ocean numerical models typically have their first vertical level about 0.5 m below the sea surface. However, a key physical quantity like the sea surface temperature (SST) can be retrieved from satellites at a reference depth of a few microns or millimeters below the sea surface. Assimilating such temperatures can introduce bias into the ocean models, and it is thus necessary to project the satellite retrievals to the first model level to use them safely in the assimilation process. This projection is non-trivial, since it depends on several factors (e.g., the daily cycle, winds, latitude) and is usually performed either with computationally expensive numerical models or with overly simple statistical methods.

In this work we present an attempt to construct the projection operator with machine learning techniques. We consider three different networks: a convolutional neural network architecture called U-Net, first introduced in the field of computer vision and image segmentation and thus well suited to processing satellite retrievals; a pix2pix network, which is a U-Net trained adversarially against a patch-classifier discriminator; and a random forest model, a more traditional machine learning technique. We train the networks with L3 global subskin SST from AVHRR's infrared channels on MetOp satellites produced by OSISAF, together with 10-m wind speed analyses by ECMWF, to reproduce the ESA SST CCI and C3S global SST reprocessed product by CMEMS, which we take as ground truth during training and validation. The pix2pix network is the most effective in the projection, and we thus choose it to shape an observation operator for CMCC's OceanVar assimilation system.

Finally, we compare several one-year-long reanalysis-like experiments, based on the CMCC reanalysis system, that assimilate the SST in different ways (e.g., nudging, unbiased approach, as observation operator). We discuss the potential impact of such a new scheme in providing the best surface ocean state estimate.

How to cite: Broccoli, M., Cipollone, A., and Masina, S.: Towards an Observation Operator for Satellite Retrievals of Sea Surface Temperature with Convolutional Neural Network, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17731, https://doi.org/10.5194/egusphere-egu24-17731, 2024.

08:50–09:00
|
EGU24-17199
|
ECS
|
Highlight
|
On-site presentation
Jeffrey van der Voort, Martin Verlaan, and Hanne Kekkonen

One of the disadvantages of oceanographic models is that they can be very computationally expensive. When combined with data assimilation, dynamical approaches such as the EnKF become expensive as they need a large number of ensemble members and thus model runs. In this work we investigate the use of a Multi-Fidelity Ensemble Kalman Filter (MF-EnKF), where the low-fidelity model is a machine-learned surrogate and the high-fidelity model is the original full model. The idea is to combine an ensemble of a few expensive full-model runs with an ensemble of many cheap but less accurate surrogate-model runs. In this way we can reach similar or better accuracy with fewer full-model runs and thus less computational time. We test the approach on a simple atmospheric model, the Lorenz-96 model, and an oceanographic model, the quasi-geostrophic model. Results show that the MF-EnKF outperforms the EnKF for the same number of full-model runs, and that the MF-EnKF can reach similar or improved accuracy with fewer full-model runs.
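As background for readers less familiar with ensemble data assimilation, the stochastic (perturbed-observation) EnKF analysis step that both the standard EnKF and the multi-fidelity variant build on can be sketched in a few lines of NumPy. This is a generic illustration, not the authors' code; the linear observation operator, dimensions, and ensemble sizes are illustrative assumptions.

```python
import numpy as np

def enkf_analysis(ensemble, obs, H, obs_cov, rng):
    """One stochastic (perturbed-observation) EnKF analysis step.

    ensemble : (n_state, n_members) forecast ensemble
    obs      : (n_obs,) observation vector
    H        : (n_obs, n_state) linear observation operator
    obs_cov  : (n_obs, n_obs) observation-error covariance R
    """
    n_members = ensemble.shape[1]
    # Ensemble anomalies around the ensemble mean
    A = ensemble - ensemble.mean(axis=1, keepdims=True)
    HA = H @ A
    # Sample covariances: state-observation cross term and innovation term
    P_xy = A @ HA.T / (n_members - 1)
    P_yy = HA @ HA.T / (n_members - 1) + obs_cov
    K = P_xy @ np.linalg.inv(P_yy)  # Kalman gain
    # Each member assimilates its own perturbed copy of the observations
    obs_pert = rng.multivariate_normal(obs, obs_cov, size=n_members).T
    return ensemble + K @ (obs_pert - H @ ensemble)

# Toy usage: a 3-variable state, fully observed, ensemble biased away from obs
rng = np.random.default_rng(0)
forecast = np.ones((3, 50)) + 0.5 * rng.standard_normal((3, 50))
analysis = enkf_analysis(forecast, np.zeros(3), np.eye(3), 0.01 * np.eye(3), rng)
```

In the multi-fidelity setting, updates of this kind are applied to a small high-fidelity ensemble together with a large surrogate ensemble; the sketch above shows only the single-fidelity building block.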

How to cite: van der Voort, J., Verlaan, M., and Kekkonen, H.: A Multi-Fidelity Ensemble Kalman Filter with a machine learned surrogate model, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17199, https://doi.org/10.5194/egusphere-egu24-17199, 2024.

09:00–09:07
ML FOR INSIGHTS
09:07–09:17
|
EGU24-21905
|
On-site presentation
Maike Sonnewald, William Yik, Mariana CA Clare, and Redouane Lguensat

The Southern Ocean closes the global overturning circulation and is key to the regulation of carbon, heat, biological production, and sea level. However, the dynamics of the general circulation and its leading-order controls remain poorly understood, in part because of the challenge of characterizing and tracking changes in ocean physics in complex models. This gap in understanding is especially problematic in the face of climate change. Here, we wish to understand changes in the dynamics of the Southern Ocean under climate change, specifically how bathymetric controls on the general circulation could impact the location of major currents and affect upwelling. We use a suite of CMIP models for our analysis. A physics-informed equation-discovery framework guided by machine learning is used to partition and interpret the dynamics and to understand spatial structures, and a supervised learning framework that quantifies its uncertainty and explains its predictions is leveraged to track change; the combined method is called Tracking global Heating with Ocean Regimes (THOR). A region undergoing a profound shift is where the Antarctic Circumpolar Current intersects with bathymetry, for example at the Pacific-Antarctic Ridge. We see major changes in areas associated with upwelling between the CMIP models, suggesting that changes in wind stress alter the control that bathymetry exerts in the historical scenario. For example, we find that the Antarctic Circumpolar Current shifts north under intensifying wind stress where it meets the Pacific-Antarctic Ridge. We note associated changes in the regions where gyre circulation favors upwelling, with spatial distributions varying between models. Our efforts go towards a better understanding of which dynamics are driving changes, and could allow a reduction of bias between models and decreased uncertainties in future projections.

How to cite: Sonnewald, M., Yik, W., Clare, M. C., and Lguensat, R.: Discovering Dominant Controls on Southern Ocean Dynamics Under Climate Change: New Knowledge Through Physics-Guided Machine Learning , EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-21905, https://doi.org/10.5194/egusphere-egu24-21905, 2024.

09:17–09:27
|
EGU24-120
|
ECS
|
On-site presentation
Cristina Radin and Veronica Nieves

Ocean regional climate variability is a part of the Earth's complex system that can influence the occurrence and intensity of extreme weather events. Variability in ocean temperature can either amplify or mitigate the impact of these events. For example, the El Niño phenomenon affects weather conditions in various parts of the world, leading to droughts, floods, and altered precipitation patterns. Furthermore, regional climate variability is also linked to changes in sea level. Understanding regional variability is crucial for predicting how sea level changes will vary in different parts of the world, which has profound implications for coastal communities and infrastructure. To contribute to this understanding, we have developed a novel method that combines K-means clustering and Principal Component Analysis to extract ocean climate modes at a regional scale worldwide. This integrated approach automatically identifies regions of variability, allowing for the emulation of coastal and regional sea level variations across multiple timescales. It also has the potential to offer valuable insights into the significance of temperature across multiple depth layers extending up to 700 meters. The resulting set of regional sea-level emulators is a complementary source of information in coastal areas, especially in situations where satellite altimetry encounters challenges and/or tide-gauge sensor records are incomplete, thereby supporting well-informed decision-making.
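The abstract does not give implementation details, but the general pattern of combining K-means clustering and PCA to isolate regional modes of variability can be sketched as follows, on synthetic data standing in for gridded temperature or sea level anomalies. The two-step order (cluster first, PCA per region) and the cluster count are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Synthetic anomalies: 240 time steps at 100 "grid points", built from
# two regional modes plus noise (a stand-in for gridded SST/SSH anomalies).
t = np.arange(240)
mode_a = np.sin(2 * np.pi * t / 60)   # slow oscillation, points 0-49
mode_b = np.cos(2 * np.pi * t / 24)   # faster oscillation, points 50-99
X = np.hstack([
    np.outer(mode_a, rng.uniform(0.5, 1.5, 50)),
    np.outer(mode_b, rng.uniform(0.5, 1.5, 50)),
]) + 0.1 * rng.standard_normal((240, 100))

# Step 1: K-means on the standardized per-point time series groups grid
# points that co-vary into the same "region".
profiles = (X - X.mean(0)) / X.std(0)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(profiles.T)

# Step 2: PCA within each region yields that region's leading climate mode.
modes = [PCA(n_components=1).fit_transform(X[:, labels == k])[:, 0]
         for k in range(2)]
```

On real fields the number of clusters and the depth layers used would of course be chosen from the data, not fixed in advance as here.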

How to cite: Radin, C. and Nieves, V.: Exploring Regional Ocean Climate Variability: Insights from Integrated Clustering and Principal Component Analysis., EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-120, https://doi.org/10.5194/egusphere-egu24-120, 2024.

09:27–09:37
|
EGU24-8942
|
ECS
|
On-site presentation
Etienne Pauthenet, Elodie Martinez, Thomas Gorgues, Joana Roussillon, Lucas Drumetz, Ronan Fablet, and Maïlys Roux

Phytoplankton sustains marine ecosystems and influences global carbon dioxide levels through photosynthesis. To grow, phytoplankton rely on nutrient availability in the upper sunlit layer, closely related to ocean dynamics and specifically ocean stratification. Human-caused climate change is responsible, among other things, for an increase in global temperature and regional modifications of winds, thus affecting the stratification of the ocean's surface. Consequently, phytoplankton biomass is expected to be impacted by these environmental changes. While most existing studies focus on one or two satellite products to investigate phytoplankton trends in the global ocean, in this study we analyze surface chlorophyll-a concentration (chl-a), a proxy for phytoplankton biomass, using six merged satellite products from January 1998 to December 2020. Significant regional discrepancies are observed among the different products, displaying opposing trends. To distinguish trends arising from changes in the physical ocean from those potentially resulting from sensor biases, a convolutional neural network is employed to examine the relationship between chl-a and physical ocean variables (sea surface temperature, sea surface height, sea surface currents, wind, and solar radiation). The training is conducted over 2002-2009, when the number of merged sensors is constant, and chl-a is reconstructed over 2010-2020. Our results suggest that the merging algorithm of the Globcolour Garver, Siegel, Maritorena (GSM) bio-optical model is not reliable for trend detection. Specifically, changes in chl-a after 2016 are not supported by changes in the physical ocean but rather by the introduction of the VIIRS sensor. These results emphasize the need for a careful interpretation of chl-a trends and highlight the potential of machine learning to study the evolution of marine ecosystems.

How to cite: Pauthenet, E., Martinez, E., Gorgues, T., Roussillon, J., Drumetz, L., Fablet, R., and Roux, M.: Chlorophyll-a satellite climate time series: How machine learning can help distinguish between bias and consistency, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8942, https://doi.org/10.5194/egusphere-egu24-8942, 2024.

09:37–09:44
PHYSICS INFORMED METHODS
09:44–09:54
|
EGU24-3372
|
ECS
|
On-site presentation
Bo Qin

El Niño-Southern Oscillation (ENSO) events have significant impacts on global climate change, and research on their accurate forecasting and dynamic predictability holds remarkable scientific and engineering value. In recent years, we have constructed two ENSO deep learning forecasting models, ENSO-ASC and ENSO-GTC, both incorporating prior ENSO dynamic mechanisms. Specifically, the former possesses the multivariate air-sea coupler (ASC), which can simulate the occurrence and decay of ENSO events, accompanied by concurrent energy interactions among multiple physical variables in the Pacific Ocean. The latter possesses the global teleconnection coupler (GTC), which can modulate the significant teleconnections of global ocean basins rather than the isolated interactions in the Pacific Ocean. In terms of forecasting skill, the Niño 3.4 index correlation skills of the two models reach 0.78/0.65/0.50 (0.79/0.66/0.51) at 6/12/18-month lead, i.e., an effective forecasting lead of more than 18 months, outperforming Ham et al.'s Nature-published ENSO forecasting model. Tests on the past year's (2022) forecasts show that the average forecast error of the two models is 0.156, less than 10% of the actual ENSO amplitudes. It is worth noting that the two models also encounter the spring predictability barrier (SPB), but show a marked improvement compared to numerical models. From the perspective of ENSO predictability, zonal and meridional winds are two sensitive predictors for ENSO forecasting besides sea surface temperature (SST), as they contribute greatly to the Bjerknes positive feedback and WES mechanisms.
The Walker circulation, acting as the "atmospheric bridge", induces teleconnections among the three oceans: it can derive easterly wind anomalies in the equatorial western Pacific from the Indian Ocean, and the North Pacific meridional mode in the northeastern Pacific from the Atlantic Ocean, promoting ENSO event development and decay.

How to cite: Qin, B.: Two Physics-informed Enso Deep Learning Forecasting Models: ENSO-ASC and ENSO-GTC, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-3372, https://doi.org/10.5194/egusphere-egu24-3372, 2024.

09:54–10:04
|
EGU24-18663
|
ECS
|
On-site presentation
Luther Ollier, Roy El Hourany, and Marina Levy

Understanding phytoplankton community dynamics in response to environmental shifts is crucial for assessing the impact of climate change on marine biology. To this end, satellite observations offer a dataset spanning two decades, capturing diverse sea surface parameters, including temperature, ocean color, and surface height. Notably, ocean color data are processed to derive sea surface chlorophyll-a concentration, widely acknowledged as a reliable proxy for phytoplankton biomass.

Lately, advances in ocean color observation allow us to describe the phytoplankton community structure in terms of groups (broad functional or taxonomic groups) or size classes. Although these advances provide more detailed information on phytoplankton diversity and structure, these datasets suffer from spatial and temporal coverage limitations due to strict quality control in the presence of atmospheric aerosols, clouds, sea ice, etc. As a result, studies examining phytoplankton trends over the past two decades and future projections rely on incomplete chlorophyll-a and ocean color data, which compromises the identification of consistent trends within phytoplankton datasets.

In this study, we address this issue using a deep-learning approach. Our method constructs an attention network that learns from the available satellite dataset of Chl-a and phytoplankton size class images (weekly, with one-degree-degraded spatial resolution) while assimilating information from gap-free sea surface physics data originating from satellite observations and assimilated numerical models. The primary objective is to estimate the phytoplankton dataset based on the knowledge of physical factors, while filling the gaps within this dataset.

The trained deep-learning model allows us to discern patterns and correlations between chlorophyll concentration and the phytoplankton size classes on one hand, and the physics-based data on the other hand. From a phytoplankton weekly database spanning from 1997 to 2020, with 50% missing pixels, our approach demonstrates promising results in replicating chlorophyll concentration and accurately inferring phytoplankton size classes.

The methodology shows the potential of deep-learning for robust ecological applications but mainly lays the groundwork for future trend studies on phytoplankton communities.

How to cite: Ollier, L., El Hourany, R., and Levy, M.: Linking Satellite and physics-informed Data with Phytoplankton communities Using Deep Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18663, https://doi.org/10.5194/egusphere-egu24-18663, 2024.

10:04–10:10
CARBON CYCLE
Coffee break
Chairpersons: Aida Alvera-Azcárate, Redouane Lguensat
10:45–10:55
|
EGU24-13571
|
ECS
|
Virtual presentation
Cristhian Asto, Anthony Bosse, Alice Pietri, François Colas, Raphaëlle Sauzède, and Dimitri Gutiérrez

The Peruvian coastal upwelling system (PCUS) is one of the most productive in the world ocean. The Peruvian Marine Research Institute (IMARPE) has been monitoring the PCUS since the 1960s, with an increase in the frequency and spatial distribution of measurements since the early 2000s, focusing on temperature, salinity and oxygen. In recent years, autonomous gliders have started to be routinely deployed by IMARPE, collecting a large number of profiles. However, there is still a gap in the high-resolution sampling of biogeochemical parameters such as nutrients (nitrate, phosphate and silicate).

New methods using machine learning to reconstruct missing data have been developed recently with promising results (Sauzède et al., 2017; Bittig et al., 2018; Fourrier et al., 2020). In particular, a recent global approach using neural networks (NN) named CANYON-B (CArbonate system and Nutrients concentration from hYdrological properties and Oxygen using a Neural network) was developed to fill those gaps and infer nutrient concentrations from the more frequently sampled variables of temperature, salinity and oxygen (Bittig et al., 2018).

In this work we show the application of this global CANYON-B algorithm to the PCUS using all of IMARPE's historical CTD casts. Moreover, we trained a new NN and compared its outputs with those from CANYON-B, demonstrating the benefits of training the NN with the extensive regional dataset collected by IMARPE.
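As a rough illustration of the CANYON-B-style inference described above (not the authors' or Bittig et al.'s actual model), a small neural network can be trained to map temperature, salinity and oxygen to a nutrient concentration. Here scikit-learn's MLPRegressor is used on synthetic data with a made-up nitrate relation; every value and coefficient below is an illustrative assumption.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic "CTD" training set: temperature (degC), salinity (PSU),
# oxygen (umol/kg); nitrate built from an invented monotonic relation
# (real training would use bottle-sample nutrient measurements).
n = 2000
temp = rng.uniform(4, 25, n)
sal = rng.uniform(34, 36, n)
oxy = rng.uniform(20, 250, n)
nitrate = 35 - 1.2 * temp + 2.0 * (sal - 35) - 0.05 * oxy \
          + rng.normal(0, 0.5, n)

X = np.column_stack([temp, sal, oxy])
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
)
model.fit(X, nitrate)

# Infer nitrate for a new profile level from T, S and O2 alone
print(model.predict([[10.0, 34.9, 120.0]]))
```

Retraining such a regressor on a dense regional dataset, as the authors do, amounts to refitting the same mapping on data that better samples the local water masses.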

How to cite: Asto, C., Bosse, A., Pietri, A., Colas, F., Sauzède, R., and Gutiérrez, D.: Application of a Neural Network Algorithm to Estimate the Nutrients Concentration in the Peruvian Upwelling System, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13571, https://doi.org/10.5194/egusphere-egu24-13571, 2024.

10:55–11:05
|
EGU24-6735
|
On-site presentation
Gian Giacomo Navarra, Aakash Sane, and Curtis Deutsch

To elucidate the complex dynamics of zooplankton grazing and its impact on the organic carbon pump, we leveraged machine learning algorithms to analyze extensive datasets encompassing zooplankton behavior, environmental variables, and carbon flux measurements. Specifically, we employed regression models to establish predictive relationships between zooplankton grazing rates and key environmental factors, such as potential temperature, sea ice extent and iron availability.

The results demonstrate the potential of machine learning in discerning patterns and nonlinear relationships within the data, offering insights into the factors influencing zooplankton grazing dynamics. Additionally, the models provide a predictive framework to estimate the contribution of zooplankton to the organic carbon pump under varying environmental conditions. We have further analyzed the results using two explainable-AI methods, Layer-wise Relevance Propagation and Integrated Gradients, which indicate which physical variables contribute to the prediction.

This research contributes to our understanding of the intricate processes governing carbon sequestration in the ocean, with implications for climate change mitigation and marine ecosystem management. Machine learning techniques help unravel the complexities of zooplankton-mediated carbon flux, paving the way for more accurate predictions and proactive conservation strategies in the face of global environmental changes.
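Integrated Gradients, one of the two attribution methods mentioned, assigns each input feature the product of its deviation from a baseline and the average gradient of the model output along the straight path from baseline to input. A minimal numerical sketch, with finite-difference gradients and a toy function standing in for the trained regression model (the function and variable ordering are invented for illustration):

```python
import numpy as np

def integrated_gradients(f, x, baseline, steps=100):
    """Approximate Integrated Gradients attributions for a scalar function f.

    attribution_i ≈ (x_i - baseline_i) * mean gradient of f along the
    straight path from baseline to x (midpoint rule, finite differences).
    """
    grads = np.zeros_like(x)
    eps = 1e-5
    for a in (np.arange(steps) + 0.5) / steps:
        point = baseline + a * (x - baseline)
        for i in range(len(x)):
            d = np.zeros_like(x)
            d[i] = eps
            grads[i] += (f(point + d) - f(point - d)) / (2 * eps)
    return (x - baseline) * grads / steps

# Toy "model": output depends on temperature and iron, but not on sea ice,
# so the sea-ice attribution should come out (near) zero.
f = lambda v: 2.0 * v[0] + v[0] * v[2]            # v = [temp, ice, iron]
attr = integrated_gradients(f, np.array([1.0, 5.0, 2.0]), np.zeros(3))
```

A useful property to check is completeness: the attributions sum to f(x) minus f(baseline), which is what lets them be read as each variable's contribution to the prediction.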

How to cite: Navarra, G. G., Sane, A., and Deutsch, C.: Analyzing Zooplankton grazing spatial variability in the Southern Ocean using deep learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6735, https://doi.org/10.5194/egusphere-egu24-6735, 2024.

11:05–11:15
|
EGU24-14839
|
ECS
|
Virtual presentation
Piyu Ke, Xiaofan Gui, Wei Cao, Dezhi Wang, Ce Hou, Lixing Wang, Xuanren Song, Yun Li, Biqing Zhu, Jiang Bian, Stephen Sitch, Philippe Ciais, Pierre Friedlingstein, and Zhu Liu

The ocean plays a critical role in modulating climate change by absorbing atmospheric CO2. Timely and geographically detailed estimates of the global ocean-atmosphere CO2 flux provide an important constraint on the global carbon budget, offering insights into temporal changes and regional variations in the global carbon cycle. However, previous estimates of this flux have a one-year delay and cannot monitor the most recent changes in the global ocean carbon sink. Here we present a near-real-time, monthly grid-based dataset of global surface ocean CO2 fugacity and ocean-atmosphere CO2 flux from January 2022 to July 2023, called Carbon Monitor Ocean (CMO-NRT). The data have been derived by updating the estimates from 10 Global Ocean Biogeochemical Models and 8 data products in the Global Carbon Budget 2022 to a near-real-time framework. This is achieved by employing Convolutional Neural Networks and semi-supervised learning methods to learn the non-linear relationship between the estimates from models or products and the observed predictors. The goal of this dataset is to offer a more immediate, precise, and comprehensive understanding of the global ocean-atmosphere CO2 flux. This advancement enhances the capacity of scientists and policymakers to monitor and respond effectively to alterations in the ocean's CO2 absorption, thereby contributing significantly to climate change management.

How to cite: Ke, P., Gui, X., Cao, W., Wang, D., Hou, C., Wang, L., Song, X., Li, Y., Zhu, B., Bian, J., Sitch, S., Ciais, P., Friedlingstein, P., and Liu, Z.: Near-real-time monitoring of global ocean carbon sink based on CNN, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14839, https://doi.org/10.5194/egusphere-egu24-14839, 2024.

11:15–11:24
MULTISOURCE DATA
11:24–11:34
|
EGU24-16166
|
ECS
|
On-site presentation
Zhongkun Hong, Di Long, Xingdong Li, Yiming Wang, Jianmin Zhang, Mohamed Hamouda, and Mohamed Mohamed

Ocean color data are essential for developing our understanding of biological and ecological phenomena and processes, and are also an important source of input for physical and biogeochemical ocean models. Chlorophyll-a (Chl-a) is a critical ocean color variable in the marine environment. Quantitative retrieval from satellite remote sensing is the main way to obtain large-scale oceanic Chl-a. However, missing data are a major limitation in satellite remote-sensing-based Chl-a products, due mostly to the influence of clouds, sun glint contamination, and high satellite viewing angles. Common methods to reconstruct (gap-fill) missing data often consider the spatiotemporal information of the initial images alone, such as Data Interpolating Empirical Orthogonal Functions, optimal interpolation, Kriging interpolation, and the extended Kalman filter. However, these methods do not perform well in the presence of large-scale missing values and overlook valuable information available from other datasets. Here, we developed a convolutional neural network (CNN) named Ocean Chlorophyll-a concentration reconstruction by convolutional neural NETwork (OCNET) for Chl-a concentration reconstruction in open-ocean areas, considering environmental variables associated with ocean phytoplankton growth and distribution. Sea surface temperature (SST), salinity (SAL), photosynthetically active radiation (PAR), and sea surface pressure (SSP) from reanalysis data and satellite observations were selected as inputs of OCNET to correlate the environment with phytoplankton biomass. The developed OCNET model achieves good performance in reconstructing global open-ocean Chl-a concentration data and captures their spatiotemporal variations. The reconstructed Chl-a data are available online at https://doi.org/10.5281/zenodo.10011908. This study also shows the potential of machine learning for large-scale ocean color data reconstruction and offers the possibility of predicting Chl-a concentration trends in a changing environment.
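A dependency-light sketch of the gap-filling idea described above, with a random-forest regressor deliberately standing in for OCNET's CNN, and synthetic fields standing in for SST, SAL, PAR and SSP (all names, ranges and relations here are illustrative assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)

# Synthetic co-located predictors on a flattened grid, plus a Chl-a field
# that depends on some of them (values are invented for illustration).
n = 5000
sst = rng.uniform(0, 30, n)       # sea surface temperature
sal = rng.uniform(33, 37, n)      # salinity
par = rng.uniform(10, 60, n)      # photosynthetically active radiation
ssp = rng.uniform(980, 1040, n)   # sea surface pressure
chl = np.exp(-0.1 * sst) + 0.02 * par + 0.05 * rng.standard_normal(n)

X = np.column_stack([sst, sal, par, ssp])

# Simulate cloud gaps: train only where Chl-a was "observed" ...
observed = rng.random(n) > 0.5
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[observed], chl[observed])

# ... then fill the gaps from the environmental predictors alone,
# which are gap-free in reanalysis products.
chl_filled = np.where(observed, chl, model.predict(X))
```

The key point the sketch shares with OCNET is that the predictors come from gap-free reanalysis/satellite fields, so a reconstruction is available even where the Chl-a retrieval itself is entirely missing; a CNN additionally exploits spatial context, which this per-pixel stand-in does not.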

How to cite: Hong, Z., Long, D., Li, X., Wang, Y., Zhang, J., Hamouda, M., and Mohamed, M.: A global daily gap-filled chlorophyll-a dataset in open oceans during 2001–2021 from multisource information using convolutional neural networks, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-16166, https://doi.org/10.5194/egusphere-egu24-16166, 2024.

11:34–11:44
|
EGU24-17465
|
On-site presentation
Théo Archambault, Pierre Garcia, Anastase Alexandre Charantonis, and Dominique Béréziat

The Sea Surface Height (SSH) is an important variable of the ocean state. It is currently estimated by satellites measuring the return time of a radar pulse. Due to this remote sensing technology, nadir-pointing altimeters take measurements vertically, only along their ground tracks. Recovering fully gridded SSH fields involves a challenging spatiotemporal interpolation. The most widely used operational product, the Data Unification and Altimeter Combination System (DUACS), combines data from several satellites through linear optimal interpolation to estimate the SSH field. However, several studies demonstrate that DUACS does not resolve mesoscale structures, motivating our interest in improving interpolation methods. Recently, Deep Learning has emerged as one of the leading methods to solve ill-posed inverse imaging problems. Deep Neural Networks can use multi-variate information to constrain the interpolation. Among them, Sea Surface Temperature (SST) data are based on a different remote-sensing technology, which leads to higher data coverage and resolution. Deep Learning methods have been proposed to interpolate SSH from track measurements, efficiently using SST contextual information. However, training neural networks usually requires either a realistic simulation of the problem on which we have access to SSH ground truth, or a loss function that does not require it. Both solutions present limitations: the first is likely to suffer from domain-gap issues once applied to real-world data, and training on observations only leads to lower performance than supervision on complete fields. We propose a hybrid method: supervised pretraining on a realistic simulation, followed by unsupervised fine-tuning on real-world observations. This approach was performed using a deep attention-based encoder-decoder architecture.
We compare the performance of the same neural network architecture trained in the three described settings: simulation-based training, observation-based training, and our hybrid approach. Preliminary results show an improvement of approximately 25% over DUACS in the interpolation task on the Ocean Data Challenge 2021 dataset. We further explore the ability of the proposed architecture to produce near-real-time forecasts of SSH.
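The difference between the supervised pretraining and the observation-only fine-tuning described above comes down to where the reconstruction loss is evaluated: everywhere on simulated fields, but only along altimeter ground tracks on real data. A toy sketch of the two loss functions (the straight-line tracks and the synthetic SSH field are illustrative assumptions, not the authors' setup):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 32x32 SSH scene and a sparse nadir-track sampling mask:
# "observations" exist only along a few idealized satellite ground tracks.
y = np.fromfunction(lambda i, j: np.sin(i / 5) + np.cos(j / 7), (32, 32))
mask = np.zeros((32, 32), dtype=bool)
for col in (4, 13, 22, 29):
    mask[:, col] = True

def supervised_loss(pred, truth):
    """Pretraining loss on simulation: full SSH fields are available."""
    return np.mean((pred - truth) ** 2)

def observation_loss(pred, obs, mask):
    """Fine-tuning loss on real data: penalize only along-track pixels."""
    return np.mean((pred - obs)[mask] ** 2)

# A noisy "reconstruction" is scored under both regimes
pred = y + 0.1 * rng.standard_normal(y.shape)
print(supervised_loss(pred, y), observation_loss(pred, y, mask))
```

The hybrid strategy first minimizes the dense loss on simulation, then switches to the masked loss on real tracks, which is what lets the network adapt to real-world statistics without ever seeing a complete real SSH field.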

How to cite: Archambault, T., Garcia, P., Charantonis, A. A., and Béréziat, D.: Deep Sea Surface Height Multivariate Interpolation, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17465, https://doi.org/10.5194/egusphere-egu24-17465, 2024.

11:44–11:54
|
EGU24-3954
|
On-site presentation
Tran-Vu La, Minh-Tan Pham, and Marco Chini

The development of the world economy in recent years has been accompanied by a significant increase in maritime traffic. Accordingly, numerous ship collision incidents, especially in dense maritime traffic zones, have been reported with damage, including oil spills, transportation interruption, etc. To improve maritime surveillance and minimize incidents over the seas, satellite imagery provided by synthetic aperture radar (SAR) and optical sensors has become one of the most effective and economical solutions in recent years. Indeed, both SAR and optical images can be used to detect vessels of different sizes and categories, thanks to their high spatial resolutions and wide swath.

To process large volumes of satellite data, Deep Learning (DL) has become an indispensable solution for detecting ships with a high accuracy rate. However, DL models require time and effort to implement, especially for training, validating, and testing with big datasets. This issue is more significant when different satellite imagery datasets are used for ship detection, because data preparation tasks are multiplied. Therefore, this paper investigates various approaches for applying DL models trained and tested on different datasets with varying spatial resolutions and radiometric features. Concretely, we focus on two aspects of ship detection from multi-source satellite imagery that have not been discussed attentively in the literature. First, we compare the performance of DL models trained on one high-resolution (HR) or medium-resolution (MR) dataset with that of models trained on combined HR and MR datasets. Second, we compare the performance of DL models trained on an optical or SAR dataset and tested on the other. Likewise, we evaluate the performance of DL models trained on a combined SAR and optical dataset. The objective of this work is to answer a practical question for ship detection in maritime surveillance, especially in emergency cases: can DL models trained on one dataset be applied directly to others that differ in spatial resolution and radiometric features, without supplementary steps such as data preparation and model retraining?

Even when dealing with a limited number of training images, the approaches proposed in this study performed satisfactorily, improving average precision by 5–20% depending on the optical images tested. Likewise, DL models trained on the combined optical and radar dataset could be applied to both optical and radar images. Our experiments showed that models trained on an optical dataset could be used on radar images, whereas those trained on a radar dataset scored very poorly when applied to optical images.
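
The average-precision figures quoted above follow the standard detection metric. As a self-contained illustration (our sketch with hypothetical names, not the authors' code), all-point-interpolated AP can be computed from scored detections as follows:

```python
import numpy as np

def average_precision(scores, is_tp, n_ground_truth):
    """All-point-interpolated average precision (VOC style).

    scores: detection confidences; is_tp: 1 if the detection matched a
    ground-truth ship, else 0; n_ground_truth: number of true ships.
    """
    order = np.argsort(scores)[::-1]                 # rank by confidence
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / n_ground_truth
    precision = cum_tp / (cum_tp + cum_fp)
    # precision envelope: make precision non-increasing from the right
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_r) * p                       # area under the PR curve
        prev_r = r
    return ap
```

A cross-dataset comparison like the one described above then amounts to evaluating this metric for a model trained on one dataset and scored on another.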

How to cite: La, T.-V., Pham, M.-T., and Chini, M.: Collocation of multi-source satellite imagery for ship detection based on Deep Learning models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-3954, https://doi.org/10.5194/egusphere-egu24-3954, 2024.

11:54–12:03
EMULATORS
12:03–12:13
|
EGU24-19104
|
Highlight
|
On-site presentation
Jesper Sandvig Mariegaard, Emil Siim Larsen, and Allan Peter Engsig-Karup

Physics-based coastal ocean models provide vital insights into local and regional coastal dynamics but require significant computational resources to solve numerically. In this work, we develop data-driven reduced order models (ROMs) using machine learning techniques to emulate a 2D flexible mesh hydrodynamic model of Øresund, the strait between Denmark and Sweden, achieving orders-of-magnitude speedup while retaining good accuracy. This Øresund model has complex spatio-temporal dynamics driven by time-varying boundary conditions. Two different approaches to generating ROMs offline are developed and compared. Our objective is to assess the advantage of generating such models offline to enable real-time analysis in the online setting.

The first approach extracts patterns in space and time using principal component analysis and learns mappings from previous states and boundary conditions to future states using gradient boosting. The second approach employs Dynamic Mode Decomposition with control (DMDc) to account for boundary forcing. The reduced models are trained offline on part of the available 12 months of 30-minute-resolution snapshots of surface elevation and the u- and v-components of the depth-averaged currents. In both cases a very low number, O(100), of latent-space dimensions suffices for accurate results, on the order of 2–4 cm RMSE compared to the full high-fidelity model.
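
The DMDc step has a compact linear-algebra core. The sketch below is our illustration, not the authors' implementation, and omits the SVD rank truncation a full DMDc would apply; it fits x_{k+1} ≈ A x_k + B u_k from snapshot and boundary-forcing matrices:

```python
import numpy as np

def fit_dmdc(X, U):
    """Fit a linear emulator x_{k+1} = A x_k + B u_k by least squares.

    X: (n_state, m + 1) state snapshots; U: (n_control, m) boundary
    forcing aligned with the transitions X[:, k] -> X[:, k + 1]."""
    X_now, X_next = X[:, :-1], X[:, 1:]
    Omega = np.vstack([X_now, U])              # stacked state and control
    G = X_next @ np.linalg.pinv(Omega)         # [A B] = X' * Omega^+
    n = X.shape[0]
    return G[:, :n], G[:, n:]

def rollout(A, B, x0, U):
    """Advance the fitted reduced model through a forcing sequence."""
    xs = [x0]
    for k in range(U.shape[1]):
        xs.append(A @ xs[-1] + B @ U[:, k])
    return np.column_stack(xs)
```

In the Øresund setting, X would hold the O(100) latent coefficients of surface elevation and depth-averaged currents, and U the time-varying boundary conditions.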

The emulators provide state estimates online in seconds rather than hours, enabling new applications like uncertainty quantification, data assimilation and parameter optimization that require fast model evaluations. Further developments could look to condition the ROMs on a wider range of potential boundary forcings for scenario exploration. This demonstrates machine learning's potential for accelerating coastal simulations for real-time decision support and planning systems facing long-term change and uncertainty.

How to cite: Mariegaard, J. S., Larsen, E. S., and Engsig-Karup, A. P.: Fast data-driven reduced order models for emulating physics-based flexible mesh coastal-ocean models , EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19104, https://doi.org/10.5194/egusphere-egu24-19104, 2024.

12:13–12:23
|
EGU24-4488
|
On-site presentation
Jae-Hun Park, Jeong-Yeob Chae, and Young Taeg Kim

Prediction of sea surface currents is essential for various marine activities, such as tourism, commercial transportation, fishing, search-and-rescue operations, and so on. Numerical forecast models can predict a realistic ocean with the help of data assimilation and fine spatial resolution. Nevertheless, complicated numerical prediction models require heavy computational power and time, which has motivated novel approaches with efficient computational costs. In that sense, artificial neural networks could be one of the solutions, because pre-trained networks need little computational power at prediction time. Here, we present a prediction framework applicable to surface-current prediction in the seas around the Korean Peninsula using three-dimensional (3-D) convolutional neural networks. The network is based on the 3-D U-Net structure and modified to predict sea surface currents from oceanic and atmospheric variables. In the forecast procedure, it is optimized to minimize the error of the next day's sea-surface-current field, and its recursive prediction structure allows additional days to be predicted. The network's performance is evaluated for different numbers of input days and variables to find the optimal surface-current-prediction neural network model, which demonstrates its strong potential for practical use in the near future.
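
The recursive prediction structure described above can be sketched generically (our illustration with hypothetical names; the actual model is a 3-D U-Net consuming stacked oceanic and atmospheric fields):

```python
from collections import deque

def recursive_forecast(step_model, history, n_days):
    """Roll a one-day-ahead predictor forward n_days.

    history: the most recent input-day fields (oldest first), matching
    the network's input window; step_model maps that window to the next
    day's surface-current field, which is fed back in recursively."""
    window = deque(history, maxlen=len(history))
    forecasts = []
    for _ in range(n_days):
        next_day = step_model(list(window))
        window.append(next_day)                # slide the input window
        forecasts.append(next_day)
    return forecasts
```

Because each forecast becomes an input to the next step, errors compound with lead time, which is why the choice of input days and variables matters.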

How to cite: Park, J.-H., Chae, J.-Y., and Kim, Y. T.: Surface current prediction in the seas around the Korean peninsula using a CNN-based deep-learning model , EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4488, https://doi.org/10.5194/egusphere-egu24-4488, 2024.

12:23–12:30

Posters on site: Thu, 18 Apr, 16:15–18:00 | Hall X5

Display time: Thu, 18 Apr 14:00–Thu, 18 Apr 18:00
Chairpersons: Aida Alvera-Azcárate, Rachel Furner, Redouane Lguensat
X5.236
|
EGU24-2297
|
ECS
Rin Irie, Helen Stewart, Tsuneko Kura, Masaki Hisada, and Takaharu Yaguchi

Ocean vertical mixing plays a fundamental role in phenomena such as upwelling of nutrient-rich deep waters, and is crucial for determining net primary productivity in the ocean [1]. Simulating vertical mixing requires careful consideration and ingenuity for stable execution, as vertical mixing is often turbulent. Direct Numerical Simulations, in which the Navier-Stokes equations are solved without a turbulence model, are not realistic due to the enormous computational complexity. Ocean General Circulation Models (OGCMs) have low resolution and cannot directly resolve small-scale turbulence such as vertical mixing. Consequently, OGCMs based on the Reynolds Averaged Navier-Stokes equations use turbulence parameterizations to model the effect of unresolved motions on the mean flow [2]. Although K-Profile Parameterization (KPP) is widely recognized as a method for parameterizing vertical mixing [3], recent advancements in machine learning have triggered active exploration of data-driven approaches to parameterization [4, 5]. This study aims to develop a novel vertical mixing parameterization method using deep learning. High-resolution simulation results (O(10^3) m) are used as training data for a neural network to estimate vertical diffusivity and viscosity. These estimates are then used to parameterize fine-scale dynamics in a low-resolution simulation (O(10^4) m).

The input parameters of the neural network are the state variables R_L = (v_L, θ_L, S_L)^T, where v_L is the flow velocity field, θ_L is the potential temperature, and S_L is the salinity. Here, the subscripts L and H indicate the low- and high-resolution simulations, respectively. The output parameters are P = (κ_h, A_h)^T, where κ_h and A_h are the estimated vertical diffusivity and viscosity, respectively. The loss function is defined as the mean squared error between the state variables of the high- and low-resolution simulations, R_H and R_L. Verification experiments for the proposed parameterization method are conducted for an idealized double-gyre configuration, which models western boundary currents such as the Gulf Stream in the North Atlantic Ocean. We confirm the performance and efficiency of the proposed method compared to the traditional KPP for conducting high-resolution-quality simulations at low computational cost.
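
For intuition, the sketch below shows how an estimated diffusivity profile κ_h would enter the low-resolution model as a vertical diffusion operator on a single column. This is our simplified illustration (explicit time stepping, no-flux boundaries, hypothetical names), not the authors' code:

```python
import numpy as np

def diffuse_column(theta, kappa, dz, dt):
    """One explicit step of d(theta)/dt = d/dz(kappa * d(theta)/dz)
    on a single column with no-flux boundaries.

    theta: (n,) tracer profile; kappa: (n - 1,) diffusivities at the
    interior cell interfaces (the values a trained network would
    supply). Stable for dt * max(kappa) / dz**2 <= 0.5."""
    flux = kappa * (theta[1:] - theta[:-1]) / dz    # interface fluxes
    tendency = np.zeros_like(theta)
    tendency[:-1] += flux / dz                      # flux divergence
    tendency[1:] -= flux / dz
    return theta + dt * tendency
```

Because the operator is written in flux form, the column integral of the tracer is conserved, a property any learned diffusivity scheme must preserve.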

Acknowledgements
This work used computational resources of supercomputer Fugaku provided by the RIKEN Center for Computational Science through the HPCI System Research Project (Project ID: hp230382).

References
[1] D. Couespel et al. (2021), Oceanic primary production decline halved in eddy-resolving simulations of global warming, Biogeosciences, 18(14), 4321–4349.
[2] M. Solano and Y. Fan (2022), A new K-profile parameterization for the ocean surface boundary layer under realistic forcing conditions, Ocean Modelling, 171, 101958.
[3] W. G. Large et al. (1994), Oceanic vertical mixing: A review and a model with a nonlocal boundary layer parameterization, Reviews of Geophysics, 32(4), 363–403.
[4] Y. Han et al. (2020), A moist physics parameterization based on deep learning, Journal of Advances in Modeling Earth Systems, 12(9), e2020MS002076.
[5] Y. Zhu et al. (2022), Physics-informed deep-learning parameterization of ocean vertical mixing improves climate simulations, National Science Review, 9(8), nwac044.

How to cite: Irie, R., Stewart, H., Kura, T., Hisada, M., and Yaguchi, T.: Parameterizing ocean vertical mixing using deep learning trained from high-resolution simulations, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2297, https://doi.org/10.5194/egusphere-egu24-2297, 2024.

X5.237
|
EGU24-18857
|
ECS
Abhishek Pasula and Deepak Subramani

The Coupled Model Intercomparison Project, now in its sixth phase (CMIP6), is a global effort to project future climate under shared socioeconomic pathway (SSP) scenarios. For the period 1950-2014, CMIP6 provides historical model output; from 2015, future projections under four SSP scenarios, viz. SSP126, 245, 370 and 585, are available. For 2015-2023, we also have reanalyses of the actual ocean and atmosphere variables. Against these data, CMIP6 future projections of ocean variables show a root mean square error (RMSE) of 1.22 psu in sea surface salinity, 1.24 °C in sea surface temperature, 2.23 m/s in the zonal ocean velocity component, and 1.34 m/s in the meridional ocean velocity component. Similarly, the atmospheric variables show an RMSE of 1.34 °C in temperature at 2-meter height, and 2.12 m/s and 1.321 m/s in the zonal and meridional wind components. Our goal is to develop an accurate method to correct this bias and provide updated future projections for scientific analysis. To this end, we developed a two-phase deep neural network model that accepts monthly fields from the CMIP6 projections (all four SSP scenarios) and outputs a bias-corrected field. In the first phase, a deep neural model, which we call Atmospheric-Ocean Network 1 (AONet1), produces bias-corrected fields for each of the four SSPs separately. AONet1 is trained and validated using the historical CMIP6 data (1950-2014) as input and ORAS5 and ERA5 data as output (the bias-corrected field). In the second phase, the four bias-corrected SSP fields are fed to AONet2, which produces the final bias-corrected field. AONet2 is trained and validated using future projection data from 2015-2021 as input and ORAS5 and ERA5 over the same period as output. The two-phase model is tested on the years 2022 and 2023 before bias-corrected future fields are produced.
Results are compared to the statistical EDCDF method using image quality assessment metrics such as the structural dissimilarity index measure (DSSIM), multi-scale SSIM, and visual information fidelity. On test data, the RMSE after bias reduction using the two-phase AONet model is 40% lower, and the image assessment metric values surpass the EDCDF approach as well.
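
For reference, the SSIM family of metrics compares means, variances, and covariances of the two fields; a whole-image simplification (our sketch with hypothetical names; production SSIM uses a sliding window, and DSSIM = (1 - SSIM) / 2) is:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole field (a simplification of
    the usual locally windowed index)."""
    c1 = (0.01 * data_range) ** 2            # standard stabilizers
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()       # covariance
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Unlike RMSE, this index rewards agreement in spatial structure rather than pointwise values, which is why it complements RMSE when comparing bias-corrected fields.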

How to cite: Pasula, A. and Subramani, D.: A two-phase Neural Model for CMIP6 bias correction, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18857, https://doi.org/10.5194/egusphere-egu24-18857, 2024.

X5.238
|
EGU24-880
|
ECS
Intercomparison of high resolution ocean reanalysis products with observations, for exploring the spatiotemporal characteristics in the Indian Ocean
(withdrawn after no-show)
Avinash Paul, Maheswaran Padinjaratte Ayyapan, and Satheesan Karathazhiyath
X5.239
|
EGU24-2934
|
Duane R. Edgington, Danelle E. Cline, Thomas O'Reilly, Steven H.D. Haddock, John Phillip Ryan, Bryan Touryan-Schaefer, William J. Kirkwood, Paul R. McGill, and Rob S. McEwen

Uncrewed Aerial Vehicles (UAVs) can be a cost-effective solution for capturing a comprehensive view of surface ocean phenomena to study marine population dynamics and ecology. UAVs have several advantages, such as quick deployment from shore, low operational costs, and the ability to be equipped with various sensors, including visual imaging systems and thermal imaging sensors. However, analyzing high-resolution images captured from UAVs can be challenging and time-consuming, especially when identifying small objects or anomalies. Therefore, we developed a method to quickly identify a diverse range of targets in UAV images.

We will discuss our workflow for accelerating the analysis of high-resolution visual images captured from a Trinity F90+ Vertical Take-Off and Landing (VTOL) drone in near-shore habitats around the Monterey Bay region in California at approximately 60 meters altitude. Our approach uses a state-of-the-art self-distillation with no labels (DINO) transformer foundation model and multi-scale, sliced object detection (SAHI) methods to locate a wide range of objects, from small to large, such as schools of or individual jellyfish, flocks of birds, kelp forests or kelp fragments, small debris, occasional cetaceans, and pinnipeds. To make data analysis more efficient, we create clusters of similar objects based on visual similarity, which can be quickly examined through a web-based interface. This approach eliminates the need for previously labeled objects to train a model, optimizing limited human resources. Our work demonstrates how state-of-the-art techniques can assist in the rapid analysis of images and how they can be used to develop a machine-learning-based recognition system for the rapid detection and classification of objects in UAV images. All of our work is freely available as open-source code.
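
The sliced-detection idea can be sketched independently of any particular library (our illustration with hypothetical names; the SAHI package itself provides a richer version, including NMS-based merging of duplicates on tile overlaps):

```python
import numpy as np

def slice_windows(width, height, slice_size, overlap):
    """Top-left corners of overlapping tiles covering the full image."""
    step = int(slice_size * (1 - overlap))
    xs = list(range(0, max(width - slice_size, 0) + 1, step))
    ys = list(range(0, max(height - slice_size, 0) + 1, step))
    if xs[-1] + slice_size < width:
        xs.append(width - slice_size)          # cover the right edge
    if ys[-1] + slice_size < height:
        ys.append(height - slice_size)         # cover the bottom edge
    return [(x, y) for y in ys for x in xs]

def sliced_detect(image, detector, slice_size=512, overlap=0.2):
    """Run a detector (crop -> [(x0, y0, x1, y1, score), ...]) on each
    tile and shift its boxes back to full-image coordinates; duplicate
    detections on overlaps would then be merged by NMS."""
    h, w = image.shape[:2]
    boxes = []
    for x, y in slice_windows(w, h, slice_size, overlap):
        crop = image[y:y + slice_size, x:x + slice_size]
        for (x0, y0, x1, y1, s) in detector(crop):
            boxes.append((x0 + x, y0 + y, x1 + x, y1 + y, s))
    return boxes
```

Running the detector at tile scale is what lets small targets, such as individual birds or kelp fragments, survive the downscaling that a whole-image pass would impose.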

How to cite: Edgington, D. R., Cline, D. E., O'Reilly, T., Haddock, S. H. D., Ryan, J. P., Touryan-Schaefer, B., Kirkwood, W. J., McGill, P. R., and McEwen, R. S.: Accelerating Marine UAV Drone Image Analysis with Sliced Detection and Clustering (MBARI SDCAT), EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2934, https://doi.org/10.5194/egusphere-egu24-2934, 2024.

X5.240
|
EGU24-21267
|
ECS
Seabed substrate mapping based on machine learning using MBES data
(withdrawn)
Jeong Min Seo, Sanghun Son, Jaegu Bae, Doi Lee, So Ryeon Park, and Jinsoo Kim
X5.241
|
EGU24-18759
Steffen Aagaard Sørensen, Eirik Myrvoll-Nielsen, Iver Martinsen, Fred Godtliebsen, Stamatia Galata, Juho Junttila, and Tone Vassdal

The ICT+ project “Transforming ocean surveying by the power of DL and statistical methods”, hosted by UiT The Arctic University of Norway, aims to employ machine learning techniques to improve and streamline methods currently used in ocean surveying by the project’s private-sector partners, MultiConsult and Argeo. The tasks include the detection and identification of µm-sized (e.g. foraminifera, microplastics) to m-sized (e.g. boulders, shipwrecks) objects and elements at and in the seabed, in data that is presently processed manually by skilled workers but could ideally be wholly or partially processed using an automated approach.

Here we present preliminary work and results on the application of YOLO (You Only Look Once) algorithms to the detection and identification of meiofauna (foraminifera) in, and macrofauna (molluscs) at, the seabed. Both proxies are used to evaluate the environmental state of the seabed. YOLO is a real-time deep learning object detection algorithm that efficiently identifies and locates objects in images or videos in a single pass through the neural network.

Presently, the year-on-year growth or shrinkage of protected mollusc banks in northern Norwegian fjords is evaluated manually from seabed video sequences captured annually by remotely operated vehicles. The preliminary results suggest that, after moderate training, the YOLO algorithm can identify the presence or absence of mollusc bank formations in these video sequences, supporting and eventually minimizing the task of inspecting the footage manually.

Foraminifera are abundant marine meiofauna living in the water column or at and in the seabed. Foraminifera are utilized in research into both modern and past marine environments, as they have high turnover rates and their individual shells have high preservation potential. Foraminiferal shells accumulate in the sediments and, after sample processing, can subsequently be detected and identified manually under a microscope. This work is very labour-intensive and demands skilled expertise, but suffers from the errors and biases of the individual expert.

Preliminary results show that a YOLO network trained on ca. 4100 individuals (20 subgroups: benthic calcareous foraminifera (n=19) and planktic foraminifera (n=1)) in 346 images achieves model performance of up to 0.96 mAP (mean average precision) when trained, validated, and tested on the training set. These promising results will next be tested on real-world samples. This testing is complicated by real-world samples containing many foraminiferal species/groups that were not trained upon, overlapping or closely spaced specimens, and non-foraminiferal material (e.g. sediment grains, other meiofauna or -flora, etc.). Thus, additional training focused on these complicating aspects will likely be necessary, and the most recent results will be presented.

How to cite: Aagaard Sørensen, S., Myrvoll-Nielsen, E., Martinsen, I., Godtliebsen, F., Galata, S., Junttila, J., and Vassdal, T.: Detection and identification of environmental faunal proxies in digital images and video footage from northern Norwegian fjords and coastal waters using deep learning object detection algorithms, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18759, https://doi.org/10.5194/egusphere-egu24-18759, 2024.

X5.242
|
EGU24-5552
|
ECS
Jonathan Sauder, Guilhem Banc-Prandi, Gabriela Perna, Anders Meibom, and Devis Tuia

Coral reefs, which host more than a third of the ocean’s biodiversity on less than 0.1% of its surface, are existentially threatened by climate change and other human activities. This necessitates methods for evaluating the state of coral reefs that are efficient, scalable, and low-cost. Current digital reef monitoring tools typically rely on conventional Structure-from-Motion photogrammetry, which can limit scalability, and current datasets for training semantic segmentation systems are either sparsely labeled, domain-specific, or very small. We describe the first deep-learning-based 3D semantic mapping approach, which enables rapid mapping of coral reef transects by leveraging the synergy between self-supervised deep learning SLAM systems and neural-network-based semantic segmentation, even when using low-cost underwater cameras. The 3D mapping component learns to tackle the challenging lighting effects of underwater environments from a large dataset of reef videos. The transnational data-collection initiative was carried out in Djibouti, Sudan, Jordan, and Israel, with over 150 hours of video footage collected for training the neural network for 3D reconstruction. The semantic segmentation component is a neural network trained on a dataset of video frames with over 80,000 annotated polygons from 36 benthic classes, down to the resolution of prominent visually identifiable genera found in the shallow reefs of the Red Sea. This research paves the way for affordable and widespread deployment of the method in the analysis of video transects for conservation and ecology, highlighting a promising intersection with machine learning for tangible impact in understanding these ocean ecosystems.

How to cite: Sauder, J., Banc-Prandi, G., Perna, G., Meibom, A., and Tuia, D.: Scalable 3D Semantic Mapping of Coral Reefs with Deep Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-5552, https://doi.org/10.5194/egusphere-egu24-5552, 2024.

X5.243
|
EGU24-21554
|
ECS
Iver Martinsen, David Wade, Benjamin Ricaud, and Fred Godtliebsen

Microfossils are important in climate analysis and in the exploration of subsea energy resources. The abundance and distribution of species found in sediment cores provide valuable information, but the analysis is difficult and time-consuming, as it is based on manual work by human experts. It is also a challenge to obtain enough labelled data to train a standard deep learning classifier on microfossil images. We propose an efficient pipeline for processing and grouping fossils by species from microscope slides using self-supervised learning. First, we show how to efficiently extract crops from whole-slide images by adapting previously trained object detection algorithms. Second, we provide a comparison of a range of contrastive self-supervised learning methods to classify and identify microfossils from very few labels. We obtain excellent results with convolutional neural networks and vision transformers fine-tuned by self-supervision.
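
As one example of a contrastive objective of the kind compared here, a SimCLR-style NT-Xent loss can be written compactly. This is our NumPy sketch with hypothetical names, not the authors' code:

```python
import numpy as np

def _logsumexp_rows(a):
    """Row-wise log-sum-exp, safe for -inf entries."""
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()

def nt_xent_loss(z1, z2, temperature=0.5):
    """SimCLR-style contrastive loss on two augmented views.

    z1, z2: (n, d) embeddings of the same n microfossil crops under two
    random augmentations; each row's positive is its counterpart in the
    other view and the remaining 2n - 2 rows act as negatives."""
    n = len(z1)
    z = np.concatenate([z1, z2])
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                     # drop self-pairs
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim[np.arange(2 * n), pos] - _logsumexp_rows(sim)
    return -log_prob.mean()
```

Minimizing this loss pulls the two views of each crop together and pushes all other crops apart, which is what lets a classifier built on the embeddings work from very few labels.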

How to cite: Martinsen, I., Wade, D., Ricaud, B., and Godtliebsen, F.: A deep learning pipeline for automatic microfossil analysis and classification, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-21554, https://doi.org/10.5194/egusphere-egu24-21554, 2024.

X5.244
|
EGU24-3523
Tinkara Tinta and the PETRI-MED

The assessment and monitoring of microbial plankton biodiversity are essential to obtain a robust evaluation of the health status of marine environments. The PETRI-MED project addresses this imperative by developing novel strategies to monitor microbial plankton community composition and function based on satellite observations. PETRI-MED focuses on the Mediterranean Sea as a global biodiversity hotspot with profound ecological and cultural importance. The primary objectives of the PETRI-MED project encompass (i) the development of innovative satellite-based indicators to determine the biodiversity status and trends of the microbial plankton community, (ii) the identification of spatio-temporal patterns in microbial plankton distribution and diversity, and (iii) the elucidation of key controls on biodiversity patterns, including ecological connectivity and natural and human-related forcings, by focusing on key indicators of the ocean’s health and/or biogeochemical state. To do so, PETRI-MED will largely rely on satellite optical radiometric measurements (i.e., Ocean Colour, OC), exploiting the combined temporal and spatial characteristics of the latest European OC datasets (i.e., Copernicus Sentinel-3 and European Space Agency OC-CCI) together with state-of-the-art remote sensing observations and biogeochemical models (as provided by Copernicus Marine), marine current modelling, and genomic techniques. To achieve the ambitious goal of merging remote sensing, biogeochemical/physical modelling, and in situ omics measurements, PETRI-MED will rely on Artificial Intelligence (AI). The overarching goal of PETRI-MED is to empower policymakers and stakeholders with the knowledge necessary to adopt prioritization approaches for ecosystem management based on quantitative, real-time metrics.
This includes the design and implementation of protection strategies and policies to safeguard biodiversity, quantifying the impact of implemented actions at various levels, and enabling systematic, fact-supported management of Marine Protected Areas (MPAs), Key Biodiversity Areas, and Ecologically or Biologically Significant Marine Areas. Furthermore, PETRI-MED seeks to evaluate the viability of MPA management in response to climate change, ensuring adaptive strategies for the conservation of marine ecosystems in the face of environmental challenges. In summary, PETRI-MED represents a comprehensive and innovative approach to advancing our understanding of microbial plankton biodiversity in the Mediterranean Sea. Through the integration of satellite technology, omics techniques and AI, the project contributes valuable insights and tools for effective marine ecosystem management and conservation strategies.

How to cite: Tinta, T. and the PETRI-MED: PETRI-MED: Advancing Satellite-Based Monitoring for Microbial Plankton Biodiversity in the Mediterranean Sea, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-3523, https://doi.org/10.5194/egusphere-egu24-3523, 2024.

X5.245
|
EGU24-12271
Zachary Fasnacht, Joanna Joiner, Matthew Bandel, David Haffner, Alexander Vassilkov, Patricia Castellanos, and Nickolay Krotkov

Retrievals of ocean color (OC) properties from space are important for better understanding the ocean ecosystem and carbon cycle. The launch of hyperspectral atmospheric instruments such as the geostationary Tropospheric Emissions: Monitoring of Pollution (TEMPO) instrument and GEMS provides a unique opportunity to examine the diurnal variability of ocean ecology across various waters in North America and to prepare for the future suite of hyperspectral OC sensors. While TEMPO has neither the spatial resolution nor the full spectral coverage of planned coastal ocean sensors such as the Geosynchronous Littoral Imaging and Monitoring Radiometer (GLIMR) or the GeoXO OC instrument (OCX), it provides hourly coverage of US coastal regions, such as the Gulf of Mexico, and the Great Lakes, such as Lake Erie, at spatial scales of approximately 5 km. We will apply our newly developed machine learning (ML) based atmospheric correction approach for OC retrievals to TEMPO data. Our approach begins by decomposing measured hyperspectral radiances into spectral features that explain the variability in atmospheric scattering and absorption as well as the underlying surface reflectance. The coefficients of the principal components are then used to train a neural network to predict OC properties, such as chlorophyll concentration, derived from collocated MODIS/VIIRS physically based retrievals. This ML approach complements the standard radiative-transfer-based OC retrievals by providing gap-filling over cloudy regions where the standard algorithms are limited. Previously, we applied our approach at blue and UV wavelengths with the Ozone Monitoring Instrument (OMI) and the TROPOspheric Monitoring Instrument (TROPOMI), showing that it can estimate OC properties in less-than-ideal conditions, such as light to moderate cloud cover and sun glint, and thus improve the spatial coverage of ocean color measurements.
TEMPO provides an opportunity to improve on this approach, since it extends the spectral measurements to green and red wavelengths, which are particularly important for coastal waters. Additionally, our ML technique can be applied to provisional data early in the mission and has the potential to demonstrate the value of near-real-time OC products, which are important for monitoring harmful algal blooms and transient oceanic phenomena.
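
The decomposition-plus-regression pipeline can be sketched as follows. This is our illustration with hypothetical names, using an SVD-based PCA and a linear least-squares readout standing in for the neural network:

```python
import numpy as np

def pca_fit(radiances, n_components):
    """Decompose measured spectra (n_obs, n_bands) into principal
    components; returns the mean spectrum, the components, and the
    per-spectrum coefficients used as regression features."""
    mean = radiances.mean(axis=0)
    _, _, vt = np.linalg.svd(radiances - mean, full_matrices=False)
    comps = vt[:n_components]
    coeffs = (radiances - mean) @ comps.T
    return mean, comps, coeffs

def fit_readout(coeffs, chlorophyll):
    """Least-squares readout from PC coefficients to log-chlorophyll;
    stands in here for the neural network trained on MODIS/VIIRS
    physically based retrievals."""
    A = np.column_stack([coeffs, np.ones(len(coeffs))])
    w, *_ = np.linalg.lstsq(A, np.log(chlorophyll), rcond=None)
    return w
```

Working in the principal-component space is what allows the atmospheric and surface contributions to be separated before the OC property is predicted.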
How to cite: Fasnacht, Z., Joiner, J., Bandel, M., Haffner, D., Vassilkov, A., Castellanos, P., and Krotkov, N.: Harnessing Machine Learning and Principal Components Techniques for Atmospheric and Glint Correction to Retrieve Ocean Color from Geostationary Satellites, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12271, https://doi.org/10.5194/egusphere-egu24-12271, 2024.

X5.246
|
EGU24-18688
Clément Dorffer, Thi Thuy Nga Nguyen, Fréderic Jourdin, and Ronan Fablet

The 4DVarNet algorithm is an AI-based variational approach that performs spatio-temporal time-series interpolation. It has been used successfully on ocean colour satellite images to fill in missing data due to, e.g., satellite trajectories or cloud cover, and has shown impressive interpolation performance compared to classical approaches such as DInEOF.
We propose to show that 4DVarNet is a flexible model that learns global dynamics rather than local patterns, enabling it to interpolate different types of data, i.e., data from different spatio-temporal domains and/or representing different variables, using the same pre-trained model.

The core of our technique involves extrapolating the learned models to other, somewhat larger geographical areas, including the entire Mediterranean and other regions such as the North Sea. We achieve this by segmenting larger areas into smaller, manageable sections and choosing one section on which to train the model. The trained model is then applied to each segment and the prediction results are seamlessly integrated. This method ensures detailed and accurate coverage over extensive areas, significantly enhancing the predictive power of our models while keeping computational costs low.
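
The segment-and-integrate procedure amounts to tiling the domain, applying the trained model per tile, and averaging predictions where tiles overlap; a minimal sketch (ours, with hypothetical names):

```python
import numpy as np

def stitch_tiles(field, tile, stride, model):
    """Apply a tile-trained model over a larger domain by sliding a
    window and averaging predictions on overlaps."""
    H, W = field.shape
    out = np.zeros((H, W))
    weight = np.zeros((H, W))
    ys = list(range(0, H - tile + 1, stride))
    if ys[-1] != H - tile:
        ys.append(H - tile)                  # make sure edges are covered
    xs = list(range(0, W - tile + 1, stride))
    if xs[-1] != W - tile:
        xs.append(W - tile)
    for y in ys:
        for x in xs:
            out[y:y + tile, x:x + tile] += model(field[y:y + tile, x:x + tile])
            weight[y:y + tile, x:x + tile] += 1.0
    return out / weight                      # overlap-averaged mosaic
```

Averaging on overlaps suppresses seams between segments, which is what makes the stitched field appear seamless over, say, the whole Mediterranean.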

Our results demonstrate that this approach not only outperforms traditional methods in terms of accuracy but also provides a scalable solution, adaptable to various geographical contexts. By leveraging localized training and strategic extrapolation, we offer a robust framework for ocean monitoring, paving the way for advanced satellite image applications in diverse settings.

How to cite: Dorffer, C., Nguyen, T. T. N., Jourdin, F., and Fablet, R.: Spatial Generalization of 4DVarNet in ocean colour Remote Sensing, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18688, https://doi.org/10.5194/egusphere-egu24-18688, 2024.

X5.247
|
EGU24-15508
Jean-Marie Vient, Frédéric Jourdin, Ronan Fablet, and Christophe Delacourt

Short-term forecasting (several days ahead) of underwater visibility range is needed for marine and maritime operations involving divers or optical sensors, as well as for recreational activities such as scuba diving (e.g. Chang et al. 2013). Underwater visibility mainly depends on water turbidity, which is caused by small suspended particles of organic and mineral origin (Preisendorfer 1986). Modelling the fate of these particles can be complex, encouraging the development of machine learning methods based on satellite data and hydrodynamic simulations (e.g. Jourdin et al. 2021). In the field of visibility forecasting, deep learning methods are emerging (Prypeshniuk 2023). Here, continuing the interpolation work of Vient et al. (2022), we forecast subsurface mineral turbidity levels over the French continental shelf of the Bay of Biscay using the deep learning method 4DVarNet (Fablet et al. 2021) applied to ocean colour satellite data, with additional inputs such as bathymetry (ocean depths) and time series of the main statistical forcing parameters, namely significant wave heights and tidal coefficients. Using satellite data alone, results show that 2-day forecasts are accurate enough. When bathymetry and forcing parameters are added to the process, forecasts can extend up to 6 days ahead.

References

Chang, G., Jones, C., and Twardowski, M. (2013), Prediction of optical variability in dynamic nearshore environments, Methods in Oceanography, 7, 63-78, https://doi.org/10.1016/j.mio.2013.12.002

Fablet, R., Chapron, B., Drumetz, L., Mémin, E., Pannekoucke, O., and Rousseau, F. (2021), Learning variational data assimilation models and solvers, Journal of Advances in Modeling Earth Systems, 13, e2021MS002572, https://doi.org/10.1029/2021MS002572

Jourdin, F., Renosh, P.R., Charantonis, A.A., Guillou, N., Thiria, S., Badran, F. and Garlan, T. (2021), An Observing System Simulation Experiment (OSSE) in Deriving Suspended Sediment Concentrations in the Ocean From MTG/FCI Satellite Sensor, IEEE Transactions on Geoscience and Remote Sensing, 59(7), 5423-5433, https://doi.org/10.1109/TGRS.2020.3011742

Preisendorfer, R. W. (1986), Secchi disk science: Visual optics of natural waters, Limnology and Oceanography, 31(5), 909-926, https://doi.org/10.4319/lo.1986.31.5.0909

Prypeshniuk, V. (2023), Ocean surface visibility prediction, Master thesis, Ukrainian Catholic University, Faculty of Applied Sciences, Department of Computer Sciences, Lviv, Ukraine, 39 pp, https://er.ucu.edu.ua/handle/1/3948?locale-attribute=en

Vient, J.-M., Fablet, R., Jourdin, F. and Delacourt, C. (2022), End-to-End Neural Interpolation of Satellite-Derived Sea Surface Suspended Sediment Concentrations, Remote Sens., 14(16), 4024, https://doi.org/10.3390/rs14164024

How to cite: Vient, J.-M., Jourdin, F., Fablet, R., and Delacourt, C.: Data-driven short-term forecast of suspended inorganic matter as seen by ocean colour remote sensing., EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15508, https://doi.org/10.5194/egusphere-egu24-15508, 2024.

X5.248
|
EGU24-19157
|
ECS
|
Abhiraami Navaneethanathan, Bb Cael, Chunbo Luo, Peter Challenor, Adrian Martin, and Sabina Leonelli

The ocean biological carbon pump, a significant set of processes in the global carbon cycle, drives the sinking of particulate organic carbon (POC) towards the deep ocean. Global estimates of POC fluxes, and an improved understanding of how environmental factors influence the transport of organic carbon in the ocean, can help quantify how much carbon is sequestered in the ocean and how this may change under different environmental conditions, in addition to improving global carbon and marine ecosystem models. POC fluxes can be derived from observations taken by a variety of in situ instruments, such as sediment traps, 234-thorium tracers and Underwater Vision Profilers. However, the manual and time-consuming nature of data collection leads to sparse spatial coverage on a global scale, resulting in large estimate uncertainties in under-sampled regions.

This research takes an observation-driven approach, with machine learning and statistical models trained to estimate POC fluxes on a global scale using the in situ observations and well-sampled environmental driver datasets, such as temperature and nutrient concentrations. This approach holds two main benefits: 1) the ability to fill observational gaps on both spatial and temporal scales, and 2) the opportunity to interpret the importance of each environmental factor for estimating POC fluxes, thereby exposing their relationship to organic carbon transport processes. The models built include random forests, neural networks and Bayesian hierarchical models, whose global POC flux estimates, feature importances and performance are studied and compared. Additionally, this research explores data fusion methods that combine all three heterogeneous in situ POC flux data sources to achieve improved accuracy and better-informed inferences about organic carbon transport than is possible from a single data source. By treating the heterogeneous data sources differently, accounting for their biases, and introducing domain knowledge into the models, our data fusion method not only harnesses the information from all three data sources but also gives insights into their key differences.
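The interpretability benefit mentioned above (point 2) can be illustrated with a minimal random forest sketch on synthetic drivers; the variable names and the toy flux relationship are assumptions for illustration, not the study's data or models:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Synthetic stand-ins for well-sampled environmental drivers and POC flux targets.
n = 400
temperature = rng.uniform(-2, 30, n)
nitrate = rng.uniform(0, 40, n)
chlorophyll = rng.lognormal(0.0, 1.0, n)
X = np.column_stack([temperature, nitrate, chlorophyll])
# Toy relationship: flux driven mainly by chlorophyll, weakly by temperature.
y = 5.0 * chlorophyll + 0.5 * temperature + rng.normal(0.0, 1.0, n)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Feature importances expose which drivers the fitted model relies on.
for name, imp in zip(["temperature", "nitrate", "chlorophyll"],
                     model.feature_importances_):
    print(f"{name}: {imp:.2f}")
```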

How to cite: Navaneethanathan, A., Cael, B., Luo, C., Challenor, P., Martin, A., and Leonelli, S.: Estimating global POC fluxes using ML and data fusion on heterogeneous and sparse in situ observations, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19157, https://doi.org/10.5194/egusphere-egu24-19157, 2024.

X5.249
|
EGU24-18627
João Bettencourt

The water in the basin of sill fjords is renewed occasionally. In some fjords this renewal occurs irregularly, while in others it has a more regular character. Independently of the renewal period, the renewal mechanism is thought to be common to all sill fjords: subsurface water outside the fjord mouth lifted above the sill depth will trigger a renewal, provided that the lifted water mass is denser than the water in the basin. In Western Norway, the northerly, upwelling-favourable winds that occur during spring and summer force the uplifting of the isopycnals and bring dense subsurface water into the upper water column, thereby creating the conditions for renewals to occur. The renewal of sill fjord basins is an important aspect of a fjord's ecological condition because it supplies oxygen-rich water to the basin, where oxygen is consumed by the degradation of organic matter during stagnant periods. Byfjorden, the urban fjord of Bergen, Norway, is surrounded by heavy urbanization and has consistently shown low oxygen levels in its basin, with ecological implications.

Byfjorden’s basin water is regularly renewed between the months of March and August and a strong link to coastal and atmospheric variability is well known, which makes it an attractive choice for the application of Deep Learning to predict basin water renewal in sill fjords, in the context of the atmospheric and hydrographic setting of the Norwegian coast.

In this work, the prediction of deep water renewal in Byfjorden and of the basin's oxygen levels is investigated with deep learning techniques. After a statistical study of the correlation between oxygen variability and wind forcing along the Norwegian coast, we develop and test a model to predict renewals and fill gaps in Byfjorden's oxygenation record.

How to cite: Bettencourt, J.: Prediction of sill fjord basin water renewals and oxygen levels, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18627, https://doi.org/10.5194/egusphere-egu24-18627, 2024.

X5.250
|
EGU24-5926
|
ECS
Bin Lu, Ze Zhao, Luyu Han, Xiaoying Gan, Yuntao Zhou, Lei Zhou, Luoyi Fu, Xinbing Wang, Jing Zhang, and Chenghu Zhou

Oxygen is fundamentally essential for all life. Unfortunately, recent research has shown that global ocean deoxygenation has increased significantly over the past 50 years, and the stock of dissolved oxygen (DO) in the ocean has been continuously decreasing. This increasingly breathless ocean has led to large-scale fish die-offs, seriously affecting marine ecosystems. Moreover, global warming and human activities have further intensified the expansion of dead zones (low-oxygen areas) in the ocean.

Hence, it is vitally important to quantitatively understand and predict the trend of global ocean deoxygenation. However, despite the accumulation of in-situ DO observations in recent years, global, long-term observational coverage remains severely sparse, making the reconstruction of global ocean deoxygenation over a century a critical challenge. Existing work falls into two categories: (1) Physics-informed numerical models. These simulate DO concentration with climate models without using in-situ observations, e.g., the Coupled Model Intercomparison Project Phase 6 (CMIP6). However, they cannot correct biased simulation results against temporal DO observations, which causes error propagation. (2) Spatial interpolation methods. These reconstruct global deoxygenation from available observations via geostatistical regression, Kriging, etc., but cannot capture the complex spatiotemporal heterogeneity and physical-biogeochemical properties, showing inconsistent performance across areas.

To this end, we propose a knowledge-infused deep graph learning method called 4D Spatio-Temporal Graph HyperNetwork (4D-STGHN) to reconstruct four-dimensional (time, latitude, longitude, and depth) global ocean deoxygenation from 1920 to the present. To capture the spatio-temporal heterogeneity of different regions, 4D-STGHN uses a hypernetwork to generate non-shared parameters by fusing 4D geographic information and observations. Moreover, we design a chemistry-informed gradient-norm mechanism as the loss function by integrating nitrate and phosphate observations, thereby further improving DO reconstruction. 4D-STGHN achieves promising reconstructions with a mean absolute percentage error (MAPE) of only 5.39%, largely outperforming three CMIP6 experiments (CESM2-omip1, CESM2-omip2 and GFDL-ESM4-historical) on dissolved oxygen as well as other machine learning methods. Further analyses of the global oxygen minimum zones, together with regional analyses, are conducted to evaluate the effectiveness of the proposed method.
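For reference, the reported skill metric (MAPE) is straightforward to compute; the sketch below uses toy dissolved-oxygen values, not the study's data:

```python
import numpy as np

def mape(obs, pred):
    """Mean absolute percentage error, in percent."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 100.0 * np.mean(np.abs((pred - obs) / obs))

# Toy held-out DO observations (umol/kg) against two candidate reconstructions.
obs = np.array([210.0, 180.0, 95.0, 40.0])
good = np.array([205.0, 186.0, 90.0, 42.0])
poor = np.array([240.0, 150.0, 120.0, 60.0])

print(f"good reconstruction MAPE: {mape(obs, good):.1f}%")
print(f"poor reconstruction MAPE: {mape(obs, poor):.1f}%")
```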

How to cite: Lu, B., Zhao, Z., Han, L., Gan, X., Zhou, Y., Zhou, L., Fu, L., Wang, X., Zhang, J., and Zhou, C.: Reconstructing Global Ocean Deoxygenation Over a Century with Deep Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-5926, https://doi.org/10.5194/egusphere-egu24-5926, 2024.

X5.251
|
EGU24-6927
|
ECS
|
Highlight
Shanshan Mu, Haoyu Wang, and Xiaofeng Li

Tropical cyclones (TCs) are natural disasters originating over tropical or subtropical oceans. Their landfall is generally accompanied by extensive high winds and persistent precipitation, causing severe economic losses and human casualties yearly. Consequently, conducting effective TC landfall intensity forecasts for disaster risk reduction is imperative. The calm center of a TC, known as the TC eye, serves as a vital indicator of its intensity. Hence, precisely locating TC centers is crucial for determining TC intensity. In this study, a deep-learning model was developed to extract TC centers from satellite remote-sensing images automatically.
Space-borne synthetic aperture radar (SAR) imagery plays a critical role in monitoring natural hazards owing to its high spatial resolution, wide coverage, and day-night imaging capabilities. A total of 110 Sentinel SAR images spanning from 2016 to 2019 were used for TC center localization in this paper. They were acquired in interferometric-wide (IW) mode with a pixel size of 10 m and extra-wide (EW) mode with a pixel size of 40 m. They were resampled by spatial averaging to maintain the same pixel size of 80 m. Additionally, we manually annotated the central area of tropical cyclone images as ground truth data.
For the dataset, we initially divided the 110 SAR images and the corresponding ground truth into training, validation, and testing sets in an 8:1:1 ratio. Subsequently, we partitioned the SAR images into 256 × 256 pixel slices as model inputs, yielding 32151/4611/3900 samples for training/validation/testing. Since target samples containing the center position are far fewer than background samples in TCs, we retained all center-containing samples and randomly selected 1.2 times that number of background samples for each image. Consequently, we obtained a final dataset of 2388/257/245 samples for training, validation, and testing.
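The tiling and class-balancing procedure can be sketched as follows on a toy scene; the 256-pixel slice size and the 1.2x background ratio follow the abstract, while the mask and resulting counts are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def tile_labels(mask, size=256):
    """Label each non-overlapping size x size slice: 1 if it touches the target."""
    h, w = mask.shape
    return [int(mask[i:i + size, j:j + size].any())
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

# Toy annotation mask for one scene; 1 marks the labelled TC-center area.
mask = np.zeros((1024, 1024), dtype=int)
mask[500:540, 500:540] = 1

labels = tile_labels(mask)

# Keep every center-containing tile; subsample background tiles to 1.2x that count.
center_idx = [i for i, l in enumerate(labels) if l == 1]
bg_idx = [i for i, l in enumerate(labels) if l == 0]
n_bg = min(len(bg_idx), int(round(1.2 * len(center_idx))))
keep = center_idx + list(rng.choice(bg_idx, size=n_bg, replace=False))
print(len(center_idx), "center tiles,", n_bg, "background tiles kept")
```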
Deep learning excels at learning non-linear relationships and at automatically extracting intricate patterns from SAR imagery. Res-UNet, a U-Net-like model with a weighted attention mechanism and a skip-connection scheme that has proven effective against the contrast reduction caused by signal interference, was ultimately selected as the deep learning model for the automatic positioning of tropical cyclone centers in our study.
We calculated the centroid of the central region and compared the model results with the ground truth. Our model outputs agreed well with the visually located TC centers, with a mean intersection over union (IoU) of 0.71/0.70/0.67 and a mean TC center location error of 3.59/2.24/2.20 km for the training/validation/testing datasets. Moreover, our model greatly simplifies traditional center-positioning methods that rely on spiral rainbands and background wind fields. At the same time, our method obtains not only the position of the TC center but also the central area, and hence the morphological characteristics of TCs, which is conducive to better monitoring and intensity determination.
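The two evaluation metrics, IoU of the central area and centroid location error, can be sketched generically as follows (toy masks; the 80 m pixel size follows the abstract):

```python
import numpy as np

def iou(pred, truth):
    """Intersection over union of two binary masks."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(inter / union) if union else 0.0

def centroid_error_km(pred, truth, pixel_km=0.08):
    """Distance between mask centroids, scaled by the 80 m pixel size."""
    c_pred = np.array(np.nonzero(pred)).mean(axis=1)
    c_true = np.array(np.nonzero(truth)).mean(axis=1)
    return float(np.linalg.norm(c_pred - c_true) * pixel_km)

# Toy 256 x 256 masks: predicted central area vs manual annotation.
truth = np.zeros((256, 256), bool)
truth[100:140, 100:140] = True
pred = np.zeros((256, 256), bool)
pred[110:150, 105:145] = True

print(f"IoU: {iou(pred, truth):.2f}")
print(f"center location error: {centroid_error_km(pred, truth):.2f} km")
```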

How to cite: Mu, S., Wang, H., and Li, X.: A Deep Learning Model for Tropical Cyclone Center Localization Based on SAR Imagery, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6927, https://doi.org/10.5194/egusphere-egu24-6927, 2024.

X5.252
|
EGU24-1277
A Catboost-based Model for Intensity Detection of Tropical Cyclones over the Western North Pacific Based on Satellite Cloud Images
(withdrawn after no-show)
Wei Zhong, Hongrang He, Shilin Wang, Yuan Sun, and Yao Yao
X5.253
|
EGU24-8207
|
ECS
Jun Yu Puah, Ivan D. Haigh, David Lallemant, Ronan Fablet, Kyle Morgan, and Adam D. Switzer

Surface currents influence ship navigation, coastal heat transfer and sediment transport, and thus demand robust models that can reliably predict their behaviour. However, our ability to make predictions over long time scales is commonly hampered by a lack of long observational datasets. Remote sensing technologies, including satellite altimetry and high-frequency radar, are often used to measure global surface currents, but their ability to reveal regional-scale ocean dynamics remains limited by space-time sampling restrictions. Here, we explore the use of AIS data to derive surface currents in the Sunda Shelf region of equatorial Southeast Asia. First, we apply nearest-neighbour interpolation to map relevant AIS information (the ship's speed over ground, course over ground and heading) onto a grid with a spatial resolution of 100 m and an hourly temporal resolution. Next, we apply a gradient descent approach to derive surface currents at the ships' positions. We then implement a generative model in PyTorch to reconstruct surface currents in the region, evaluating model performance against observational data from drifters and drifting buoys. Lastly, we employ wavelet analysis, a form of nonstationary spectral analysis, to examine the dominant frequencies or periods at which surface currents are strong. Our pilot study highlights the potential of AIS data as a credible alternative to traditional methods of measuring surface currents in data-scarce areas.
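The first step, nearest-neighbour gridding of a scattered AIS field, might look like the sketch below; the 100 m grid spacing follows the abstract, while the ship positions and speeds are synthetic placeholders:

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(3)

# Toy AIS reports: irregular ship positions (m) with speed over ground (kn).
ship_xy = rng.uniform(0, 5000, size=(200, 2))
sog = 10 + 2 * np.sin(ship_xy[:, 0] / 1000) + rng.normal(0, 0.3, 200)

# Regular 100 m grid over the domain (one hourly map in the real pipeline).
gx, gy = np.meshgrid(np.arange(0, 5000, 100), np.arange(0, 5000, 100))

# Nearest-neighbour interpolation of the scattered AIS field onto the grid.
sog_grid = griddata(ship_xy, sog, (gx, gy), method="nearest")
print(sog_grid.shape)
```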

How to cite: Puah, J. Y., Haigh, I. D., Lallemant, D., Fablet, R., Morgan, K., and Switzer, A. D.: Unveiling the Ocean’s Rhythms: Blending Deep Learning and Spectral Analysis Together to Gain Insights into Sunda Shelf Surface Currents using AIS Data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8207, https://doi.org/10.5194/egusphere-egu24-8207, 2024.

X5.254
|
EGU24-15594
|
Highlight
Nils Lehmann, Jonathan Bamber, and Xiaoxiang Zhu

Rising sea levels are one of many consequences of anthropogenic climate change. Over the past few decades, several global observational records have become available that give a more detailed picture of the increasing impacts. Nevertheless, data challenges such as sparsity and low signal-to-noise ratios remain. Machine Learning (ML), and specifically Deep Learning (DL), approaches have proven valuable tools for such large-scale and complex data sources. To this end, the OceanBench benchmark suite was recently developed to provide a standardized pre-processing and evaluation framework for Sea Surface Height (SSH) interpolation tasks involving nadir and Surface Water and Ocean Topography (SWOT) altimetry tracks. From a methodological perspective, a recurring issue is the lack of uncertainty quantification for DL applications in Earth Observation. We therefore extend the suite of metrics provided by OceanBench with probabilistic evaluation metrics and test state-of-the-art uncertainty quantification models from the DL community. Specifically, we focus on Conditional Convolutional Neural Processes (ConvCNP) and inpainting diffusion models as methodologies to quantify uncertainty for the interpolation task, and demonstrate their viability and advantages over other ML methods on both accuracy and probabilistic metrics.

How to cite: Lehmann, N., Bamber, J., and Zhu, X.: Conditional Generative Models for OceanBench Sea Surface Height Interpolation, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15594, https://doi.org/10.5194/egusphere-egu24-15594, 2024.

X5.255
|
EGU24-17320
|
ECS
|
Issam El Kadiri, Simon Van Gennip, Marie Drevillon, Anass El Aouni, Daria Botvinko, and Ronan Fablet

Mesoscale eddies significantly influence ocean circulation, nutrient distribution, and climate patterns globally. A thorough reconstruction of the eddy field is therefore important, yet classical eddy detection algorithms based on sea level anomaly (SLA) suffer from the low coverage of the current altimetry network.

In this work, we evaluate the efficacy of deep learning techniques in enhancing the oceanic eddy field reconstruction of an operational ocean forecasting system. We use two ocean models from an Observing System Simulation Experiment (OSSE): a free-running high-resolution ocean circulation model representing the ‘truth’, and a second model constrained through data assimilation by synthetic observations mimicking the altimetry network, so as to approximate the state of the ‘truth’ model.

We train a neural network that takes sea surface temperature, sea surface height, and ocean surface currents from the data-assimilation model as inputs to recover the eddies identified in the ‘truth’ model, which are generated with py-eddy-tracker, a sea-surface-height-based eddy detection algorithm.

Our investigation centers on a semantic segmentation problem using the U-Net architecture to classify the pixels of a given map into non-eddy, cyclonic eddy, and anticyclonic eddy. Our study focuses on the Gulf Stream region, an area renowned for its dynamic oceanic conditions. We find a higher detection rate of eddies and reduced inter-class misclassification compared to eddy fields reconstructed from the data-assimilated model using the traditional SLA-based algorithm.
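Per-class detection rates and inter-class misclassification of such a three-class segmentation can be quantified from a confusion matrix; a generic sketch on toy pixel labels (not the study's data):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy per-pixel labels: 0 = non-eddy, 1 = cyclonic, 2 = anticyclonic.
truth = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2])
pred = np.array([0, 0, 1, 1, 1, 1, 0, 2, 2, 1])

cm = confusion_matrix(truth, pred, labels=[0, 1, 2])

# Per-class detection rate (recall): diagonal over row sums.
recall = cm.diagonal() / cm.sum(axis=1)
for name, r in zip(["non-eddy", "cyclonic", "anticyclonic"], recall):
    print(f"{name}: detection rate {r:.2f}")

# Off-diagonal mass quantifies inter-class misclassification.
print("misclassified pixels:", cm.sum() - cm.trace())
```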

Our data-driven method improves the detection of ‘true’ eddies from degraded data in an OSSE framework, and shows potential for application in operational analysis and forecasting systems.

How to cite: El Kadiri, I., Van Gennip, S., Drevillon, M., El Aouni, A., Botvinko, D., and Fablet, R.: Assessing data assimilation techniques with deep learning-based eddy detection, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17320, https://doi.org/10.5194/egusphere-egu24-17320, 2024.

X5.256
|
EGU24-11061
Tim Kruschke, Christopher Kadow, Johannes Meuer, and Claudia Hinrichs

The Federal Maritime and Hydrographic Agency of Germany performs weekly analyses of sea surface temperatures (SST) for the North Sea and Baltic Sea on an operational basis. The analysis is based on in-situ observations and satellite retrievals. Existing procedures require manual quality control and subjective decisions on plausibility of measurements combined with simple interpolation techniques. This study presents ongoing work to develop new procedures based on a machine learning approach, designed to fill in gaps in observational data fields. The employed inpainting technique makes use of a convolutional neural network (CNN) that is trained with complete SST-fields from high-resolution (~3 km) ocean model simulations and masks derived from satellite retrievals to ignore regions covered by clouds on respective days.

First validation efforts for the North Sea, based on reconstructing modelled fields that were excluded from the training data, indicate very promising results: an RMSE of ~0.5 K or less for most regions of the North Sea. However, areas with high variance, such as waters very close to the coast and the Norwegian Channel, exhibit larger errors of up to 1 K. Additionally, we can show that errors tend to be larger when fewer observational data are available, e.g. on days with extensive cloud cover.
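Because the network fills cloud-covered gaps, validation naturally restricts the error to those pixels; a minimal sketch with a synthetic SST field and cloud mask (all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy SST field (K), a cloud mask (True = observed), and a reconstruction.
sst_true = 285 + rng.normal(0, 1, (64, 64))
cloud_free = rng.random((64, 64)) > 0.4          # ~60% of pixels observed
sst_recon = sst_true + rng.normal(0, 0.5, (64, 64))

# Score only the cloud-covered pixels: these are the gaps the model fills.
gap = ~cloud_free
rmse_gap = np.sqrt(np.mean((sst_recon[gap] - sst_true[gap]) ** 2))
print(f"RMSE over gaps: {rmse_gap:.2f} K")
```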

We will test whether optional features of the algorithm can improve results in these cases. The possibility of using "memory" of preceding days, which potentially feature fewer clouds, seems especially promising in this respect. Furthermore, we will examine whether overwriting existing observational data with values that better fit the patterns learned by the CNN improves the overall results, and hence whether it can serve as an alternative to external (manual) quality control and plausibility checking.

The final aim of this study is to establish an approach suitable for the operational analysis of daily SSTs at a horizontal resolution of approximately 5 km, and to produce a reanalysis of North Sea and Baltic Sea SSTs starting in 1990.

How to cite: Kruschke, T., Kadow, C., Meuer, J., and Hinrichs, C.: Machine-learning-based analysis and reconstruction of high-resolution sea-surface temperatures for the North Sea and Baltic Sea, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11061, https://doi.org/10.5194/egusphere-egu24-11061, 2024.

X5.257
|
EGU24-17159
|
Highlight
Clemens Cremer, Henrik Anderson, and Jesper Mariegaard

Traditional physics-based numerical models have served, and continue to serve, as reliable tools for gaining insight into the spatiotemporal behaviour of ocean states such as water levels and currents. However, their significant computational demand often translates into slower forecasting capabilities. Additionally, these models can struggle to capture certain physical processes and to bridge different spatial and temporal scales effectively.

Considering these challenges, machine learning-based surrogate models emerge as a promising alternative. Surrogate models that learn multiple physics (on different spatial and temporal scales) from large datasets during extensive pretraining (multiple physics pretraining, MPP) can later be applied to poorly observed data domains, which are common in ocean sciences. Such transfer learning capabilities can help improve oceanographic forecasting, especially in data-limited regimes.

In this work, we explore the capabilities of pretrained transformer models for prediction on a test case for the North Sea. The results from two-dimensional simulations are used for training and fine-tuning. We utilize 2D datasets from publicly available PDEBench together with domain-specific datasets from DHI’s historical records of simulated 2D metocean data. We forecast water levels and currents with pretrained models and evaluate MPP forecast results against in-situ point observations and numerical model results.

Initial findings suggest that pretraining has potential for generalizing and transferring knowledge to novel regions, and is relevant to practical applications. Model interpretability remains a challenge and an area for further development.

How to cite: Cremer, C., Anderson, H., and Mariegaard, J.: Exploring Pretrained Transformers for Ocean State Forecasting, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17159, https://doi.org/10.5194/egusphere-egu24-17159, 2024.

X5.258
|
EGU24-347
A Probabilistic Forecast for Multi-year ENSO Using Bayesian Convolutional Neural Network
(withdrawn after no-show)
Sreeraj Puthiyadath, Arya Paul, Balaji Baduru, and Francis Pavanathara
X5.259
|
EGU24-18493
|
ECS
|
Highlight
Redouane Lguensat

Large Language Models (LLMs) have made significant strides in language understanding, including natural language processing, summarization, and translation, and they have the potential to be applied to a range of climate-related challenges. For instance, LLMs can be leveraged for data cleaning and transformation, and also assisting scientists/engineers in their daily work tasks.

For the machine learning community, 2023 was arguably the year of breakthroughs in production use of LLMs. In this work I present the exciting potential of recent advances in LLMs to revolutionize how the ocean science community interacts with computer code, information gathering, dataset finding, etc. Specifically, I present simple applications of how these advances in Natural Language Processing (NLP) can assist the NEMO ocean model community. Examples range from question-answering systems for efficiently browsing the NEMO documentation to conversational agents or chatbots that can assist not only new members wanting to learn about the NEMO model but also confirmed users.

An important aspect of this work is relying only on open-source LLMs, evaluating the performance of several models, and discussing the ethical implications of these tools. I also discuss whether using these LLMs blindly, without domain knowledge, is a good idea, as an important chunk of this work can arguably be done by anyone with good computer science skills thanks to the democratization of data science tools and learning materials.

 

How to cite: Lguensat, R.: Leveraging recent advances in Large Language Models for the ocean science community, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18493, https://doi.org/10.5194/egusphere-egu24-18493, 2024.

X5.260
|
EGU24-20454
|
ECS
|
Owen Allemang

The Arctic region is undergoing unprecedented transformations due to Arctic amplification, warming at twice the global average rate. This warming has led to a drastic reduction in sea ice, with predictions of ice-free Arctic summers before 2050. Such profound changes signal a shift to a new climatic regime, posing significant risks to regional communities, industries, and ecosystems.

This research addresses the urgent need to understand the evolving Arctic environment by harnessing machine learning (ML) to analyse sparse oceanic data. Utilising nearly two decades of Ice-Tethered Profiler (ITP) data, complemented by ship-based (U-DASH) and Argo profiles, this study investigates the structure and dynamics of the Arctic Ocean.

We fit a Gaussian Mixture Model (GMM) to our observations, assigning each data point to a cluster, or class. Despite no spatial information being provided to the model, coherent classes emerge. We analyse the properties of each class, compare them with standard water masses from the literature, and look at decadal trends in properties such as oxygen saturation. This approach promises to enhance our understanding of Arctic water masses and their evolving role in a changing environment.
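The clustering step can be sketched with scikit-learn's GaussianMixture; the two synthetic "water masses" below are assumptions for illustration, not the observational dataset:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)

# Toy temperature-salinity samples standing in for the profile data:
# two synthetic "water masses" with distinct T-S properties.
cold_fresh = rng.normal([-1.5, 30.0], [0.3, 0.5], size=(300, 2))
warm_salty = rng.normal([3.0, 34.8], [0.5, 0.2], size=(300, 2))
X = np.vstack([cold_fresh, warm_salty])

# Fit a GMM with no spatial information; each point gets a class label.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)

# Class means can then be compared against standard water masses.
for k, mean in enumerate(gmm.means_):
    print(f"class {k}: T = {mean[0]:.1f} degC, S = {mean[1]:.1f}")
```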

How to cite: Allemang, O.: Arctic Processes Under Ice: Structures in a Changing Climate, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20454, https://doi.org/10.5194/egusphere-egu24-20454, 2024.

X5.261
|
EGU24-22070
|
ECS
|
Highlight
Andrew McDonald, Jonathan Smith, Peter Yatsyshin, Tom Andersson, Ellen Bowler, Louisa van Zeeland, Bryn Ubald, James Byrne, María Pérez-Ortiz, Richard E. Turner, and J. Scott Hosking

Conventional studies of subseasonal-to-seasonal sea ice variability across scales have relied on computationally expensive physics-based models solving systems of differential equations. IceNet, a deep learning-based sea ice forecasting model under development since 2021, has proven competitive with such state-of-the-art physics-based models, capable of generating daily 25 km resolution forecasts of sea ice concentration across the Arctic and Antarctic at a fraction of the computational cost once trained. Yet these IceNet forecasts leave room for improvement in three main areas. First, the forecasts exhibit physically unrealistic spatial and temporal blurring, characteristic of deep learning methods trained under mean loss objectives. Second, the use of 25 km scale OSISAF data renders local forecasts along coastal regions and around maritime vessels inconclusive. Third, the sole provision of sea ice concentration leaves questions about other critical ice properties, such as thickness, unanswered. We present preliminary results addressing these three challenges: turning to deep generative models to capture forecast uncertainty and improve spatial sharpness; leveraging 3 and 6 km scale AMSR-2 sea ice products to improve spatial resolution; and incorporating auxiliary datasets, chiefly thickness, into the training and inference pipeline to produce multivariate forecasts of sea ice properties beyond sea ice concentration alone. We seek feedback for improvement and hope continued development of IceNet can help answer key scientific questions surrounding the state of sea ice in our changing polar climates.

How to cite: McDonald, A., Smith, J., Yatsyshin, P., Andersson, T., Bowler, E., van Zeeland, L., Ubald, B., Byrne, J., Pérez-Ortiz, M., Turner, R. E., and Hosking, J. S.: Pushing the Limits of Subseasonal-to-Seasonal Sea Ice Forecasting with Deep Generative Modelling, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-22070, https://doi.org/10.5194/egusphere-egu24-22070, 2024.

X5.262
|
EGU24-4126
|
ECS
Ariadna Canari, Léa Pousse-Beltran, Sophie Giffard-Roisin, Hector Perea, and Sara Martínez-Loriente

Seismic hazard assessment requires a detailed understanding of the evolution of fault systems, rupture processes, and linkage between segments. Identifying and characterizing Quaternary surface faulting features, such as fault scarps, provides valuable morphotectonic data on cumulative displacement over time, especially in regions with moderate to low seismic activity. Although cumulative vertical surface offsets of faults have traditionally been measured using topographic profiles, these profiles are unevenly distributed along the faults and may not capture all the morphological changes along them. Expanding the analysis to a much larger number of profiles is a viable option, but executing this task manually would be prohibitively time-consuming. Machine Learning (ML) has shown unprecedented capacity to evaluate large datasets in reduced time and to provide a wealth of valuable information with associated uncertainties. With this in mind, we propose an ML algorithm called ScarpLearn, based on Convolutional Neural Networks (CNNs), to compute the vertical cumulative displacement and its uncertainty for normal fault scarps. Although ScarpLearn was initially developed to characterize simple scarps in onshore areas, we have enhanced its capabilities so that it can also be used in offshore areas subject to oceanic processes, including more intense diffusion and the presence of seabed features such as pockmarks. Additionally, we have improved the code's versatility with a modification that allows better characterization of scarps in more complex areas where multiple faults offset the seafloor. To this end, we trained the algorithm on a large database of realistic synthetic bathymetric profiles covering different parameters, such as fault dip, slip velocity, scarp spread, and scarp diffusion coefficient, at variable resolutions to ensure adaptability to all datasets.
These modifications have improved the adaptability of the ScarpLearn algorithm, elevating its accuracy and reliability in capturing the complexity of marine fault systems while remaining applicable to terrestrial systems. We have applied the new ScarpLearn version to the North-South faults of the northern Alboran Sea, contributing to an accurate analysis of this Plio-Quaternary transtensional system and its complex geological structures. This innovative approach has allowed us to refine the morphotectonic analysis of the area and to better understand the geodynamics of the North-South fault system. In this research, we have explored the advances of the CNN method in oceanic environments, considering intensive data compilation, computational time, accuracy, uncertainties, and current limitations. Our advances demonstrate the potential of ScarpLearn, specifically tailored to analyse marine environments and multiple fault segments both onshore and offshore. Our results contribute to progress in marine geosciences by improving morphotectonic analysis using ML algorithms.

 

Keywords: Convolutional Neural Networks (CNN), Oceanic processes, Normal faults, Multiple scarps.

 

How to cite: Canari, A., Pousse-Beltran, L., Giffard-Roisin, S., Perea, H., and Martínez-Loriente, S.: Revealing Machine Learning's potential for morphotectonic analysis of marine faults: Application to the North-South faults in the Alboran Sea (Westernmost Mediterranean), EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4126, https://doi.org/10.5194/egusphere-egu24-4126, 2024.

Posters virtual: Thu, 18 Apr, 14:00–15:45 | vHall X4

Display time: Thu, 18 Apr 08:30–Thu, 18 Apr 18:00
Chairperson: Julien Brajard
vX4.12
|
EGU24-8790
|
ECS
Harvesting Global Oceanic Lead Data Using Machine Learning: Standardization, Alignment and Spatio-temporal Puzzle Assembly with Large Language Model
(withdrawn)
Yiming Liu, Shiyu Liang, and Xinbing Wang
vX4.13
|
EGU24-20799
|
ECS
Anna Denvil-Sommer, Corinne Le Quere, Rainer Kiko, Erik Buitenhuis, Marie-Fanny Racault, and Fabien Lombard

Biogeochemical ocean models are usually based on two size classes of particulate organic matter: a small class (1-100 𝜇m) and a large class (100-5000 𝜇m). Based on measurements of particulate organic carbon (POC) concentration from UVP5 profiles and on observations of environmental and ecosystem conditions, we estimated an optimal number of POC size classes that could be introduced into biogeochemical ocean models.

We identified four size classes based on the correlation between POC concentration and environmental and ecosystem variables. This provides information on the relationship between POC and ambient temperature, chlorophyll-a concentration, nitrate, phosphate and oxygen levels, as well as plankton functional types (PFTs).

Further, we applied Machine Learning methods to reconstruct the POC concentration of each size class and to identify the most important drivers of each class. We showed that the concentration of POC smaller than 0.3 mm depends mostly on environmental characteristics, while the concentration of POC larger than 0.3 mm depends strongly on PFTs.
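The driver-separation result can be illustrated along the following lines: one random forest per size class, with feature importances distinguishing an environment-driven from a PFT-driven toy target (all variables and relationships here are invented for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(6)

# Toy drivers: an environmental variable (temperature) and a plankton
# functional type (PFT) abundance, with toy POC targets per size class.
n = 500
temperature = rng.uniform(0, 25, n)
pft = rng.lognormal(0.0, 0.5, n)
X = np.column_stack([temperature, pft])
poc_small = 2.0 * temperature + rng.normal(0, 1, n)   # environment-driven
poc_large = 8.0 * pft + rng.normal(0, 1, n)           # PFT-driven

importances = {}
for name, y in [("small (<0.3 mm)", poc_small), ("large (>0.3 mm)", poc_large)]:
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    importances[name] = rf.feature_importances_
    print(f"{name}: temperature {importances[name][0]:.2f}, "
          f"PFT {importances[name][1]:.2f}")
```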

How to cite: Denvil-Sommer, A., Le Quere, C., Kiko, R., Buitenhuis, E., Racault, M.-F., and Lombard, F.: Size classification of particulate organic carbon concentration and its link to the ecosystem based on Machine Learning techniques., EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20799, https://doi.org/10.5194/egusphere-egu24-20799, 2024.