- ¹University of Trieste, Italy
- ²National Institute of Oceanography and Applied Geophysics - OGS
Chlorophyll concentration has important implications for marine ecosystems (e.g., as an indicator of eutrophication and a proxy for phytoplankton abundance). Chlorophyll can be observed indirectly (by satellite) or directly (in situ), and it can be estimated through deterministic models. However, each of these estimates has limits: deterministic models cover the whole 3D domain but can be inaccurate, while observations are highly accurate but too sparse. Integrating them through a model-data fusion approach represents a new frontier for biogeochemical modeling.
We present a deep-learning approach for modeling the 3D distribution of biogeochemical variables in the Mediterranean Sea. Specifically, this work focuses on generating new 3D maps of chlorophyll-a by (a) modeling its relationship with physical variables, whose 3D distribution is provided by the CMEMS physical numerical model, and (b) merging in situ observations (i.e., BGC-Argo). The resulting 3D map offers a more accurate prediction because it includes Argo-float measurements, which are more accurate than numerical model outputs.
This provides a tool that, given a 3D distribution of physical variables and sparse measurements of a biogeochemical variable, yields a 3D reconstruction of that biogeochemical variable. The novelty of the method lies in its ability to improve the accuracy of biogeochemical predictions by incorporating 1D Argo-float profiles into a 3D context, thus extending localized measurements over larger spatial domains.
The neural network models the relationship between physical variables and chlorophyll using numerical data (from the BFM model) as a baseline. Since numerical models introduce approximation errors, a second training step corrects these inaccuracies by incorporating Argo-float data.
We adopt a convolutional neural network (CNN), a deep learning architecture specifically designed to capture spatial correlation patterns. CNNs, commonly used for image reconstruction tasks, treat the 3D field (with a horizontal resolution of 1/8° × 1/8° and 30 vertical levels) as an image, replacing the canonical RGB values with physical and biogeochemical variables.
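The "field as image" idea can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the choice of temperature and salinity as input variables, the small horizontal patch size, the use of a true 3D kernel, and the helper `conv3d_single` are all assumptions made for illustration. Only the 30 vertical levels come from the text.

```python
import numpy as np

# Hypothetical patch: 30 vertical levels (from the abstract) on a small
# illustrative latitude/longitude grid.
n_depth, n_lat, n_lon = 30, 16, 24
rng = np.random.default_rng(0)

# Placeholder physical fields; the variable names are assumptions,
# the abstract does not list which physical variables are used.
temperature = rng.normal(size=(n_depth, n_lat, n_lon))
salinity = rng.normal(size=(n_depth, n_lat, n_lon))

# Stack the variables along a channel axis, the way an RGB image
# stacks its three colour planes: (channels, depth, lat, lon).
x = np.stack([temperature, salinity], axis=0)


def conv3d_single(field, kernel):
    """'Valid' 3D cross-correlation of one channel with one kernel."""
    windows = np.lib.stride_tricks.sliding_window_view(field, kernel.shape)
    return np.einsum("dhwijk,ijk->dhw", windows, kernel)


# One 3x3x3 kernel per input channel; summing over channels gives a
# single output feature map, as in one filter of a convolutional layer.
kernels = rng.normal(size=(x.shape[0], 3, 3, 3))
feature_map = sum(conv3d_single(x[c], kernels[c]) for c in range(x.shape[0]))

print(x.shape, feature_map.shape)
```

Each output value depends only on a local 3×3×3 neighbourhood of every input variable, which is what lets a convolutional layer pick up spatial correlation patterns.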
To incorporate data from different sources, the training is divided into two steps: first, the network learns to reproduce the chlorophyll-a distribution from BFM model data; second, it incorporates Argo-float chlorophyll measurements. In this way, Argo-float data are integrated into an already trained framework, which absorbs their information and extends it beyond the measurement locations.
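The two-step idea can be sketched on a toy problem. This is a deliberately simplified stand-in, not the authors' method: a linear model replaces the CNN, all data are synthetic, and the names `model_output`, `obs_mask`, and `train` are hypothetical. Step 1 fits the dense but biased "numerical model" target everywhere; step 2 continues training from those weights using a loss restricted to the few "observed" points.

```python
import numpy as np

rng = np.random.default_rng(0)

n_points, n_features = 500, 3
X = rng.normal(size=(n_points, n_features))  # stand-in physical predictors
true_w = np.array([1.0, -0.5, 0.3])
truth = X @ true_w                           # stand-in "real" chlorophyll
model_output = truth + 0.4 * X[:, 0]         # dense target with a systematic error

# Sparse observations: only 25 of the 500 points are "measured".
obs_mask = np.zeros(n_points, dtype=bool)
obs_mask[rng.choice(n_points, size=25, replace=False)] = True


def train(w, target, mask, lr=0.1, steps=500):
    """Gradient descent on the mean squared error over masked points only."""
    Xm, tm = X[mask], target[mask]
    for _ in range(steps):
        grad = 2.0 * Xm.T @ (Xm @ w - tm) / len(tm)
        w = w - lr * grad
    return w


def rmse(w):
    """Error against the synthetic truth over the whole domain."""
    return float(np.sqrt(np.mean((X @ w - truth) ** 2)))


# Step 1: learn from the dense, biased model data.
w_step1 = train(np.zeros(n_features), model_output, np.ones(n_points, dtype=bool))
# Step 2: fine-tune the same weights on the sparse observations.
w_step2 = train(w_step1, truth, obs_mask)

print(rmse(w_step1), rmse(w_step2))
```

Because the weights are shared across all points, the correction learned at the observed locations also changes the prediction everywhere else. In this toy setting that is the mechanism by which sparse data can influence the full domain.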
Trained on weekly data from 2019-2021 and tested on 2022, the CNN reproduces chlorophyll maps that mimic the BFM data, and these maps are improved in the second step through the use of Argo-float data. The results show the effectiveness of the proposed two-step method: using BGC-Argo data not only brings the reconstruction closer to the observations themselves but also allows the corrections to spread through the 3D domain.
To summarize, this approach uses CNNs to solve a re-mapping problem that involves different data sources. The two-step training procedure provides a new, simple, and intuitive method to efficiently merge sparse and incomplete data into a seamless 3D domain.
How to cite: Tonelli, T., Pietropolli, G., Cossarini, G., and Manzoni, L.: Convolutional neural networks for chlorophyll prediction in the Mediterranean Sea, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6062, https://doi.org/10.5194/egusphere-egu25-6062, 2025.