EGU24-19157, updated on 11 Mar 2024
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Estimating global POC fluxes using ML and data fusion on heterogeneous and sparse in situ observations

Abhiraami Navaneethanathan1, Bb Cael2, Chunbo Luo1, Peter Challenor1, Adrian Martin2, and Sabina Leonelli1
  • 1University of Exeter, Exeter, United Kingdom
  • 2National Oceanography Centre, Southampton, United Kingdom

The ocean biological carbon pump, a significant set of processes in the global carbon cycle, drives the sinking of particulate organic carbon (POC) towards the deep ocean. Global estimates of POC fluxes, together with an improved understanding of how environmental factors influence the transport of organic carbon in the ocean, can help quantify how much carbon is sequestered in the ocean and how this may change under different environmental conditions, and can also improve global carbon and marine ecosystem models. POC fluxes can be derived from observations taken by a variety of in situ instruments, such as sediment traps, thorium-234 tracers and Underwater Vision Profilers. However, because data collection is manual and time-consuming, the observations are spatially sparse on a global scale, which results in large uncertainties in the estimates for under-sampled regions.

This research takes an observation-driven approach: machine learning and statistical models are trained to estimate POC fluxes on a global scale using the in situ observations together with well-sampled environmental driver datasets, such as temperature and nutrient concentrations. This approach has two main benefits: 1) the ability to fill observational gaps in both space and time, and 2) the opportunity to interpret the importance of each environmental factor for estimating POC fluxes, thereby exposing its relationship to organic carbon transport processes. The models built include random forests, neural networks and Bayesian hierarchical models, whose global POC flux estimates, feature importances and performance are studied and compared. Additionally, this research explores the use of data fusion methods to combine all three heterogeneous in situ POC flux data sources to achieve improved accuracy and better-informed inferences about organic carbon transport than is possible using a single data source. By treating the heterogeneous data sources differently, accounting for their biases, and introducing domain knowledge into the models, our data fusion method not only harnesses the information from all three data sources, but also gives insight into their key differences.
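As a purely illustrative sketch of the idea of fusing heterogeneous sources while accounting for their biases, the Python snippet below fits a single regression of log POC flux on environmental drivers with one additive offset per instrument type. It is not the method used in this work: the linear form, the choice of sediment traps as the reference source, the function names, and all data (which are synthetic) are assumptions made only for the example.

```python
import numpy as np

# Hypothetical instrument labels; index 0 is treated as the reference source.
SOURCES = ["sediment_trap", "thorium_234", "uvp"]


def design_matrix(drivers, source_ids):
    """Columns: intercept, environmental drivers, one offset per non-reference source."""
    n = drivers.shape[0]
    offsets = np.zeros((n, len(SOURCES) - 1))
    for k in range(1, len(SOURCES)):
        offsets[:, k - 1] = source_ids == k
    return np.hstack([np.ones((n, 1)), drivers, offsets])


def fit_fused_model(drivers, source_ids, log_flux):
    """Ordinary least squares over all sources at once (an assumed, simplified model)."""
    X = design_matrix(drivers, source_ids)
    coef, *_ = np.linalg.lstsq(X, log_flux, rcond=None)
    return coef


def predict_log_flux(coef, drivers):
    """Predict on the reference-source scale, i.e. with all source offsets set to zero."""
    n_drivers = drivers.shape[1]
    return coef[0] + drivers @ coef[1:1 + n_drivers]


# Synthetic data only: two standardized drivers standing in for temperature
# and nutrient concentration, and made-up source biases.
rng = np.random.default_rng(0)
n = 600
drivers = rng.normal(size=(n, 2))
source_ids = rng.integers(0, len(SOURCES), size=n)
true_coef = np.array([1.0, -0.5, 0.8])   # intercept, driver 1, driver 2
true_bias = np.array([0.0, 0.3, -0.4])   # offset of each source vs. the reference
log_flux = (
    true_coef[0]
    + drivers @ true_coef[1:]
    + true_bias[source_ids]
    + rng.normal(scale=0.1, size=n)
)

coef = fit_fused_model(drivers, source_ids, log_flux)
est_bias = coef[3:]  # estimated offsets for the non-reference sources

# Gap filling: estimate log flux at driver values with no in situ observation.
grid = np.array([[0.0, 0.0], [1.0, -1.0]])
filled = predict_log_flux(coef, grid)
print("driver coefficients:", coef[1:3])
print("estimated source offsets:", est_bias)
print("gap-filled log fluxes:", filled)
```

In this toy setting the fitted offsets indicate how each source differs systematically from the reference, while the driver coefficients are informed by all sources jointly. A Bayesian hierarchical model or a random forest would replace the least-squares step in a more realistic treatment.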

How to cite: Navaneethanathan, A., Cael, B., Luo, C., Challenor, P., Martin, A., and Leonelli, S.: Estimating global POC fluxes using ML and data fusion on heterogeneous and sparse in situ observations, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19157, 2024.

Supplementary materials

Supplementary materials version 1, uploaded on 15 Apr 2024.