- 1NITK Surathkal, MACS, India
- 2Professor, IMT Atlantique, Lab-STICC, INRIA team ODYSSEY, AI Chair OceaniX (AI & Ocean), Technopole Brest-Iroise, France
- 3Associate Professor, IMT Atlantique, Brest, France
- 4Research Director, IRD (French Research Institute for Sustainable Development), Laboratoire d'Océanographie Physique et Spatiale (LOPS), Brest, France
- 5Institut de Recherche pour le Développement, Laboratoire d'Océanographie Physique et Spatiale, Institut Universitaire Européen de la Mer
- 6Postdoctoral researcher, Lab-STICC, IMT Atlantique, Brest, France
Phytoplankton play a key role in maintaining marine ecosystems and in regulating global carbon dioxide concentrations through photosynthesis, so it is crucial to assess and understand their temporal variations. However, unlike seasonal and interannual variations, fluctuations of phytoplankton biomass on multi-decadal and longer timescales remain uncertain, owing to the lack of long-term global observations and to uncertainties in the complex balance of processes that control phytoplankton fate. Because phytoplankton growth depends on the availability of nutrients in the sunlit upper ocean, which is closely linked to ocean stratification, one can assume that, to first order, changes in phytoplankton are related to changes in ocean and atmosphere dynamics.
Over the last few years, several conventional data-driven deterministic approaches have been trained on physical observations (used as predictors) to reconstruct satellite ocean color time series (i.e., Chlorophyll-a concentration, Chl, used as a proxy for phytoplankton biomass) and to investigate their multi-decadal variability. Deterministic methods such as the encoder-decoder U-Net, LSTM, and FourCastNet are robust, but because they produce a single outcome for a given input they cannot represent predictive uncertainty. They also struggle with extreme and highly complex real-world scenarios. This study proposes a novel application of score-based generative diffusion models to address these challenges and presents a comparative analysis against U-Net and FourCastNet. A probabilistic conditional diffusion model was pretrained on simulation data and subsequently fine-tuned on satellite observations. This generative model captures the inherent uncertainty by generating ensembles of possible Chl maps, whose variability can then be analyzed. The model can be sampled efficiently to produce realistic Chl ensembles, conditioned on the physical predictors and on the output of the baseline U-Net. The ensembles from the diffusion model show greater reliability and accuracy, particularly in extreme event classification.
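As an illustration of how an ensemble can support extreme event classification, the following minimal sketch computes, at each grid point, the fraction of ensemble members exceeding a threshold. It is not the study's method: the function name `exceedance_probability`, the threshold value, the 0.5 decision rule, and the log-normal toy data standing in for diffusion samples are all assumptions made for the example.

```python
import numpy as np


def exceedance_probability(ensemble, threshold):
    """Fraction of ensemble members above `threshold` at each grid point.

    ensemble: array of shape (n_members, n_lat, n_lon).
    Returns an array of shape (n_lat, n_lon) with values in [0, 1].
    """
    ensemble = np.asarray(ensemble, dtype=float)
    return (ensemble > threshold).mean(axis=0)


# Toy stand-in for diffusion samples: log-normal noise, NOT real Chl data.
rng = np.random.default_rng(0)
toy_ensemble = rng.lognormal(mean=-1.0, sigma=0.5, size=(50, 4, 6))

# Hypothetical threshold defining an "extreme" value in this toy example.
prob_extreme = exceedance_probability(toy_ensemble, threshold=0.8)

# Hypothetical decision rule: flag a grid point when at least half
# of the members exceed the threshold.
is_extreme = prob_extreme >= 0.5
```

A deterministic model yields only a yes/no answer at each grid point, whereas an ensemble yields a probability that can be thresholded or evaluated for reliability.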
Our results demonstrate that, when conditioned on the U-Net output together with eight physical predictors, the diffusion model outperforms the baseline method, especially as the number of samples increases. Spatial maps of the ensemble standard deviation show that, as the sample size increases, the model's predictions stabilize and become more concentrated around the mean, which reduces the spread of outcomes.
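The ensemble statistics discussed above can be sketched as follows, assuming the samples are stored as an array of shape (members, latitude, longitude). The sampler here is a hypothetical Gaussian toy (`toy_sampler`), not the diffusion model, and the grid size and noise level are arbitrary; the sketch only shows how mean and standard deviation maps might be computed for different ensemble sizes.

```python
import numpy as np


def ensemble_statistics(ensemble):
    """Per-grid-point mean and sample standard deviation of an ensemble.

    ensemble: array of shape (n_members, n_lat, n_lon), n_members >= 2.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    mean_map = ensemble.mean(axis=0)
    std_map = ensemble.std(axis=0, ddof=1)  # unbiased sample estimate
    return mean_map, std_map


rng = np.random.default_rng(42)
true_mean = np.full((8, 8), 0.3)  # arbitrary toy "true" field


def toy_sampler(n_members):
    """Hypothetical stand-in for drawing samples from a generative model."""
    noise = 0.1 * rng.standard_normal((n_members, 8, 8))
    return true_mean + noise


mean_abs_error = {}
for n_members in (5, 50, 500):
    mean_map, std_map = ensemble_statistics(toy_sampler(n_members))
    # Distance between the ensemble mean and the toy true field.
    mean_abs_error[n_members] = float(np.abs(mean_map - true_mean).mean())
    print(n_members, mean_abs_error[n_members], float(std_map.mean()))
```

In this toy setting the ensemble mean moves closer to the underlying field as members are added; how the spread of the actual diffusion ensembles behaves is what the study's standard deviation maps report.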
How to cite: Lakra, M., Fablet, R., Drumetz, L., Martinez, E., Pauthenet, E., and Nga Nguyen, T. T.: Probabilistic Diffusion Models for Ocean Chlorophyll-a Prediction, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-18806, https://doi.org/10.5194/egusphere-egu25-18806, 2025.