EGU25-20406, updated on 15 Mar 2025
https://doi.org/10.5194/egusphere-egu25-20406
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Wednesday, 30 Apr, 16:15–18:00 (CEST), Display time Wednesday, 30 Apr, 14:00–18:00
Hall X5, X5.90
An uncertainty quantification framework for data-driven carbon flux upscaling
Qi Yang, Sophia Walther, Jacob Nelson, Gregory Duveiller, Zayd Hamdi, and Martin Jung
  • Max Planck Institute for Biogeochemistry, Department of Biogeochemical Integration (qiyang@bgc-jena.mpg.de)

Data-driven upscaling of biogenic fluxes from eddy covariance (EC) sites to the global scale is a powerful approach, complementary to process-based models, for deriving global flux estimates. Nevertheless, significant uncertainties arise from data availability and from specific methodological choices, such as the data sources, the machine learning models, and the selected features. Accurately quantifying the uncertainties from these diverse sources is essential for providing error estimates of the simulated fluxes. Such uncertainty estimates not only improve our general understanding of carbon cycle processes but also directly inform atmospheric inversions, which can use the upscaled net ecosystem exchange (NEE) as a prior. However, most existing data-driven global carbon flux products either provide flux estimates only or provide incomplete uncertainty assessments limited to a few sources.

In this study, we introduce a comprehensive framework for quantifying the uncertainties associated with carbon flux upscaling across their potential sources. The framework involves three key steps: (1) pre-ensemble generation, (2) screening, and (3) uncertainty attribution. First, we construct ensemble members by training machine learning models with varying configurations, which include climate forcing datasets, feature combinations, subsets of EC sites, machine learning algorithms, and their hyperparameters. The experiments are supported by the recently developed data-driven modeling framework FLUXCOM-X, which enables a wide range of experiments with diverse methodological choices. We crafted a feature set of about 300 features to capture both current and historical state information. To capture the uncertainty due to site representativeness, we sample subsets of the global EC sites based on geolocation and feature space. In addition, we will investigate different machine learning models and variations of their hyperparameters to generate the ensemble. Second, ensemble members that contribute little to the ensemble variance will be eliminated, while the most representative ones are retained. We employ a feature selection algorithm, HybridGA, to screen important feature subsets from the near-infinite number of possible combinations. Moreover, we screen the other ensemble members by assessing the distribution and spread of members. Finally, we will attribute the uncertainties to various categories from the perspectives of machine learning and process-based modeling, and discuss potential strategies to reduce them. The framework is initially used to evaluate spatiotemporal NEE uncertainty patterns in Europe and will subsequently be extended to the global scale. Additionally, the estimated biogenic carbon flux uncertainty will be assessed against independent products.
This work not only advances our understanding of the sources and patterns of upscaled flux uncertainties but also enhances the robustness of posterior estimates in atmospheric inversion models.
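
To make the three steps concrete, the following is a minimal sketch on synthetic data, not the FLUXCOM-X implementation. Everything in it is an assumption made for illustration: the factor names and levels, the additive synthetic "NEE" fields, the screening rule (rank members by mean squared deviation from the ensemble mean), and the attribution rule (an ANOVA-style main-effect variance share per factor, computed on the full balanced ensemble). The abstract does not specify how screening or attribution are computed, and HybridGA is not reproduced here.

```python
import itertools
import numpy as np

# Hypothetical factor levels; placeholders only, not FLUXCOM-X options.
FACTORS = {
    "forcing": ["forcing_a", "forcing_b"],
    "algorithm": ["algo_a", "algo_b", "algo_c"],
    "site_subset": ["subset_1", "subset_2"],
}


def build_ensemble(n_cells=50, seed=0):
    """Step 1 (sketch): one synthetic flux field per configuration."""
    rng = np.random.default_rng(seed)
    base = rng.normal(0.0, 1.0, n_cells)
    # Assumed magnitude of each factor's additive effect (synthetic).
    scale = {"forcing": 0.2, "algorithm": 1.0, "site_subset": 0.5}
    offsets = {
        name: {lvl: scale[name] * rng.normal() for lvl in levels}
        for name, levels in FACTORS.items()
    }
    configs = [
        dict(zip(FACTORS, combo))
        for combo in itertools.product(*FACTORS.values())
    ]
    preds = np.stack([
        base
        + sum(offsets[f][cfg[f]] for f in FACTORS)
        + rng.normal(0.0, 0.05, n_cells)
        for cfg in configs
    ])
    return configs, preds


def screen_members(preds, keep):
    """Step 2 (sketch): indices of the `keep` members that deviate most
    from the ensemble mean, i.e. contribute most to ensemble variance."""
    deviation = ((preds - preds.mean(axis=0)) ** 2).mean(axis=1)
    return np.sort(np.argsort(deviation)[::-1][:keep])


def attribute_variance(configs, preds):
    """Step 3 (sketch): share of the variance of member-mean fluxes
    explained by each factor's main effect (balanced design assumed)."""
    member_mean = preds.mean(axis=1)
    total = member_mean.var()
    shares = {}
    for factor, levels in FACTORS.items():
        level_means = [
            member_mean[
                [i for i, c in enumerate(configs) if c[factor] == lvl]
            ].mean()
            for lvl in levels
        ]
        shares[factor] = float(np.var(level_means) / total)
    return shares


configs, preds = build_ensemble()
kept = screen_members(preds, keep=6)
shares = attribute_variance(configs, preds)
print("retained members:", kept.tolist())
print("main-effect variance shares:", shares)
```

In this sketch the shares are computed on the full factorial ensemble rather than on the screened subset, because the main-effect decomposition relies on a balanced design; the remainder of the variance not covered by the three shares corresponds to interactions and noise.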

How to cite: Yang, Q., Walther, S., Nelson, J., Duveiller, G., Hamdi, Z., and Jung, M.: An uncertainty quantification framework for data-driven carbon flux upscaling, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20406, https://doi.org/10.5194/egusphere-egu25-20406, 2025.