EGU23-7624, updated on 09 Jan 2024
https://doi.org/10.5194/egusphere-egu23-7624
EGU General Assembly 2023
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

AI-vergreen: a multi-label Sentinel-2 training dataset of summer green (Larix) and evergreen needle leaf forest types in boreal forest biomes for remote sensing applications

Léa Enguehard1, Birgit Heim1, Stefan Kruse1, Begüm Demir2, Robert Jackisch3, Josias Gloy1, Sarah Haupt1, Laura Schild1, Femke Van Geffen1, Veronika Döpper1, Ronny Hänsch4, Nicola Falco5, and Ulrike Herzschuh1,6
  • 1Polar Terrestrial Environmental Systems, Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, Potsdam, Germany (lea.enguehard@awi.de)
  • 2Remote Sensing Image Analysis (RSiM) Group, Technische Universität Berlin, Berlin, Germany
  • 3Geoinformation in Environmental Planning Lab, Technische Universität Berlin, Berlin, Germany
  • 4Microwaves and Radar Institute, German Aerospace Center (DLR), Weßling, Germany
  • 5Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  • 6Institute of Environmental Science and Geography, University of Potsdam, Potsdam, Germany

Boreal forests, which represent roughly one-third of the world’s total forested area, provide critical ecosystem services including carbon storage, climate feedbacks, permafrost stability, biodiversity, and economic benefits. Located at high northern latitudes, they are dominated by evergreen needle-leaf tree taxa (Pinus, Picea, Abies) in North America, Northern Europe, and Western Siberia, and by deciduous needle-leaf tree taxa (Larix) in Eastern Siberia. Remote sensing applications in high latitudes are possible but remain challenging for optical satellite sensors due to frequent cloud cover, forest fires, and low illumination. Additionally, few multi-label datasets are available for remote sensing applications that focus on the structure of boreal forests, and specifically on deciduous Larix trees. Labeled datasets of summer green and evergreen forest types for specific satellite sensors would enable remote sensing and deep learning applications such as classification, and ultimately improve our understanding of evergreen and summer green tree dynamics. An example of such a dataset is the TreeSatAI multi-sensor Artificial Intelligence Benchmark Archive (doi.org/10.5281/zenodo.6780578), which provides labels on species and forest composition in Europe. Another is the SiDroForest data collection, consisting of a synthetic Unmanned Aerial Vehicle (UAV) Siberian Larch Dataset (doi.org/10.1594/PANGAEA.932795) and Sentinel-2 image patches (doi.org/10.1594/PANGAEA.933268) of 54 forest plots in Eastern Siberia.

Here we are building an extensive multi-label training dataset based on optical Sentinel-2 image patches (60 x 60 m patches of the 10 m and 20 m S2 bands), including metadata on summer green and evergreen tree species and forest structure from vegetation plots. Over 250 vegetation plots have been surveyed since 2011 during nine field expeditions of the Alfred Wegener Institute in Eastern Siberia (doi.org/10.5194/essd-14-5695-2022) and Western Canada, where vegetation was sampled and described and UAV images were taken (UAV solely in 2021 and 2022). In addition to the in-situ plots, we gathered all cloud-free Sentinel-2 data from late spring to early fall (May to October) that geographically coincide with the vegetation plots. The dataset therefore contains different phenophases of evergreen and summer green forests and provides detailed label information on forest structure, such as tree species and density. The multi-labeling will include both broader and more detailed forest-type classes. Examples of higher-level labels are “Sparse larch forest” and “Dense evergreen forest”. The poster will demonstrate how we defined forest labels from in-situ data, UAV and Sentinel-2 imagery, and their corresponding spectral signatures.
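
As a rough illustration of the patch layout and multi-labeling described above, the following Python sketch computes the pixel size of a 60 x 60 m patch at 10 m and 20 m resolution, filters acquisition dates to the May to October window, and encodes a patch's labels as a multi-hot vector. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names and the two broader labels ("Larch forest", "Evergreen forest") are hypothetical, and only "Sparse larch forest" and "Dense evergreen forest" are taken from the abstract.

```python
from datetime import date

PATCH_SIZE_M = 60  # patch edge length in metres, as stated in the abstract

# Label vocabulary. The two broader labels are hypothetical placeholders;
# only the two detailed labels are quoted from the abstract.
LABELS = [
    "Evergreen forest",        # hypothetical broader class
    "Larch forest",            # hypothetical broader class
    "Sparse larch forest",     # from the abstract
    "Dense evergreen forest",  # from the abstract
]


def patch_pixels(resolution_m: int) -> int:
    """Return the number of pixels per patch side at a given band resolution."""
    return PATCH_SIZE_M // resolution_m


def in_season(acquired: date) -> bool:
    """Return True if the acquisition falls in the May to October window."""
    return 5 <= acquired.month <= 10


def multi_hot(patch_labels: list[str]) -> list[int]:
    """Encode a patch's labels as a 0/1 vector aligned with LABELS."""
    unknown = set(patch_labels) - set(LABELS)
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return [1 if name in patch_labels else 0 for name in LABELS]


if __name__ == "__main__":
    print(patch_pixels(10), patch_pixels(20))
    print(in_season(date(2021, 7, 15)))
    print(multi_hot(["Larch forest", "Sparse larch forest"]))
```

Under these assumptions a patch covers 6 x 6 pixels in the 10 m bands and 3 x 3 pixels in the 20 m bands, and a single patch can carry a broader and a more detailed label at the same time.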

We anticipate that our dataset will be a starting point for a significantly more extensive one, with the addition of radar satellite sensors such as Sentinel-1 and TanDEM-X, further ground vegetation plots (a new expedition is expected in Alaska and Canada in summer 2023), and data found in the literature and in repositories, e.g. the NASA Arctic Boreal Vulnerability Experiment. Our dataset will be publicly available and can be used as a training dataset for deep learning algorithms to identify and characterize evergreen and summer green needle-leaf trees in boreal forest regions.

How to cite: Enguehard, L., Heim, B., Kruse, S., Demir, B., Jackisch, R., Gloy, J., Haupt, S., Schild, L., Van Geffen, F., Döpper, V., Hänsch, R., Falco, N., and Herzschuh, U.: AI-vergreen: a multi-label Sentinel-2 training dataset of summer green (Larix) and evergreen needle leaf forest types in boreal forest biomes for remote sensing applications, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7624, https://doi.org/10.5194/egusphere-egu23-7624, 2023.
