Linking Forest Inventories and Multi-Modal Deep Learning for Tree Species Classification

Dimitri Gominski; Daniel Ortiz Gonzalo; Wanting Yang; Martin Brandt; Rasmus Fensholt

doi:https://doi.org/10.5194/wbf2026-722

WBF2026-722, updated on 10 Mar 2026

World Biodiversity Forum 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Poster | Monday, 15 Jun, 16:30–18:00 (CEST) | Display time: Monday, 15 Jun, 08:30 to Tuesday, 16 Jun, 18:00

Dimitri Gominski¹, Daniel Ortiz Gonzalo², Wanting Yang¹, Martin Brandt¹, and Rasmus Fensholt¹

¹Department of Geosciences and Natural Resource Management, University of Copenhagen
²Universidad Politécnica de Madrid

Accurate, large-scale classification of tree species, both in forests and for trees outside forests, is essential for monitoring vegetation composition and ecosystem health. Such information strengthens our capacity to detect early-stage biodiversity loss, prioritize conservation interventions, and manage ecosystems more sustainably. Despite the increasing availability of high-resolution Earth observation data, operational species-level mapping at continental scales remains limited by heterogeneous sensor characteristics, uneven species distributions, and the difficulty of linking in situ information to multi-sensor imagery.

We developed a multi-modal deep learning framework to classify tree species across the Iberian Peninsula. We collected National Forest Inventory (NFI) plot data from mainland Spain and paired them with Sentinel-1 and Sentinel-2 time series, aerial imagery, aerial lidar, and species presence likelihood derived from bioclimatic variables. Our dataset comprises approximately 40,000 NFI plots with plot-level species counts, while the imagery covers 406 km² and spans ground sampling distances from 20 cm to 10 m. The temporal dimension is captured through a 14-day composite time series, enabling the model to leverage phenological and structural variation throughout the year, while aerial lidar provides fine-grained canopy structure. Our dataset provides fertile ground for exploring multi-modal, multi-scale interactions in high-resolution species modeling.
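
To make the 14-day compositing step concrete, here is a minimal sketch of how irregular satellite observations could be reduced to a regular time series. It is an illustration under stated assumptions, not the authors' pipeline. The abstract does not say which statistic is used per window, so the median is assumed here. The function name `composite_14_day`, the input layout of (day of year, value) pairs, and the sample values are all hypothetical.

```python
from statistics import median


def composite_14_day(observations, window_days=14, days_in_year=365):
    """Reduce (day_of_year, value) pairs to one value per fixed window.

    Assumption: the per-window statistic is the median. Windows without
    any valid observation are returned as None.
    """
    n_windows = -(-days_in_year // window_days)  # ceiling division
    bins = [[] for _ in range(n_windows)]
    for day, value in observations:
        if value is None:  # e.g. an acquisition masked out by clouds
            continue
        bins[(day - 1) // window_days].append(value)
    return [median(b) if b else None for b in bins]


# Hypothetical vegetation-index samples for a single plot.
samples = [(1, 0.2), (5, 0.4), (14, 0.6), (15, 0.8), (40, None)]
series = composite_14_day(samples)
print(len(series), series[:3])  # 27 [0.4, 0.8, None]
```

A fixed-length series of this kind gives every plot the same temporal grid regardless of how many acquisitions fall in each window, which is what lets a model compare phenology across plots.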

Building on recent advances in foundation models, we implemented a deep neural network (AnySat) fusing these diverse modalities and obtained a scalable, operational high-resolution classifier. The resulting classifier achieved an overall F1 score of 0.70 for the 44 most common species at plot level, demonstrating strong performance across diverse biomes and imaging conditions. Rare, non-dominant species remain a challenge due to the long-tailed distribution of species occurrences. Based on a systematic analysis of modality relevance, we outline strategies for balancing performance with inspiration from long-tailed recognition and semi-supervised learning. Altogether, our dataset and modeling framework advance high-resolution species mapping with remote sensing and illustrate the substantial gains that can be achieved by moving beyond single-modality approaches.
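
The abstract reports an "overall" F1 score without stating the averaging scheme. The sketch below shows why that choice matters under a long-tailed class distribution. It is an illustration only: it assumes a simplified single-label setting (one species per sample), whereas the dataset described above carries plot-level species counts. The function name `f1_scores` and the species labels are hypothetical.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (per-class F1, macro-averaged F1, micro-averaged F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    per_class = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class[label] = 2 * tp[label] / denom if denom else 0.0
    # Macro: every class counts equally, so rare classes weigh heavily.
    macro = sum(per_class.values()) / len(per_class)
    # Micro: every sample counts equally, so common classes dominate.
    total_tp = sum(tp.values())
    micro_denom = 2 * total_tp + sum(fp.values()) + sum(fn.values())
    micro = 2 * total_tp / micro_denom if micro_denom else 0.0
    return per_class, macro, micro


# Hypothetical imbalanced example: 8 samples of a common class, 2 of a rare one.
y_true = ["pine"] * 8 + ["oak"] * 2
y_pred = ["pine"] * 8 + ["pine", "oak"]
per_class, macro, micro = f1_scores(y_true, y_pred)
print(round(macro, 3), round(micro, 3))  # 0.804 0.9
```

In this toy case a single error on the rare class lowers the macro average far more than the micro average, which is the effect that long-tailed recognition methods try to address.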

How to cite: Gominski, D., Ortiz Gonzalo, D., Yang, W., Brandt, M., and Fensholt, R.: Linking Forest Inventories and Multi-Modal Deep Learning for Tree Species Classification, World Biodiversity Forum 2026, Davos, Switzerland, 14–19 Jun 2026, WBF2026-722, https://doi.org/10.5194/wbf2026-722, 2026.