EGU25-17277, updated on 15 Mar 2025
https://doi.org/10.5194/egusphere-egu25-17277
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Hierarchical Classification of Forest Types and Tree Species Using Multi-Resolution Hyperspectral Data and Pseudo-Labeling for Enhanced Model Training
Shravan Ambudkar1, Jeremy Kravitz2, and Subash Yeggina3
  • 1Pixxel Space, Research and Analytics, India (shravan.ambudkar@pixxel.co.in)
  • 2Pixxel Space Technologies, El Segundo, CA, USA (jeremy@pixxel.space)
  • 3Pixxel Space India Pvt. Ltd. Bengaluru, Karnataka, India (subash@pixxel.co.in)

Accurate classification of forest types and tree species is an important part of forest monitoring, but it requires vast amounts of spatio-temporal data. Remotely sensed data provides a viable way to acquire the necessary global-scale information. Historically, multispectral data has been used for forest monitoring, but its limited number of spectrally broad bands often does not differentiate sufficiently between similar tree species, which lowers classification accuracy. Hyperspectral data offers finer spectral resolution, which makes it possible to differentiate similar tree species. Nevertheless, the quality of the ground-truth data used for classification remains a challenge, as it is often limited and noisy.

This study presents a hierarchical, three-stage classification approach that uses hyperspectral data, cascaded machine learning models, and spectral unmixing algorithms to classify forest types and individual tree species. The approach integrates a coarse-resolution dataset for broad-level classification with finer-resolution hyperspectral imagery that captures fine-scale spectral and structural variability. Furthermore, to address the possibility of low-quality ground-truth labels, we propose a semi-supervised training framework that leverages pseudo-labeling.

The cascaded three-stage architecture processes the data sequentially, with each stage consisting of an XGBoost model trained to address a specific challenge. The first stage is a coarse classifier that assigns forest pixels to three broad categories: Evergreen, Deciduous, and Mixed. This model is trained on coarse-resolution (60 m GSD) EMIT data with supervised labels generated from the National Land Cover Database. The second stage refines these three classes into 28 forest type group labels as defined by the USDA Forest Service's Forest Inventory and Analysis (FIA). The third and final stage classifies each forest pixel by its dominant tree species, using the outputs of the previous stage together with high-resolution (4 m GSD) AVIRIS-NG hyperspectral data as additional input features. Non-dominant tree species are identified using Vertex Component Analysis (VCA) based spectral unmixing, and the extracted endmembers are matched to pure tree species spectra using spectral similarity metrics. The abundances of the dominant and non-dominant spectra are then mapped using the Fully Constrained Least Squares (FCLS) approach.
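
The cascading idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: a toy nearest-centroid classifier stands in for each XGBoost stage so that the example runs with the Python standard library alone, a single feature matrix is used for all stages (whereas the study trains the early stages on EMIT data and adds AVIRIS-NG features in the third), and the names `NearestCentroid`, `fit_cascade`, `predict_cascade`, `group_a`, and `species_x` are hypothetical placeholders.

```python
from collections import defaultdict
from math import dist


class NearestCentroid:
    """Toy stand-in for one XGBoost stage: predicts the class with the closest mean."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        # Per-class mean of every feature column.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda c: dist(row, self.centroids[c]))
            for row in X
        ]


def fit_cascade(X, labels_per_stage):
    """Train one model per stage; each later stage also sees earlier predictions."""
    models = []
    features = [list(row) for row in X]
    for y in labels_per_stage:
        model = NearestCentroid().fit(features, y)
        preds = model.predict(features)  # in-sample predictions, for simplicity
        # Encode the predicted class as a number and append it as a new feature.
        codes = {label: float(i) for i, label in enumerate(sorted(set(y)))}
        features = [row + [codes[p]] for row, p in zip(features, preds)]
        models.append((model, codes))
    return models


def predict_cascade(models, X):
    """Run the stages in order and return one list of predictions per stage."""
    features = [list(row) for row in X]
    outputs = []
    for model, codes in models:
        preds = model.predict(features)
        outputs.append(preds)
        features = [row + [codes[p]] for row, p in zip(features, preds)]
    return outputs


# Hypothetical two-band "spectra" with labels at three levels of detail.
X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
stage_labels = [
    ["Evergreen", "Evergreen", "Deciduous", "Deciduous"],  # stage 1: broad type
    ["group_a", "group_a", "group_b", "group_b"],          # stage 2: type group
    ["species_x", "species_x", "species_y", "species_y"],  # stage 3: species
]
models = fit_cascade(X, stage_labels)
outputs = predict_cascade(models, [[0.05, 0.1], [1.0, 0.9]])
print(outputs)
```

Appending each stage's prediction as a feature is only one way to pass information down a cascade; routing pixels to class-specific sub-models is another, and the abstract does not say which variant was used.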

The method was tested over two regions: Shasta-Trinity National Forest, California, USA, and Grand Mesa National Forest, Colorado, USA. The resulting tree distribution maps cover 10 individual tree species and were validated against the USDA's TreeMap product. For these test regions, the overall accuracy of the entire three-stage model was 80%. The individual accuracies of the stage 1, stage 2, and stage 3 classifications were 94%, 92%, and 92%, respectively.
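
The overall figure is close to what the three stage accuracies would give if errors simply compounded through the cascade. The short check below assumes that each stage's accuracy is conditional on the previous stage being correct; the abstract does not state how the overall accuracy was computed, so this is one plausible reading rather than the authors' calculation.

```python
from math import prod

# Stage accuracies as reported for stages 1, 2, and 3.
stage_accuracies = [0.94, 0.92, 0.92]

# Assumption: a pixel is correct overall only if every stage is correct,
# so the accuracies multiply.
compounded = prod(stage_accuracies)
print(f"{compounded:.1%}")  # prints 79.6%
```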

Despite these promising results, the approach is constrained by the availability of high-quality ground-truth data for supervised training. To address this, we explored a pseudo-labeling technique that generates additional training data by iteratively assigning labels to unlabeled samples for which the model's confidence is high. Preliminary results indicate that including pseudo-labeled data in training can enhance the classification accuracy of the proposed hierarchical cascaded approach for forest applications.
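
The iterative pseudo-labeling loop described above might look roughly like the following sketch. It rests on several assumptions: a nearest-centroid model with a softmax over negative distances stands in for the XGBoost classifiers and their predicted probabilities, and the confidence threshold of 0.7, the round limit, and the function names are all hypothetical choices made for illustration.

```python
from math import dist, exp


def fit_centroids(X, y):
    """Per-class feature means; a toy stand-in for a trained classifier."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def predict_with_confidence(model, x):
    """Return (label, confidence) from a softmax over negative distances."""
    scores = {label: exp(-dist(x, c)) for label, c in model.items()}
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())


def pseudo_label(X_lab, y_lab, X_unlab, threshold=0.7, max_rounds=5):
    """Iteratively move confidently predicted samples into the training set."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        model = fit_centroids(X_train, y_train)
        remaining = []
        added = 0
        for x in pool:
            label, confidence = predict_with_confidence(model, x)
            if confidence >= threshold:
                # Confident prediction: accept it as a pseudo-label.
                X_train.append(x)
                y_train.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:  # nothing new was labeled, so further rounds change nothing
            break
    return X_train, y_train, pool


# Hypothetical one-band samples: two labeled, three unlabeled.
X_train, y_train, pool = pseudo_label(
    X_lab=[[0.0], [4.0]],
    y_lab=["a", "b"],
    X_unlab=[[0.5], [3.5], [2.0]],
)
print(y_train, pool)
```

In this toy run the two samples near a labeled class are absorbed in the first round, while the ambiguous sample midway between the classes stays unlabeled. A threshold set too low would admit such ambiguous samples and could reinforce label noise.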

How to cite: Ambudkar, S., Kravitz, J., and Yeggina, S.: Hierarchical Classification of Forest Types and Tree Species Using Multi-Resolution Hyperspectral Data and Pseudo-Labeling for Enhanced Model Training, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-17277, https://doi.org/10.5194/egusphere-egu25-17277, 2025.