Explainable machine learning for detailed wetland vegetation classification using remote sensing data

Tomasz Berezowski

doi:https://doi.org/10.5194/egusphere-egu26-5240

EGU26-5240, updated on 13 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Tomasz Berezowski

Gdansk University of Technology, Faculty of Electronics, Telecommunications and Informatics, Gdansk, Poland (tomberez@eti.pg.edu.pl)

Vegetation mapping is a key step in wetland monitoring, management, and conservation. Remote sensing image classification is well suited to vegetation mapping because of its high temporal and spatial resolution. Despite these advantages, remote sensing classification of wetland vegetation is usually limited to a small number of target classes and lacks an explanation of the importance of the input features. To address these limitations, this study presents a detailed wetland vegetation classification followed by an explainability study.

The study was conducted in the Biebrza wetlands in NE Poland, covering approximately 220 km². These wetlands surround the Biebrza River, which floods yearly and produces a characteristic vegetation zonation. The training and validation data for the vegetation classification came from a vegetation survey conducted in 2015 and kindly provided by the Biebrza National Park.

The input features for classification were obtained by fusing VIS-IR data from Sentinel-2, thermal data from Landsat-8, and Synthetic Aperture Radar (SAR) data from Sentinel-1. The Sentinel-2 data consisted of four images (one image per season), each with eleven bands. The Landsat-8 data also comprised four images, with one thermal band per image. The Sentinel-1 data included 24 dual-polarization (VV+VH) images (one image per month, from ascending and descending orbits). All image data were acquired within the 2014-2017 period and resampled to a 10-meter spatial resolution.
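The size of the fused feature stack follows from the counts above. The short Python sketch below tallies it, assuming that each dual-polarization Sentinel-1 image contributes two features (VV and VH); the abstract does not state the total, and the feature names and the "summer" label are hypothetical placeholders.

```python
# Tally the per-pixel input features implied by the image counts in the text.
# Assumption: every dual-polarization Sentinel-1 image yields two features
# (VV and VH). All feature names below are illustrative, not from the study.
SEASONS = ["winter", "spring", "summer", "fall"]

# Sentinel-2: four seasonal images, eleven bands each.
sentinel2 = [f"S2_{season}_band{b:02d}" for season in SEASONS for b in range(1, 12)]

# Landsat-8: four seasonal images, one thermal band each.
landsat8 = [f"L8_{season}_thermal" for season in SEASONS]

# Sentinel-1: 24 images, two polarizations each (assumed).
sentinel1 = [f"S1_img{i:02d}_{pol}" for i in range(1, 25) for pol in ("VV", "VH")]

features = sentinel2 + landsat8 + sentinel1
print(len(sentinel2), len(landsat8), len(sentinel1), len(features))
# prints: 44 4 48 96
```

Under this assumption the classifier would see 96 features per pixel, 48 of which come from SAR.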

The "ranger" Random Forest implementation in R was used as the classifier. The classifier was trained on a stratified random sample of 50% of the vegetation data points and validated on the remaining 50%. The built-in permutation feature importance algorithm was used to identify the most important bands for the classification.
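The two ideas in this paragraph, a stratified 50/50 split and permutation feature importance, can be illustrated with a minimal Python sketch. It is not the study's R code and does not use "ranger": the classifier is a hypothetical one-feature threshold rule, the class labels and data are invented, and importance is measured here as the accuracy drop on the held-out half, which may differ from how the built-in ranger procedure evaluates it.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.5, seed=0):
    """Return (train_idx, test_idx), sampling train_fraction within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train, test = [], []
    for label in sorted(by_class):
        idx = by_class[label][:]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)

def accuracy(predict, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, seed=0):
    """Accuracy drop after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    drops = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        drops.append(baseline - accuracy(predict, X_perm, y))
    return drops

# Toy data (invented): feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5], [0.2, 3], [0.3, 9], [0.4, 1], [0.9, 4], [0.8, 2], [0.7, 8], [0.6, 6]]
y = ["reed"] * 4 + ["sedge"] * 4

def toy_predict(row):
    """Hypothetical stand-in for a trained classifier; uses feature 0 only."""
    return "sedge" if row[0] > 0.5 else "reed"

train_idx, test_idx = stratified_split(y, train_fraction=0.5, seed=1)
X_test = [X[i] for i in test_idx]
y_test = [y[i] for i in test_idx]
drops = permutation_importance(toy_predict, X_test, y_test, seed=1)
print(drops)  # feature 1 is ignored by the rule, so its drop is exactly 0.0
```

Because the split is drawn within each class, both halves keep the class proportions of the full data set, which matters when some vegetation classes are rare.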

The classification-based vegetation map closely reflected the characteristic vegetation zonation of the Biebrza wetlands. The overall accuracy was 0.994 and the Kappa index was 0.993. The most important band for the classification was the Landsat-8 thermal image from the winter season, whereas the thermal bands from the remaining seasons were relatively unimportant. The next most important bands were the Sentinel-2 VIS-IR bands from the spring and fall seasons, particularly the red, red-edge, and SWIR bands. The SAR data from Sentinel-1 were the least important of all data used; the most important Sentinel-1 band, VH from September in descending orbit, ranked only 19th.
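For readers unfamiliar with the two reported metrics, the Python sketch below computes overall accuracy and Cohen's Kappa from a confusion matrix using their standard definitions. The three-class matrix is invented for illustration and is unrelated to the study's results.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's Kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that were assigned to class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Invented example with three classes and 150 validation samples.
confusion = [
    [48, 2, 0],
    [1, 47, 2],
    [0, 3, 47],
]
oa, kappa = overall_accuracy_and_kappa(confusion)
print(round(oa, 3), round(kappa, 3))
# prints: 0.947 0.92
```

Kappa discounts the agreement that would occur by chance given the class totals, so it is never higher than the overall accuracy when chance agreement is positive.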

How to cite: Berezowski, T.: Explainable machine learning for detailed wetland vegetation classification using remote sensing data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5240, https://doi.org/10.5194/egusphere-egu26-5240, 2026.