EGU21-15917
https://doi.org/10.5194/egusphere-egu21-15917
EGU General Assembly 2021
© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.

Monitoring Temporal Developments from Remote Sensing Data using AI Fine-Grained Segmentation

Samir Zamarialai1,4, Thijs Perenboom2, Amanda Kruijver2, Zenglin Shi1, and Bernard Foing4,3
  • 1University of Amsterdam, Graduate School of Informatics, Artificial Intelligence, Netherlands
  • 252impact B.V., Rotterdam, The Netherlands
  • 3European Space Agency - ESTEC, The Netherlands
  • 4VU Amsterdam, The Netherlands

Remote sensing (RS) imagery, generated by e.g. cameras on satellites, airplanes, and drones, has been used for a variety of applications such as environmental monitoring, crater detection, and monitoring temporal changes on planetary surfaces.

In recent years, researchers have started applying Computer Vision methods to RS data. This has led to steady progress in remote sensing classification, with good results on classification and segmentation tasks on RS data. However, current approaches still have problems. Firstly, the main focus is on high-resolution RS imagery: apart from the fact that these data are not accessible to everyone, the models fail to generalize to lower-resolution data. Secondly, the models fail to generalize to more fine-grained classes. For example, models tend to perform very well at detecting buildings in general, but they fail to distinguish whether a building belongs to a fine-grained subclass such as residential or commercial buildings. Fine-grained classes often appear very similar to each other, so models have difficulty distinguishing between them. This problem occurs in both high-resolution and low-resolution RS imagery, but the drop in accuracy is much more significant on lower-resolution data.

For these reasons, we propose a Multi-Task Convolutional Neural Network (CNN) with three objective functions for segmentation of RS imagery. This model should generalize across different resolutions and achieve higher accuracy than state-of-the-art approaches, especially on fine-grained classes.

The model consists of two main components. The first is a CNN that transforms the input image into a segmentation map. This module is optimized with a pixel-wise cross-entropy loss between the model's segmentation map and the ground-truth annotations. If the input image is of lower resolution, this segmentation map will miss part of the fine structure of the input. The second component is another CNN that reconstructs a high-resolution image from the low-resolution input in order to recover fine-grained structural information. This module essentially guides the model to learn more fine-grained feature representations; the reconstructed image contains much more detail, such as sharper edges and better color. The second CNN module is optimized with a mean-squared-error loss between the original high-resolution image and the reconstruction. Finally, a third objective function measures the similarity between the segmentation of the input image and the segmentation of the super-resolved image. The final objective is the sum of the three objectives above. After training, the second module is detached, meaning high-resolution imagery is only needed during the training phase.
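To make the three-objective training signal concrete, the following is a minimal NumPy sketch of how the combined loss could be assembled. All function names, the weighting parameters, and the use of a mean-squared-error term for the third (similarity) objective are illustrative assumptions; the abstract does not specify the exact form of the similarity measure or the loss weights.

```python
import numpy as np

def pixelwise_cross_entropy(probs, labels):
    """First objective: pixel-wise cross-entropy between predicted class
    probabilities (H, W, C) and ground-truth class ids (H, W)."""
    h, w, _ = probs.shape
    # Pick the predicted probability of the true class at every pixel.
    picked = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def mse(a, b):
    """Mean-squared error, used for the reconstruction objective."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def total_loss(seg_probs, labels, sr_img, hr_img, seg_of_input, seg_of_sr,
               w_seg=1.0, w_sr=1.0, w_sim=1.0):
    """Hypothetical sum of the three objectives described in the abstract:
    segmentation loss + super-resolution loss + similarity loss.
    The weights w_* are assumed hyperparameters, not given in the abstract."""
    l_seg = pixelwise_cross_entropy(seg_probs, labels)   # objective 1
    l_sr  = mse(sr_img, hr_img)                          # objective 2
    l_sim = mse(seg_of_input, seg_of_sr)                 # objective 3 (assumed MSE)
    return w_seg * l_seg + w_sr * l_sr + w_sim * l_sim
```

At inference time only the first term would remain relevant, since the super-resolution branch (and hence the second and third terms) is detached after training.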

At the moment we are implementing the model. Afterwards, we will benchmark it against current state-of-the-art approaches. The status will be presented at EGU 2021.

How to cite: Zamarialai, S., Perenboom, T., Kruijver, A., Shi, Z., and Foing, B.: Monitoring Temporal Developments from Remote Sensing Data using AI Fine-Grained Segmentation, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-15917, https://doi.org/10.5194/egusphere-egu21-15917, 2021.
