EGU25-8312, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-8312
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Friday, 02 May, 15:00–15:10 (CEST), Room C
TransFuse: Advancing Frequent Flood Monitoring using Vision Transformers and Earth Observation
Antara Dasgupta1, Paul Christian Hosch1, Rakesh Sahu2, and Björn Waske3
  • 1RWTH Aachen University, IWW Institute of Hydraulic Engineering, Faculty of Civil Engineering, Aachen, Germany (antara.dasgupta@rwth-aachen.de)
  • 2Computer Science and Engineering Department, Galgotias University, Noida, India
  • 3Institute of Informatics, Universität Osnabrück, Osnabrück, Germany

The increasing availability of Earth Observation (EO) satellites equipped with active microwave sensors suitable for flood mapping has improved flood monitoring capabilities. However, current observation frequencies still fall short of adequately characterizing inundation dynamics, particularly at critical moments such as the flood peak or the time of maximum inundation extent. This limitation represents a significant research challenge in flood remote sensing. Advances in multimodal satellite hydrology datasets, coupled with the deep learning (DL) revolution, offer new opportunities to close the frequency gap in flood observations. TransFuse is a scalable data fusion framework that combines DL with EO data to achieve daily, high-resolution flood inundation mapping. This proof-of-concept study highlights the potential of Vision Transformers (ViTs) to predict flood inundation at the spatial resolution of Sentinel-1 (S1) imagery. The approach integrates time series from coarse but temporally frequent datasets, such as soil moisture and precipitation from NASA's SMAP and GPM missions, with static predictors such as topography and land use. A ViT model was trained on flood maps derived from S1 imagery with a random forest classifier, enabling the prediction of high-resolution flood inundation; a classical U-Net convolutional neural network (CNN) served as a benchmark for model performance. The methodology was evaluated on two case studies: the December 2019 flood at the confluence of the Adour and Luy rivers in southwest France, and the Christmas 2023 floods on Germany's Hase River. Predicted high-resolution flood maps were validated against independent flood masks derived from S1 images held out of the training dataset. Results demonstrate that both the ViT and the CNN U-Net effectively generalize the hydrological and hydraulic relationships that drive flood inundation, even in areas with complex topography.
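The data fusion step described above — combining coarse, temporally frequent dynamic inputs with static high-resolution predictors into one model input — can be illustrated with a minimal sketch. Grid sizes, the seven-day window, and nearest-neighbour resampling are illustrative assumptions, not details from the study; the arrays are random stand-ins for SMAP, GPM, and the static layers.

```python
import numpy as np

# Hypothetical grid sizes: coarse SMAP/GPM cells upsampled to the S1 target grid.
COARSE, SCALE, T = 4, 8, 7           # 4x4 coarse grid, 8x upsampling, 7-day window
FINE = COARSE * SCALE                # 32x32 "high-resolution" grid

rng = np.random.default_rng(0)
soil_moisture = rng.random((T, COARSE, COARSE))   # stand-in for SMAP retrievals
precipitation = rng.random((T, COARSE, COARSE))   # stand-in for GPM accumulations
elevation = rng.random((FINE, FINE))              # static topography layer
land_use = rng.integers(0, 5, (FINE, FINE))       # static land-use classes

def upsample(coarse_stack):
    """Nearest-neighbour resampling of a (T, H, W) stack onto the fine grid."""
    return coarse_stack.repeat(SCALE, axis=1).repeat(SCALE, axis=2)

# Channel-stack dynamic time series with static predictors: (2*T + 2, FINE, FINE).
# This multichannel image is what a ViT or U-Net would consume per training tile.
features = np.concatenate([
    upsample(soil_moisture),
    upsample(precipitation),
    elevation[None],
    land_use[None].astype(float),
], axis=0)

print(features.shape)  # (16, 32, 32)
```

In practice the resampling would be done on georeferenced rasters (e.g. with reprojection rather than integer repetition), but the channel-stacking structure of the fused input is the same.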
Notably, the ViT model outperformed the CNN, achieving approximately 20% higher accuracy in both case studies. Further testing in diverse catchments with varying land use, hydrology, and elevation profiles is recommended to assess model sensitivity under differing conditions. The proposed methodology could revolutionize flood monitoring by enabling daily observation of spatial inundation dynamics. This capability could support the development of improved parametric hazard re/insurance products, helping to close the flood protection gap faced by vulnerable populations worldwide.
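The validation against held-out S1 flood masks amounts to comparing two binary rasters pixel by pixel. A minimal sketch of such a comparison is shown below; the specific metrics (overall accuracy and flood-class IoU) and the toy masks are illustrative assumptions, not the exact scores reported in the study.

```python
import numpy as np

def flood_map_scores(pred, ref):
    """Pixelwise accuracy and flood-class IoU between two binary flood masks."""
    pred, ref = np.asarray(pred, bool), np.asarray(ref, bool)
    accuracy = (pred == ref).mean()
    intersection = (pred & ref).sum()
    union = (pred | ref).sum()
    iou = intersection / union if union else 1.0
    return accuracy, iou

# Toy 4x4 masks standing in for a predicted map and an independent S1-derived mask.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 1, 0]])
ref = np.array([[1, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 1, 1]])

acc, iou = flood_map_scores(pred, ref)
print(round(acc, 3), round(iou, 3))  # 0.875 0.667
```

IoU is often preferred over overall accuracy for flood mapping because flooded pixels are typically a small minority of the scene, so accuracy alone can look deceptively high.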

How to cite: Dasgupta, A., Hosch, P. C., Sahu, R., and Waske, B.: TransFuse: Advancing Frequent Flood Monitoring using Vision Transformers and Earth Observation, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-8312, https://doi.org/10.5194/egusphere-egu25-8312, 2025.