An operational way of SAR feature creation to facilitate machine learning analyses
- Department of Earth Observation Science, Faculty of Geo-information Science and Earth Observation (ITC), University of Twente, Enschede 7522 NH, The Netherlands; x.zhang-7@utwente.nl.
Satellite missions have delivered a wealth of SAR images for Earth monitoring applications since the 1990s. Because SAR images are complex and accessible labeled SAR data are scarce, these images remain underutilized as a source of reference information for machine learning. To address this gap, we designed a SAR feature creation workflow within an operational framework and released it to the public as Jupyter tools. The workflow is built on Doris-5 and consists of two streams. The first stream uses SAR images to generate basic SAR features as well as interferometric and polarimetric features. The second stream draws on other available geospatial datasets, such as optical images and cadastral and geological maps, to generate additional features that can serve as reference data for the SAR data. These datasets are first radar-coded to align with the extracted SAR features and then geo-coded to geographic coordinates. All SAR features are stored as separate layers in the NetCDF data format, accompanied by STAC (SpatioTemporal Asset Catalog) metadata for data querying.
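As an illustration of the kind of features the first stream produces, the following minimal sketch computes amplitude, intensity combinations, interferometric phase and coherence from complex SAR samples. It uses common textbook definitions and only the Python standard library; the function names `pixel_features` and `interferometric_features` are hypothetical, and the sketch is not taken from the released Jupyter tools or from Doris-5, whose exact definitions may differ.

```python
import cmath
import math


def pixel_features(vv, vh):
    """Per-pixel dual-pol features from one complex VV and one complex VH sample.

    Assumed textbook definitions; the workflow itself may define them differently.
    """
    i_vv = abs(vv) ** 2  # VV intensity
    i_vh = abs(vh) ** 2  # VH intensity
    return {
        "vv_amplitude": abs(vv),
        "vh_amplitude": abs(vh),
        "intensity_sum": i_vv + i_vh,
        "intensity_difference": i_vv - i_vh,
        # Guard against division by zero for empty (no-signal) pixels.
        "intensity_ratio": i_vv / i_vh if i_vh > 0 else float("nan"),
    }


def interferometric_features(primary, secondary):
    """Interferometric phase and coherence over a window of co-registered samples.

    `primary` and `secondary` are equal-length sequences of complex values
    taken from the same estimation window in two acquisitions.
    """
    # Complex cross-correlation summed over the window.
    cross = sum(p * s.conjugate() for p, s in zip(primary, secondary))
    power = math.sqrt(
        sum(abs(p) ** 2 for p in primary) * sum(abs(s) ** 2 for s in secondary)
    )
    coherence = abs(cross) / power if power > 0 else 0.0
    return cmath.phase(cross), coherence


if __name__ == "__main__":
    print(pixel_features(3 + 4j, 1j))
    window = [1 + 1j, 2 - 1j, -1 + 0.5j]
    # A secondary image that differs only by a constant phase shift
    # is fully coherent with the primary image.
    shifted = [p * cmath.exp(-0.5j) for p in window]
    print(interferometric_features(window, shifted))
```

In practice these operations would be applied to whole image arrays rather than to individual samples, and each resulting feature would become one layer of the output dataset.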
For the demonstration, an area in the province of Groningen, the Netherlands, was selected as the test site. Seven ascending Sentinel-1A images in VV and VH polarizations on track 15, acquired between January and March 2022, were used, with the TOP10NL topographic base map as a reference. The extracted features encompass VV amplitude, VH amplitude, VV interferometric phase, VV coherence, intensity summation, intensity difference, intensity ratio, cross-pol correlation coefficient, cross-pol cross product, entropy, buildings, roads, water and railways. The first ten features were created via the first stream, and the last four via the second stream. Applying a random forest classifier to these fourteen SAR features produced four types of classified SAR images: building, road, water and railway. The overall accuracies for these four classes were 0.8558, 0.9939, 0.9065, and 0.8191, with corresponding F1-scores of 0.9191, 0.9669, 0.9490, and 0.9006, respectively.
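For readers less familiar with the two reported metrics, the sketch below shows how overall accuracy and F1-score are computed from binary labels. It assumes each class is evaluated as a separate binary problem (1 = class present, 0 = absent), which the abstract does not state explicitly; the function name `binary_scores` and the toy labels are hypothetical and unrelated to the Groningen data.

```python
def binary_scores(y_true, y_pred):
    """Return (overall accuracy, F1-score) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return accuracy, f1


if __name__ == "__main__":
    # Toy labels for ten pixels, purely illustrative.
    reference = [1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
    predicted = [1, 1, 1, 0, 0, 1, 1, 1, 0, 1]
    accuracy, f1 = binary_scores(reference, predicted)
    print(f"accuracy={accuracy:.4f}, F1={f1:.4f}")
```

Because the F1-score ignores true negatives while accuracy counts them, the two metrics can rank the same classification differently, which is why reporting both is informative.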
We conclude that the created SAR features effectively facilitate machine learning, and that even a simple random forest classification can yield relatively high-accuracy results. In addition, our workflow is well suited to preparing labeled features for machine learning analyses and is accessible even to users with limited knowledge of SAR.
How to cite: Zhang, X., Chang, L., and Stein, A.: An operational way of SAR feature creation to facilitate machine learning analyses, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2017, https://doi.org/10.5194/egusphere-egu24-2017, 2024.