EGU22-7686
https://doi.org/10.5194/egusphere-egu22-7686
EGU General Assembly 2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Comparison of deep learning methods for colorizing historical aerial imagery

Shimon Tanaka1, Hitoshi Miyamoto2, Ryusei Ishii3, and Patrice Carbonneau4
  • 1Shibaura Institute of Technology, Tokyo, Japan (ah18061@shibaura-it.ac.jp)
  • 2Shibaura Institute of Technology, Tokyo, Japan (miyamo@shibaura-it.ac.jp)
  • 3Shibaura Institute of Technology, Tokyo, Japan (mh21006@shibaura-it.ac.jp)
  • 4Durham University, Durham, United Kingdom (patrice.carbonneau@durham.ac.uk)

Historical aerial imagery dating back to the mid-twentieth century offers high potential for distinguishing anthropogenic impacts from natural causes of environmental change and for reanalysing long-term surface evolution at local to regional scales. However, the older portion of this imagery was often acquired in panchromatic grayscale, which makes image classification very challenging. This research compares two deep learning image colorization methods, Neural Style Transfer (NST) and the Cycle Generative Adversarial Network (CycleGAN), for colorizing archival images of Japanese river basins for land cover analysis. We examined historical monochrome images of 4096 × 4096 pixels from three river basins: the Kurobe, Tenryu, and Chikugo Rivers. For the NST method, we used a transfer learning model whose hyperparameters had already been fine-tuned for colorizing archival river imagery (Ishii et al., 2021). For the CycleGAN method, we trained the network on 8000 image tiles of 256 × 256 pixels to obtain the optimal hyperparameters for river basin colorization. The training tiles covered 10 land-use types, including paddy fields, agricultural land, forests, wastelands, cities and villages, transportation land, rivers, lakes, and coastal areas. The best CycleGAN model achieved a colorization root mean square error (RMSE) of 18.3 in 8-bit RGB resolution, with a dropout ratio of 0.4, a cycle consistency loss weight of 10, and an identity mapping loss weight of 0.5. Comparing the two deep learning methods yielded three findings. (i) CycleGAN requires much less training effort than NST because it uses an unsupervised learning algorithm: CycleGAN was trained on 8000 unlabelled images, whereas NST used 60,000 labelled images in transfer learning.
(ii) The colorization quality of the two methods was comparable at the evaluation stage: RMSEs for CycleGAN were 15.4 for Kurobe, 13.7 for Tenryu, and 18.7 for Chikugo, while RMSEs for NST were 9.9 for Kurobe, 15.8 for Tenryu, and 14.2 for Chikugo. (iii) CycleGAN performed much better than NST when colorizing dull surfaces without textural features, such as the river course of the Tenryu River. In future work, imagery colorized by both NST and CycleGAN will be used for land cover classification with AI technology to investigate its role in image recognition.

Reference: Ishii, R. et al. (2021): Colorization of archival aerial imagery using deep learning, EGU General Assembly 2021, EGU21-11925, https://doi.org/10.5194/egusphere-egu21-11925.
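The preparation of the 8000 CycleGAN training tiles from the 4096 × 4096 pixel scans can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline; the function name and the use of NumPy arrays are assumptions.

```python
import numpy as np

def tile_image(image: np.ndarray, tile_size: int = 256) -> list:
    """Split a square image (e.g. a 4096 x 4096 scan) into
    non-overlapping square tiles of tile_size x tile_size pixels."""
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h - tile_size + 1, tile_size):
        for x in range(0, w - tile_size + 1, tile_size):
            tiles.append(image[y:y + tile_size, x:x + tile_size])
    return tiles

# One 4096 x 4096 scan yields (4096 / 256)**2 = 256 tiles.
scan = np.zeros((4096, 4096), dtype=np.uint8)
tiles = tile_image(scan)
print(len(tiles))  # 256
```

About 31 such scans would therefore suffice to produce the 8000 tiles used in training.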
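The RMSE values quoted above are expressed in 8-bit digital-number units (0–255). A plausible way to compute such a score between a colorized image and its RGB ground truth is sketched below; the function name and the pooling of all three channels into one error value are assumptions, not the authors' stated procedure.

```python
import numpy as np

def rmse_rgb(pred: np.ndarray, truth: np.ndarray) -> float:
    """Root mean square error between two 8-bit RGB images,
    pooled over all pixels and channels, in 0-255 units."""
    diff = pred.astype(np.float64) - truth.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy example: a uniform offset of 18 digital numbers gives RMSE = 18.0,
# close to the 18.3 training-stage error reported above.
pred = np.full((256, 256, 3), 10, dtype=np.uint8)
truth = np.full((256, 256, 3), 28, dtype=np.uint8)
print(rmse_rgb(pred, truth))  # 18.0
```

Casting to float64 before differencing avoids uint8 wrap-around, which would silently corrupt the error for pixels where the prediction is darker than the truth.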

How to cite: Tanaka, S., Miyamoto, H., Ishii, R., and Carbonneau, P.: Comparison of deep learning methods for colorizing historical aerial imagery, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-7686, https://doi.org/10.5194/egusphere-egu22-7686, 2022.
