EGU24-18061, updated on 11 Mar 2024
https://doi.org/10.5194/egusphere-egu24-18061
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

A new methodology for time-series reconstruction of global scale historical Earth observation data

Davide Consoli, Leandro Parente, and Martijn Witjes
  • OpenGeoHub Foundation (davide.consoli@opengeohub.org)

Several machine learning algorithms and analytical techniques cannot handle gaps or no-data values in their input. Unfortunately, Earth observation (EO) datasets, such as satellite images, are severely affected by cloud contamination and sensor artifacts that create gaps in the time series of collected images. This limits the use of several powerful techniques for modeling and analysis. To overcome this limitation, several works in the literature propose imputation methods that reconstruct the gappy image time series, providing complete space-time datasets that can be used as input for these techniques.

However, among the time-series reconstruction methods available in the literature, only a few are publicly available as open-source code, applicable without any external source of data, and suitable for petabyte (PB) sized datasets such as the full Landsat archive. The few methods that meet all of these requirements are usually quite simple (e.g. linear interpolation) and, as a consequence, often perform poorly when reconstructing the images.

For this reason, we propose a new methodology for time-series reconstruction designed to meet all of these requirements. Like some other methods in the literature, the new method, named seasonally weighted average generalization (SWAG), works purely on the time dimension, reconstructing the images by processing the time series of each pixel separately. In particular, the method reconstructs each missing value as a weighted average of the samples available in the original time series. To enforce the annual seasonality of each band as a prior, higher weights are given to images collected at an integer number of years from the missing sample. To avoid propagating land cover changes to future or past images, higher weights are also given to more recent images. Finally, to respect causality, only images from the past of each sample in the time series are used.
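The weighting scheme described above can be illustrated with a minimal per-pixel sketch. It is not the authors' implementation: the function name, the two seasonal weight levels, and the exponential recency decay are all assumptions chosen for illustration, since the abstract does not give the actual weight values. The default of six samples per year corresponds to a bimonthly time series.

```python
def fill_gaps_causal(series, samples_per_year=6, season_weight=1.0,
                     off_season_weight=0.1, yearly_decay=0.5):
    """Fill None gaps in one pixel's time series with a causal weighted average.

    All weight parameters are hypothetical and only illustrate the idea.
    """
    filled = list(series)
    for t, value in enumerate(series):
        if value is not None:
            continue  # observed samples are kept untouched
        num = 0.0
        den = 0.0
        for lag in range(1, t + 1):  # causal: only past samples are visited
            past = series[t - lag]   # original series, not the filled one
            if past is None:
                continue
            # Seasonality prior: larger weight when the lag is a whole number of years.
            if lag % samples_per_year == 0:
                seasonal = season_weight
            else:
                seasonal = off_season_weight
            # Recency: the weight decays with the time distance, in years.
            recency = yearly_decay ** (lag / samples_per_year)
            w = seasonal * recency
            num += w * past
            den += w
        if den > 0.0:
            filled[t] = num / den  # gaps with no usable past sample stay None
    return filled
```

For example, `fill_gaps_causal([0.31, None, 0.35, None])` leaves the two observed values as they are and replaces each gap with an average of the earlier observations only.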

To achieve computational performance suitable for PB-sized datasets, the method has been implemented in C++ using a sequence of fast convolution methods and Hadamard (element-wise) products and divisions. The method has been applied to a bimonthly aggregated version of the global GLAD Landsat ARD-2 collection from 1997 to 2022, producing a 400 terabyte output dataset. The produced dataset will be used to generate maps of several biophysical parameters, such as fraction of absorbed photosynthetically active radiation (FAPAR), normalized difference water index (NDWI) and bare soil fraction (BSF). The code is available as open source, and the results are fully reproducible.
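One way to see why convolutions and Hadamard operations suffice is that a causal weighted average over valid samples equals the ratio of two convolutions: the weight kernel convolved with the masked values, divided element-wise by the same kernel convolved with the validity mask. The sketch below shows this under stated assumptions. It is written in Python rather than C++, uses a plain loop in place of a fast convolution, and the function names and kernel values are hypothetical; `kernel[lag - 1]` is taken to be the weight of the sample `lag` steps in the past.

```python
def causal_convolve(signal, kernel):
    """out[t] = sum over lag >= 1 of kernel[lag - 1] * signal[t - lag]."""
    out = [0.0] * len(signal)
    for t in range(len(signal)):
        max_lag = min(t, len(kernel))  # never reach before the start of the series
        for lag in range(1, max_lag + 1):
            out[t] += kernel[lag - 1] * signal[t - lag]
    return out


def fill_gaps_by_convolution(series, kernel):
    """Fill None gaps as (kernel * masked values) / (kernel * mask)."""
    mask = [0.0 if v is None else 1.0 for v in series]
    values = [0.0 if v is None else float(v) for v in series]
    masked = [v * m for v, m in zip(values, mask)]  # Hadamard product
    num = causal_convolve(masked, kernel)  # weighted sum of valid past samples
    den = causal_convolve(mask, kernel)    # sum of the weights actually used
    filled = []
    for v, n, d in zip(series, num, den):
        if v is not None:
            filled.append(v)      # keep observed samples
        elif d > 0.0:
            filled.append(n / d)  # Hadamard division gives the weighted average
        else:
            filled.append(None)   # no valid past sample within the kernel
    return filled
```

Because the numerator and denominator are each a single convolution over the whole series, the loop in `causal_convolve` could be swapped for any faster convolution routine without changing the result.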

References:

Potapov, Hansen, Kommareddy, Kommareddy, Turubanova, Pickens, ... & Ying (2020). Landsat analysis ready data for global land cover and land cover change mapping. Remote Sensing, 12(3), 426.

Julien & Sobrino (2019). Optimizing and comparing gap-filling techniques using simulated NDVI time series from remotely sensed global data. International Journal of Applied Earth Observation and Geoinformation, 76, 93-111.

Radeloff, Roy, Wulder, Anderson, Cook, Crawford, ... & Zhu (2024). Need and vision for global medium-resolution Landsat and Sentinel-2 data products. Remote Sensing of Environment, 300, 113918.

How to cite: Consoli, D., Parente, L., and Witjes, M.: A new methodology for time-series reconstruction of global scale historical Earth observation data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18061, https://doi.org/10.5194/egusphere-egu24-18061, 2024.