EGU24-18741, updated on 11 Mar 2024
https://doi.org/10.5194/egusphere-egu24-18741
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Retrieving gapless 1-km land surface temperature based on numerical model and auto machine learning approach

Li Yumin, Gao Meiling, and Li Zhenhong
  • Chang'an University, College of Geological Engineering and Geomatics, China

Land surface temperature (LST) is crucial in many fields, but seamless LST data are difficult to obtain because of limitations in thermal infrared sensor technology. Numerical modeling, which is based on physical processes, can simulate spatially and temporally continuous data. Machine learning, a typical data-driven approach, has proven effective in reconstructing remotely sensed data. In this study, we designed a fusion framework that combines the strengths of numerical modeling and machine learning. The framework includes the following steps: 1) Optimization of the numerical model: We use the urbanized High-Resolution Land Data Assimilation System (u-HRLDAS) model. Various spatio-temporal data sources are used to refine and optimize the model's simulations. 2) Database creation for LST reconstruction: The database incorporates the forcing variables of the u-HRLDAS model (2-meter temperature, relative humidity, air pressure, wind speed, and downward longwave and shortwave radiation), along with the model's simulated LST outputs. Additional remotely sensed and geographic data, such as the Digital Elevation Model (DEM), Normalized Difference Vegetation Index (NDVI), latitude, longitude, land use and land cover, and slope, are also included. The datasets span the summer months (June to August) from 2011 to 2014. Daily LST data from MOD11A1 and MYD11A1 are used as label data. 3) Optimal model identification via an automatic machine learning framework: The MODIS LST data in the database serve as training labels, with a 70/30 split for training and validation. Three evaluation metrics (RMSE, MAE, and R²) guide the selection. We chose the AutoGluon-Tabular framework, developed by Amazon, for its strong performance, which it achieves through bagging and stacking. Finally, the 1-km seamless LST is reconstructed using the model with the highest validation accuracy.
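The validation step described above (a 70/30 split scored with RMSE, MAE, and R²) can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names, the random shuffling, and the seed are assumptions made for illustration, and R² is computed here as the coefficient of determination, which may differ from the definition used in the study.

```python
import math
import random


def split_70_30(rows, seed=0):
    """Shuffle the samples and return (train, validation) in a 70/30 ratio.

    Random shuffling and the seed are assumptions; the abstract only
    states the 70/30 ratio.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of the inputs (K for LST)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the inputs (K for LST)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values in kelvin, invented purely to exercise the functions.
observed_lst = [300.0, 302.0, 304.0, 306.0]
predicted_lst = [301.0, 301.0, 305.0, 305.0]
print(rmse(observed_lst, predicted_lst), mae(observed_lst, predicted_lst), r2(observed_lst, predicted_lst))
```

In a workflow like the one described, the observed values would be the MODIS LST labels in the validation subset and the predicted values would come from each candidate model.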

Taking Xi’an, China, as the study region, we trained nine models (WeightedEnsemble_L2, LightGBMLarge, XGBoost, LightGBM, CatBoost, LightGBM, ExtraTree, NeuralNetTorch, and NeuralNetFastAI) within the AutoGluon-Tabular framework. These models yielded RMSE values ranging from 0.737 to 1.417 K, MAE values from 0.522 to 1.031 K, and R² values from 0.967 to 0.991. The WeightedEnsemble_L2 model performed best, with the lowest RMSE (0.737 K) and MAE (0.522 K) and the highest R² (0.991), closely followed by the LightGBMLarge model, with RMSE, MAE, and R² values of 0.739 K, 0.526 K, and 0.991, respectively. Furthermore, we conducted supplementary testing using four reserved MODIS LST images. Using the previously trained WeightedEnsemble_L2 model, we generated seamless LST predictions at four overpass timestamps: 02:30, 05:30, 14:30, and 17:30. The resulting spatial distributions are similar to the observed LST, indicating that the method captures the spatial characteristics of LST and improves spatial continuity relative to the original MODIS LST data.
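The selection of the best model from these validation scores can be sketched as follows. Only the two models whose individual scores are reported above are included. The `Score` type, the `select_best` helper, and the tie-breaking order (lowest RMSE, then lowest MAE, then highest R²) are illustrative assumptions, not the procedure AutoGluon or the authors necessarily applied.

```python
from typing import NamedTuple


class Score(NamedTuple):
    model: str
    rmse_k: float  # RMSE in kelvin
    mae_k: float   # MAE in kelvin
    r2: float


# Validation scores as reported in the abstract for the two leading models.
REPORTED = [
    Score("WeightedEnsemble_L2", 0.737, 0.522, 0.991),
    Score("LightGBMLarge", 0.739, 0.526, 0.991),
]


def select_best(scores):
    """Pick the lowest RMSE; break ties by lowest MAE, then highest R²."""
    return min(scores, key=lambda s: (s.rmse_k, s.mae_k, -s.r2))


best = select_best(REPORTED)
runner_up = max(REPORTED, key=lambda s: s.rmse_k)
rmse_margin_k = runner_up.rmse_k - best.rmse_k
print(best.model, round(rmse_margin_k, 3))
```

The margin between the two leading models is only about 0.002 K in RMSE, so the ranking between them is close.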

In conclusion, the proposed fusion framework, which integrates numerical modeling and automatic machine learning, reconstructed LST with high accuracy and strong spatial similarity to the observations. The method still has shortcomings: for example, the predicted images lose some spatial detail compared with the observations, which should be addressed in future work.

How to cite: Yumin, L., Meiling, G., and Zhenhong, L.: Retrieving gapless 1-km land surface temperature based on numerical model and auto machine learning approach, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-18741, https://doi.org/10.5194/egusphere-egu24-18741, 2024.