EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Machine learning-based emulation of land cover effects at sub-hectometric scale using crowd-sourced weather observations

Andrei Covaci (1), Thomas Vergauwen (2), Sara Top (3), Steven Caluwaerts (3,2), and Lesley De Cruz (1,4)
  • (1) Electronics and Informatics Department, Vrije Universiteit Brussel, Brussels, Belgium
  • (2) Meteorological and Climatological research, Royal Meteorological Institute of Belgium, Brussels, Belgium
  • (3) Department of Physics and Astronomy, Ghent University, Ghent, Belgium
  • (4) Observations, Royal Meteorological Institute of Belgium, Brussels, Belgium

Traditional weather stations measure conditions above short grass, a standardized environment that is far from representative of where most people live. Moreover, despite advances in urban climate modelling, even state-of-the-art weather forecasts and climate scenarios do not account for the hyperlocal influence of land cover on meteorological variables.

To bridge this gap, we have constructed several machine learning models that translate 2-meter temperature measurements from the standardized environment to different rural and urban environments. The input features of these models are the fractions of impervious, green and water land cover around a target station, together with the interpolated open-field 2-meter temperature and wind values at the target location. The target variable is the temperature measured by the Flemish crowd-sourced VLINDER network, which consists of calibrated stations positioned in unconventional locations. The models were trained on data from a limited set of VLINDER stations and evaluated on unseen data, both from stations used in training and from stations not used in training. We found that a random forest model yields the best results and offers the highest interpretability of how the features influence its predictions. The results of the simple artificial neural networks are not robust, making these models less reliable.
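The setup described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the feature names, the relationship used to generate the target, and the hyperparameters are all invented for illustration, and scikit-learn is an assumed choice of library. The sketch covers only evaluation on stations held out from training, using a group-wise split so that no station appears on both sides.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 20 hypothetical stations,
# 200 samples each. No real VLINDER data is used here.
n_stations, n_samples = 20, 200
station_id = np.repeat(np.arange(n_stations), n_samples)

# One set of land cover fractions (impervious, green, water) per station,
# summing to 1, repeated for every sample of that station.
fractions = rng.dirichlet([1.0, 1.0, 1.0], size=n_stations)[station_id]

# Interpolated open-field 2 m temperature (deg C) and wind speed (m/s).
t_open = rng.normal(12.0, 6.0, size=station_id.size)
wind = rng.gamma(2.0, 1.5, size=station_id.size)

# Invented target (assumption): open-field temperature plus a warm offset
# over impervious cover that weakens with wind, a small cooling from water,
# and measurement noise.
t_station = (
    t_open
    + 2.0 * fractions[:, 0] / (1.0 + wind)
    - 0.5 * fractions[:, 2]
    + rng.normal(0.0, 0.2, size=station_id.size)
)

FEATURES = ["impervious", "green", "water", "t_open", "wind"]  # hypothetical names
X = np.column_stack([fractions, t_open, wind])
y = t_station

# Hold out whole stations so the test set contains only unseen stations.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=station_id))

model = RandomForestRegressor(
    n_estimators=100, min_samples_leaf=5, random_state=0
)
model.fit(X[train_idx], y[train_idx])

# Error on held-out stations and impurity-based feature importances.
mae = mean_absolute_error(y[test_idx], model.predict(X[test_idx]))
importances = dict(zip(FEATURES, model.feature_importances_))

print(f"MAE on held-out stations: {mae:.2f} deg C")
for name, value in importances.items():
    print(f"{name:>10s}: {value:.3f}")
```

Splitting by station rather than by sample matters because samples from one station share the same land cover fractions; a random per-sample split would let the model memorize stations and overstate its skill at new locations. Evaluating on unseen periods of stations used in training, as the abstract also describes, would need an additional time-based split.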

We explore the addition of more features related to the urban environment, such as building height, sky view factor and variables related to radiation. Finally, we investigate whether including other data sources can prevent possible overfitting caused by insufficient variation in land cover in the training data.

How to cite: Covaci, A., Vergauwen, T., Top, S., Caluwaerts, S., and De Cruz, L.: Machine learning-based emulation of land cover effects at sub-hectometric scale using crowd-sourced weather observations, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-17349, 2023.
