- University of Kansas, Lawrence, United States of America (kipnielsen@ku.edu)
Traditional numerical weather models struggle to predict the planetary boundary layer (PBL) in urban areas and during the morning transition. Accurate prediction of this part of the atmosphere is crucial because of its downstream impacts, such as air quality forecasts and the representation of convection-based processes. The rapid growth of information technology has increased the capability of machine learning, but several limitations and sensitivities arise when using it to predict the PBL. In this work, aircraft observations from the Aircraft Meteorological Data Relay Program are compiled into half-hourly temperature profiles of the PBL. Dallas-Fort Worth, TX, USA is chosen because it is far from topographical and coastal influences and has two large airports near the center of the city. Profiles are compiled into daily bins, with five half-hourly profiles prior to and including sunrise as the inputs and eight half-hourly profiles after sunrise as the outputs. This provides the opportunity to test the performance of machine learning models under a variety of stability classifications and PBL heights. To determine the sensitivity to the model configuration, the following are varied: five machine learning model types, learning rates from 0.01 to 0.000001, the number of training epochs, the order of the layers, the number of neurons in each layer, and eight optimizers. Initially, mean square error (MSE) is used as the loss function to find the optimal model configuration. However, standard summary statistics may not penalize errors more heavily in the physically more important parts of the PBL, such as near the surface and at the inversion at the top of the PBL.
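The daily binning described above can be illustrated with a minimal sketch. It assumes each day's profiles are stored in a dictionary keyed by a half-hour slot index (0 = 00:00, 1 = 00:30, and so on) and that days with any missing slot are skipped; the function name `build_daily_sample`, the slot indexing, and the handling of missing data are hypothetical choices for illustration, not details taken from the study.

```python
from typing import Dict, List, Optional, Tuple

# A profile is a list of temperatures (deg C) on a fixed set of height levels.
Profile = List[float]

N_INPUT = 5   # half-hourly profiles prior to and including sunrise
N_OUTPUT = 8  # half-hourly profiles after sunrise


def build_daily_sample(
    profiles: Dict[int, Profile],
    sunrise_slot: int,
) -> Optional[Tuple[List[Profile], List[Profile]]]:
    """Split one day's half-hourly profiles into (inputs, outputs).

    `profiles` maps a half-hour slot index to a temperature profile.
    Returns None when any of the 13 required slots is missing
    (an assumed policy; the abstract does not say how gaps are handled).
    """
    input_slots = list(range(sunrise_slot - N_INPUT + 1, sunrise_slot + 1))
    output_slots = list(range(sunrise_slot + 1, sunrise_slot + 1 + N_OUTPUT))

    if any(slot not in profiles for slot in input_slots + output_slots):
        return None

    inputs = [profiles[slot] for slot in input_slots]
    outputs = [profiles[slot] for slot in output_slots]
    return inputs, outputs
```

With a sunrise at slot 13 (06:30), for example, the inputs would cover slots 9 through 13 and the outputs slots 14 through 21, so each daily sample pairs 2.5 hours of pre-sunrise history with a 4-hour forecast window.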
To test the sensitivity to the loss function, models are trained with MSE, mean absolute error, Huber loss, and the logarithm of the hyperbolic cosine as loss functions, in addition to several custom weighted profiles that place higher weights on different parts of the PBL; MSE and the correlation coefficient are used to gauge their performance. The optimal model configuration found using MSE as the loss function is a long short-term memory network layer with 2,000 nodes followed by two dense layers with 1,000 nodes, a learning rate of 0.0001, 100 epochs, and an AdamW optimizer, which had an overall MSE of 0.882°C. The MSE was larger for predictions further after sunrise, and the model generally underestimated (overestimated) the onset of mixing and near-surface temperature in the summer (winter). Mean absolute error was the most accurate loss function, with an overall MSE of 0.538°C and a correlation coefficient of 0.958. The work shown here highlights the importance of methodically testing various machine learning configurations to quantify the sensitivity of the model, which can influence the confidence in the conclusions. It also shows the potential of using a simple machine learning model to produce rapidly updated short-term weather forecasts that can be used in conjunction with traditional numerical weather models.
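To make the loss-function comparison concrete, the sketch below gives plain-Python versions of the four standard losses named above, plus one possible form of a height-weighted MSE. The weighted variant is only an assumption about what a "custom weighted profile" could look like: the actual weights, their normalization, and the Huber threshold `delta` are not given in the abstract and are hypothetical here.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean square error over the levels of one profile."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def mae(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean absolute error over the levels of one profile."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def huber(pred: Sequence[float], obs: Sequence[float], delta: float = 1.0) -> float:
    """Huber loss: quadratic for |residual| <= delta, linear beyond it."""
    total = 0.0
    for p, o in zip(pred, obs):
        r = abs(p - o)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(obs)


def log_cosh(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean of the logarithm of the hyperbolic cosine of the residuals."""
    return sum(math.log(math.cosh(p - o)) for p, o in zip(pred, obs)) / len(obs)


def weighted_mse(
    pred: Sequence[float], obs: Sequence[float], weights: Sequence[float]
) -> float:
    """MSE with per-level weights, normalized by the sum of the weights.

    Larger weights on selected levels (for example near the surface or near
    the inversion) make errors there count more. Hypothetical formulation.
    """
    weighted = sum(w * (p - o) ** 2 for w, p, o in zip(weights, pred, obs))
    return weighted / sum(weights)
```

In this formulation, a 3°C error at a single level raises the weighted loss when that level carries a large weight and contributes less when the weight sits elsewhere, which is the behavior that an unweighted summary statistic cannot provide.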
How to cite: Nielsen, K. and Rahn, D.: The Sensitivity of Machine Learning Configuration for Predicting Temperature Profiles in the Planetary Boundary Layer, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16030, https://doi.org/10.5194/egusphere-egu26-16030, 2026.