- ¹Chinese Academy of Sciences, Institute of Atmospheric Physics, Beijing, China (zhaowei18@mails.ucas.ac.cn)
- ²Chinese Academy of Sciences, Institute of Atmospheric Physics, Beijing, China (wangyinan@mail.iap.ac.cn)
- ³Institute of Urban Meteorology, China Meteorological Administration, Beijing, China (ybpan@ium.cn)
Accurate cloud base height (CBH) over the Tibetan Plateau, often called Earth's Third Pole, is essential for constraining Asian monsoon dynamics, glacial melt projections, and the water security of 1.9 billion people downstream. However, ERA5 reanalysis systematically underestimates CBH by up to 5.20 km in the southern plateau, and this bias propagates into climate models and hydrological forecasts. Here, we present a two-step machine learning framework that progressively reduces this hidden bias. Step 1 refines the ERA5 retrieval algorithm using more than three years of ground-based lidar observations (October 2021–December 2024), reducing the site-level mean bias error (MBE) from 1.8 km to 0.1 km and improving the regional correlation with CALIPSO from 0.25 to 0.40. Step 2 applies an Optuna-optimized XGBoost model trained on high-confidence CALIPSO observations (N=106,718), fusing the refined ERA5 data with vertical atmospheric profiles and surface attributes. The final product achieves a test-set RMSE of 1.87 km (R²=0.71, MBE=−0.02 km), with seasonal correlations of 0.72–0.86, and reduces the southern plateau bias from −5.20 km to −0.11 km, a 97.9% improvement. This scalable approach enables reliable, long-term CBH reconstruction, which is critical for advancing climate model parameterizations and water resource assessments across High Mountain Asia.
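For readers who want to check the reported scores, the following is a minimal sketch of the evaluation metrics named above (MBE, RMSE, R²) and of the relative bias reduction. The function names and the toy values are assumptions made for illustration; the abstract does not state which exact definitions the authors used (for example, R² may instead be a squared correlation), so this should not be read as their implementation.

```python
import math


def mean_bias_error(pred, obs):
    """Mean of (prediction - observation); negative values mean underestimation."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def r_squared(pred, obs):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def bias_reduction_percent(bias_before, bias_after):
    """Relative reduction in absolute bias, in percent."""
    return 100.0 * (abs(bias_before) - abs(bias_after)) / abs(bias_before)


# Hypothetical cloud base heights in km, for illustration only (not study data).
obs_km = [4.0, 5.0, 6.0, 7.0]
pred_km = [3.5, 5.5, 5.5, 7.5]

toy_mbe = mean_bias_error(pred_km, obs_km)
toy_rmse = rmse(pred_km, obs_km)
toy_r2 = r_squared(pred_km, obs_km)

# Southern plateau bias values quoted in the abstract.
improvement = bias_reduction_percent(-5.20, -0.11)
print(f"Bias reduction: {improvement:.1f}%")
```

With the southern plateau values quoted in the abstract, the absolute bias drops from 5.20 km to 0.11 km, which is (5.20 − 0.11) / 5.20 ≈ 97.9%, consistent with the stated improvement.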
How to cite: Zhao, W., Wang, Y., and Pan, Y.: Machine Learning Reveals Hidden Bias in ERA5 Cloud Heights Over Earth's Third Pole, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2474, https://doi.org/10.5194/egusphere-egu26-2474, 2026.