- Dept. of Environmental Engineering, Korea National University of Transportation, Chungju-si, Korea, Republic of
Due to global climate change, Korea is experiencing intensified heavy rainfall, which leads to rapid runoff and carries large amounts of non-point source pollution into water bodies, posing a significant threat to aquatic ecosystems and water supply systems. Stagnant water bodies are especially susceptible to eutrophication from these pollutants, and warm water temperatures further promote algal blooms. Accurate spatio-temporal monitoring of water quality parameters (WQPs) over water bodies has therefore become essential. In Korea, algae-related WQPs such as chlorophyll-a (Chl-a) and microcystin are mainly monitored at fixed observation stations. Korea operates 76 automated monitoring stations, but such point-based observations are limited in capturing the spatial distribution of WQPs. Furthermore, manual stations collect data only once a week, resulting in a lack of spatio-temporal continuity.
In this study, we developed and validated a Chl-a estimation model based on Random Forest Regression over Daecheong Lake using surface reflectance data from the Geostationary Ocean Color Imager-II (GOCI-II) onboard the Geo-Kompsat-2B (GK-2B) satellite. GOCI-II provides surface reflectance eight times per day at a spatial resolution of 250 m. The point observation data consisted of hourly Chl-a concentrations obtained from the Korean Water Environment Information System. The study period spanned three years (January 2021 to December 2023). For model development, the dataset was randomly split into training and testing sets at a 7:3 ratio. The model's input variables included the spectral bands of GK-2B GOCI-II, the normalized difference chlorophyll index, the normalized fluorescence height index, and the fluorescence line height. The dependent variable was the log-transformed Chl-a concentration. Furthermore, the study assessed the model's efficiency by sequentially removing input variables according to their feature importance rankings.
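As an illustration of the data preparation described above, the following is a minimal sketch, not the authors' implementation. It assumes the widely used forms of the normalized difference chlorophyll index (red-edge versus red reflectance) and the fluorescence line height (peak reflectance above a linear baseline). The band choices (660, 680, 709, and 745 nm), the base-10 logarithm, the function names, and all reflectance and Chl-a values are hypothetical. The normalized fluorescence height index is omitted because the abstract does not define it.

```python
import math
import random


def ndci(r_red, r_red_edge):
    """Normalized difference chlorophyll index from red and red-edge reflectance."""
    return (r_red_edge - r_red) / (r_red_edge + r_red)


def flh(r_left, r_peak, r_right, wl_left, wl_peak, wl_right):
    """Fluorescence line height: peak reflectance minus a linear baseline
    interpolated between the two neighbouring bands."""
    baseline = r_left + (r_right - r_left) * (wl_peak - wl_left) / (wl_right - wl_left)
    return r_peak - baseline


def build_sample(r660, r680, r709, r745, chl_a):
    """Return (features, target) for one satellite/station matchup."""
    features = [
        r660, r680, r709, r745,
        ndci(r660, r709),                            # assumed band pair
        flh(r660, r680, r745, 660.0, 680.0, 745.0),  # assumed band triplet
    ]
    target = math.log10(chl_a)  # assumed base 10; the abstract does not state the base
    return features, target


def split_7_3(samples, seed=0):
    """Randomly split samples into 70 % training and 30 % testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical matchups: (R660, R680, R709, R745, Chl-a); all values are made up.
matchups = [
    (0.020 + 0.001 * i, 0.022 + 0.001 * i, 0.025 + 0.002 * i, 0.015, 5.0 + 3.0 * i)
    for i in range(10)
]
samples = [build_sample(*m) for m in matchups]
train_set, test_set = split_7_3(samples)
```

The resulting feature rows and log-transformed targets could then be passed to any Random Forest regressor. Fixing the seed keeps the random 7:3 split reproducible.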
The statistically optimal combination of input variables included all seven variables. The model's bias, Root Mean Square Error, and Correlation Coefficient were -0.0041 ppb, 0.1649 ppb, and 0.91, respectively. Despite these favorable statistical results, uncertainties were observed during periods of extremely low or high Chl-a concentrations. Finally, the spatial distribution of Chl-a estimated using the developed model showed a clear spatial pattern with seasonal variations. However, uncertainties were evident at the boundaries between water bodies and land surfaces. These uncertainties likely arose from the limited spatial resolution of 250 m, which was insufficient for capturing the narrow sections of the lake.
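The three evaluation statistics reported above can be computed as in the following minimal sketch. It assumes bias is defined as the mean of predicted minus observed values and that the correlation coefficient is Pearson's r. The abstract does not state either convention, and the function names are illustrative.

```python
import math


def bias(pred, obs):
    """Mean error (predicted minus observed); the sign convention is an assumption."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def pearson_r(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)
```

A bias near zero combined with a non-negligible RMSE, as reported here, is consistent with errors that largely cancel on average, such as overestimation at low concentrations and underestimation at high ones.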
Future studies should address these limitations by spatially downscaling the surface reflectance to reduce boundary-related uncertainties, and by reducing the underestimation and overestimation of extreme Chl-a values through separate model training based on seasonal characteristics or the temporal behavior of hydrometeorological variables.
Acknowledgement: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00416443).
How to cite: Park, K. and Park, J.: Development and validation of chlorophyll-a estimation model using GOCI-II land surface reflectance and machine learning at Daecheong Lake in South Korea, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-14300, https://doi.org/10.5194/egusphere-egu25-14300, 2025.