EGU24-14375, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-14375
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Hybrid Deep Learning Approach for Simultaneous Feature Engineering and Explanation of Water Quality Sensor Data

Jihoon Shin, YoungWoo Kim, Taeseung Park, and YoonKyung Cha
  • Department of Environmental Engineering, University of Seoul

Water quality monitoring plays a crucial role in establishing effective management strategies that ensure the safety and sustainability of water resources. With recent advances in sensor technologies, autonomous water quality monitoring has been increasingly used to capture detailed temporal variations in water quality in river networks. However, irregular time series are prevalent in multi-sensor monitoring systems, and the resulting missing values limit the usefulness of the data as a basis for decisions. Deriving useful insights from sensor data therefore requires a modeling tool that can efficiently handle irregular time series. Combining feature engineering with attention mechanisms has been shown to improve both performance and explainability when dealing with irregular time series data. In this study, a hybrid deep learning model that incorporates reverse time attention and trainable decay mechanisms (RETAIN-D) was used to analyze sensor data collected from multiple sites in the upper section of the Geum River, South Korea. RETAIN-D was developed to predict variations in the level of chlorophyll-a (Chl-a) concentration and to analyze spatiotemporal associations among its influencing factors at different monitoring sites. RETAIN-D showed a high degree of accuracy on the test sets (accuracy = 0.81–0.90, AUC = 0.67–0.90, F1 score = 0.87–0.88) across various Chl-a standards. The trainable decay mechanism in RETAIN-D allowed Chl-a levels to be predicted during periods with missing data without manual feature engineering. Chl-a concentrations from the nearest adjacent tributary had high importance in predicting Chl-a levels at the target site. Across time steps, the contribution of input features was generally higher for the more recent time steps. These results demonstrate the usefulness of the hybrid deep learning approach as an efficient Big Data analysis tool for water quality and resource management.
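The abstract does not give the exact form of the trainable decay mechanism. As a rough illustration only, the sketch below assumes the exponential decay form commonly used in GRU-D-style models, where a missing value is replaced by a blend of the last observation and the feature mean, weighted by a factor that shrinks as the gap since the last observation grows. The function names (`time_gaps`, `decay_impute`), the parameters `w` and `b` (which would be learned during training but are fixed here), and the sample values are all hypothetical and are not taken from the study.

```python
import math


def time_gaps(observed, timestamps):
    """Time elapsed since the feature was last observed, per time step."""
    delta = [0.0] * len(observed)
    for t in range(1, len(observed)):
        step = timestamps[t] - timestamps[t - 1]
        # The gap restarts after an observed step and accumulates otherwise.
        delta[t] = step if observed[t - 1] else delta[t - 1] + step
    return delta


def decay_impute(values, observed, delta, w, b, mean):
    """Fill missing values with a decayed blend of last observation and mean.

    Assumed form (not confirmed by the abstract):
        gamma = exp(-max(0, w * delta + b))
        x_hat = gamma * x_last + (1 - gamma) * mean
    """
    filled = []
    last = mean  # before any observation, fall back to the mean
    for value, is_observed, gap in zip(values, observed, delta):
        if is_observed:
            filled.append(value)
            last = value
        else:
            gamma = math.exp(-max(0.0, w * gap + b))
            filled.append(gamma * last + (1.0 - gamma) * mean)
    return filled


# Hypothetical Chl-a readings with two missing steps in the middle.
values = [10.0, None, None, 16.0]
observed = [v is not None for v in values]
timestamps = [0.0, 1.0, 2.0, 3.0]

delta = time_gaps(observed, timestamps)
filled = decay_impute(values, observed, delta, w=0.5, b=0.0, mean=12.0)
```

Under these assumptions, the filled values drift from the last reading toward the mean as the gap lengthens, which is how a decay term can stand in for manual imputation rules.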

How to cite: Shin, J., Kim, Y., Park, T., and Cha, Y.: Hybrid Deep Learning Approach for Simultaneous Feature Engineering and Explanation of Water Quality Sensor Data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14375, https://doi.org/10.5194/egusphere-egu24-14375, 2024.