Deep learning to support ocean data quality control
- Alfred-Wegener-Institut, Geosciences, Germany (mohamed.chouai@awi.de)
In this study, which is part of the M-VRE [https://mosaic-vre.org/about] project, we aim to improve a quality control (QC) system for Arctic Ocean temperature profile data using deep learning. For the training, validation, and evaluation of our algorithms, we use the UDASH dataset [https://essd.copernicus.org/articles/10/1119/2018/]. In the classical QC setting, the ocean expert, or "operator", applies a series of classical thresholding algorithms to identify, i.e. flag, erroneous data. In the next step, the operator visually inspects every profile in which suspicious samples have been identified. The goal of this time-consuming visual QC is to find "false positives", i.e. flagged data that are actually good, because every sample or profile has not only a scientific value but also a monetary one. Finally, the operator resets all "false positive" data to good. The crucial point is that, although these samples or profiles exceed certain thresholds, the ocean expert still considers them good. Such expert decisions are extremely difficult, if not impossible, to reproduce with classical algorithms. Deep neural networks, however, have the potential to learn complex human behavior. We have therefore trained a deep learning system to "learn" exactly this expert behavior: finding the "false positives" among the data flagged by the classical thresholds, so that they can be reset to good. The first results are promising. In a fully automated setting, deep learning improves the results and fewer data are flagged. In a subsequent visual QC setting, deep learning distinctly reduces the expert's workload and offers the option to clearly increase the quality of the data.
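The two-stage workflow described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the threshold values, the function names `threshold_flags` and `unflag_false_positives`, and the `cutoff` parameter are all hypothetical, and the list `p_good` merely stands in for the output of a trained network (the probability that a flagged sample is actually good).

```python
from typing import List, Sequence

# Hypothetical limits for illustration only; not the thresholds used in the study.
T_MIN, T_MAX = -2.5, 15.0   # assumed plausible temperature range in deg C
MAX_GRADIENT = 0.5          # assumed maximum |dT/dz| in deg C per metre


def threshold_flags(depth: Sequence[float], temp: Sequence[float]) -> List[bool]:
    """Stage 1: flag every sample that fails a simple range or gradient check."""
    flags = [not (T_MIN <= t <= T_MAX) for t in temp]
    for i in range(1, len(temp)):
        dz = depth[i] - depth[i - 1]
        if dz > 0 and abs(temp[i] - temp[i - 1]) / dz > MAX_GRADIENT:
            flags[i] = True
    return flags


def unflag_false_positives(
    flags: Sequence[bool], p_good: Sequence[float], cutoff: float = 0.9
) -> List[bool]:
    """Stage 2: keep a flag only if the model is not confident the sample is good.

    p_good is a stand-in for the network output; the cutoff is an assumption.
    """
    return [f and p < cutoff for f, p in zip(flags, p_good)]


if __name__ == "__main__":
    depth = [0.0, 10.0, 20.0, 30.0, 40.0]
    temp = [-1.5, -1.4, 6.0, -1.0, 20.0]
    flags = threshold_flags(depth, temp)
    # Made-up scores in place of a trained network's predictions.
    p_good = [0.0, 0.0, 0.2, 0.95, 0.1]
    print(flags)                                  # [False, False, True, True, True]
    print(unflag_false_positives(flags, p_good))  # [False, False, True, False, True]
```

In this toy profile the fourth sample is flagged by the gradient check but reset to good because its score exceeds the cutoff. A higher cutoff makes the automated step more conservative and leaves more flagged samples for visual inspection.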
Our long-term goal is to develop an Arctic quality control system as a series of web services and Jupyter notebooks that apply automated and visual QC online in an efficient, consistent, reproducible, and interactive way.
How to cite: Chouai, M., Simon Reimers, F., and Mieruch-Schnülle, S.: Deep learning to support ocean data quality control, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-15185, https://doi.org/10.5194/egusphere-egu23-15185, 2023.