EGU2020-22587
https://doi.org/10.5194/egusphere-egu2020-22587
EGU General Assembly 2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.

Deep neural networks for total organic carbon prediction and data-driven sampling

Everardo González Ávalos and Ewa Burwicz
  • GEOMAR Helmholtz Centre for Ocean Research, Kiel, Germany (egonzalez@geomar.de)

Over the past decade deep learning has been used to solve a wide array of regression and classification tasks. Compared to classical machine learning approaches (k-Nearest Neighbours, Random Forests, …), deep learning algorithms excel at learning complex, non-linear internal representations, in part due to the highly over-parametrised nature of their underlying models; however, this advantage often comes at the cost of interpretability. In this work we used a deep neural network to construct a global map of total organic carbon (TOC) concentration at the seafloor. By applying Softmax distributions to implicitly continuous data (regression tasks), we were able to obtain probability distributions to assess prediction reliability. A variation of Dropout called Monte Carlo Dropout is also used during the inference step, providing a tool to model prediction uncertainties. We used these techniques to create a model information map, which is a key element in developing new data-driven sampling strategies for data acquisition.
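
The two uncertainty tools named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' model: the tiny two-layer network, its random weights, the ten TOC bins (`bin_centres`, in hypothetical wt% units), the dropout rate and the number of stochastic passes are all assumptions chosen for illustration. The sketch discretises a continuous target into bins so that a Softmax output gives a probability distribution per location, and it keeps dropout active at inference (Monte Carlo Dropout) so that the spread across repeated passes can serve as an uncertainty estimate.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mc_dropout_predict(x, w1, b1, w2, b2, rng, n_passes=100, drop_rate=0.5):
    """Run n_passes stochastic forward passes with dropout left switched on.

    x has shape (n_samples, n_features); the result has shape
    (n_passes, n_samples, n_bins), one softmax distribution per pass.
    """
    keep = 1.0 - drop_rate
    passes = []
    for _ in range(n_passes):
        hidden = np.maximum(x @ w1 + b1, 0.0)      # ReLU hidden layer
        mask = rng.random(hidden.shape) < keep     # fresh dropout mask per pass
        hidden = hidden * mask / keep              # inverted-dropout scaling
        passes.append(softmax(hidden @ w2 + b2))   # distribution over TOC bins
    return np.stack(passes)

# Hypothetical setup: random weights stand in for a trained network.
rng = np.random.default_rng(0)
n_features, n_hidden, n_bins = 8, 32, 10
bin_centres = np.linspace(0.25, 4.75, n_bins)      # assumed TOC bins (wt%)
w1 = rng.normal(0.0, 0.5, (n_features, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(0.0, 0.5, (n_hidden, n_bins))
b2 = np.zeros(n_bins)
x = rng.normal(size=(5, n_features))               # five hypothetical locations

passes = mc_dropout_predict(x, w1, b1, w2, b2, rng)

# Average distribution per location, and its expected TOC value.
mean_probs = passes.mean(axis=0)
toc_estimate = mean_probs @ bin_centres

# Two simple reliability measures per location:
# entropy of the averaged softmax, and spread of the per-pass estimates.
entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)
mc_std = (passes @ bin_centres).std(axis=0)
```

Under these assumptions, locations with high `entropy` or high `mc_std` are the ones where the model is least informative, which is the kind of signal a data-driven sampling strategy could use to prioritise new measurements.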

How to cite: González Ávalos, E. and Burwicz, E.: Deep neural networks for total organic carbon prediction and data-driven sampling, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-22587, https://doi.org/10.5194/egusphere-egu2020-22587, 2020

Comments on the display

AC: Author Comment | CC: Community Comment

displays version 1 – uploaded on 04 May 2020
  • CC1: Comment on EGU2020-22587, Jens Greinert, 05 May 2020

    Hola Everardo, this looks like a nice presentation. I am looking forward to seeing you chat about it tomorrow.

  • CC2: Comment on EGU2020-22587, Jens Klump, 06 May 2020

    Hi Everardo, I really enjoyed going through your presentation.

    Have you considered benchmarking your uncertainty quantification against that given by "classical" methods? We did that for random forest, which gave us some unexpected results.

    Fouedjio, F., & Klump, J. (2019). Exploring prediction uncertainty of spatial data in geostatistical and machine learning approaches. Environmental Earth Sciences, 78(1), 38. https://doi.org/10.1007/s12665-018-8032-z

    • AC1: Reply to CC2, Everardo González Ávalos, 07 May 2020

      Hello Jens,

      Random Forests and Monte Carlo Dropout inference do have a lot in common. It is certainly a worthwhile benchmark that we could implement once our model performance is satisfactory. Do you have any experience using RF with a large number of inputs (~500 in our case)?

      • CC3: Reply to AC1, Jens Klump, 07 May 2020

        In the study cited above, we used ~600 locations. We then compared RF against Kriging, mainly because Kriging is well understood in mineral exploration and can be seen as a benchmark.