How can we quantify, explain and apply the uncertainty of complex soil maps predicted with neural networks?
- 1Department of Geoscience, University of Tübingen, Rümelinstraße 19-23, 72074 Tübingen, Baden-Württemberg, Germany
- 2Department of Computer Science, University of Tübingen, Maria-von-Linden-Straße 6, 72076 Tübingen, Baden-Württemberg, Germany
- 3Cluster of Excellence Machine Learning, New Perspectives for Science, University of Tübingen, Maria-von-Linden-Straße 6, 72076 Tübingen, Baden-Württemberg, Germany
Artificial neural networks (ANNs), which are mainly used in pattern and image recognition, have now found a wide range of applications in soil science and geoscience. They have proven useful for complex questions that involve large amounts of data, for example the prediction of soil classes or soil properties at various scales. However, we face two main challenges when applying ANNs. First, in their basic form, deep-learning algorithms do not provide interpretable predictive uncertainty. In the geosciences, and in soil science in particular, they have therefore been used mostly as black-box models, and properties of a machine learning model such as the certainty and plausibility of the predicted variables, for example soil classes, were interpreted by experts rather than quantified by metrics validating the ANN. In most cases, regression coefficients or comparable statistical measures are reported for the overall performance of the model. This leads to the second challenge: these algorithms are highly confident in their predictions in areas far away from the training area, or in areas where they receive little information because only a small number of data points is available.
To better understand these properties, we implement in our exploratory study on soil classification a Bayesian deep learning approach (i.e., a method that adds uncertainty estimates to deep networks) known as last-layer Laplace approximation. This technique can be applied as a post-hoc "add-on" without destroying the otherwise good performance of deep classifiers. It helps us correct the overconfidence in such areas without reducing the accuracy of our prediction, giving us a more realistic expression of the uncertainty in the model's prediction.
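The idea behind a last-layer Laplace approximation can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study, and it makes several simplifying assumptions: a binary problem instead of 41 soil types, a hypothetical `features` function standing in for the fixed penultimate-layer output of a trained network, a Gaussian prior with an assumed precision of 1.0, and the probit approximation as one common way to turn the Gaussian posterior over the last-layer weights into a predictive probability.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def features(x):
    # Hypothetical stand-in for a trained network's penultimate layer:
    # here simply the raw inputs plus a bias column.
    return np.hstack([x, np.ones((len(x), 1))])

def hessian(phi, w, prior_prec):
    """Hessian of the negative log posterior w.r.t. the last-layer weights."""
    p = sigmoid(phi @ w)
    return (phi * (p * (1.0 - p))[:, None]).T @ phi + prior_prec * np.eye(phi.shape[1])

def fit_map(phi, y, prior_prec=1.0, steps=25):
    """MAP estimate of the last-layer weights via Newton iterations."""
    w = np.zeros(phi.shape[1])
    for _ in range(steps):
        grad = phi.T @ (sigmoid(phi @ w) - y) + prior_prec * w
        w = w - np.linalg.solve(hessian(phi, w, prior_prec), grad)
    return w

def predict(phi, w_map, cov=None):
    """Class-1 probability. With cov, apply the probit approximation."""
    mean = phi @ w_map
    if cov is None:
        return sigmoid(mean)  # plain MAP (point-estimate) prediction
    var = np.einsum("ij,jk,ik->i", phi, cov, phi)  # variance of the logit
    return sigmoid(mean / np.sqrt(1.0 + np.pi / 8.0 * var))

# Toy training data: two overlapping classes (assumed, for illustration only).
rng = np.random.default_rng(0)
x = np.vstack([rng.normal([-1.0, 0.0], 1.0, (20, 2)),
               rng.normal([1.0, 0.0], 1.0, (20, 2))])
y = np.repeat([0.0, 1.0], 20)

phi_train = features(x)
w_map = fit_map(phi_train, y)
# Laplace step: Gaussian posterior with covariance = inverse Hessian at the MAP.
cov = np.linalg.inv(hessian(phi_train, w_map, prior_prec=1.0))

# A query point far away from the training data.
phi_far = features(np.array([[6.0, 0.0]]))
p_map_far = predict(phi_far, w_map)[0]
p_la_far = predict(phi_far, w_map, cov)[0]

p_map_train = predict(phi_train, w_map)
p_la_train = predict(phi_train, w_map, cov)

print("far point, MAP confidence:    ", p_map_far)
print("far point, Laplace confidence:", p_la_far)
```

Because the probit correction only rescales the logit by a positive factor, the predicted class of every point stays the same while the confidence shrinks where the logit variance is large. In this sketch, that is why the technique can be added post hoc without changing accuracy.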
Our target variable, soil type, provides us with a large amount of complex information about soil processes and properties, which is a great advantage since it would take a lot of time and money to collect all this information individually. At the same time, soil maps are in high demand by authorities, construction companies and farmers. In our study area around Tübingen in southern Germany, there are 41 different soil types, determined according to the German soil classification. They can be subdivided into typical soils of the Neckar and Ammer valleys, of the Swabian Jura and the Black Forest, and soils that are not tied to a specific area. In addition to the underlying soil map, remotely sensed variables and a digital elevation model with its derivatives are used as input to the ANN, which is designed to learn the relationship between these inputs and the soil type. As a test case, we then explicitly exclude the Swabian Jura and the Black Forest from the training area but include them as prediction regions. The soil types of both regions differ strongly from those in the rest of the study area because of their considerably different geology, climate, and terrain. Our goal is then to enrich soil type maps with a structured uncertainty to better understand the causality of machine learning models in soil science and their transferability to regions other than the training and validation area.
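The hold-out design described above can be sketched as follows. This is an illustrative assumption rather than the workflow of the study: the region labels per map cell, the three-class probability vectors (the study has 41 soil types) and the use of predictive entropy as the uncertainty summary are all made up for the example.

```python
import numpy as np

def predictive_entropy(probs):
    """Shannon entropy (in nats) of each row of class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)

# Hypothetical region label for each map cell.
regions = np.array(["Neckar valley", "Ammer valley", "Swabian Jura",
                    "Black Forest", "Neckar valley", "Swabian Jura"])

# Regions excluded from training but kept for prediction.
held_out = ["Swabian Jura", "Black Forest"]
train_mask = ~np.isin(regions, held_out)

# Made-up predicted class probabilities for each cell (3 classes for brevity).
probs = np.array([
    [0.90, 0.05, 0.05],
    [0.80, 0.15, 0.05],
    [0.40, 0.30, 0.30],
    [1 / 3, 1 / 3, 1 / 3],
    [1.00, 0.00, 0.00],
    [0.50, 0.25, 0.25],
])

entropy = predictive_entropy(probs)

# Average uncertainty per region, e.g. to compare training and held-out areas.
mean_entropy = {r: float(entropy[regions == r].mean()) for r in np.unique(regions)}
for region, value in mean_entropy.items():
    print(f"{region}: mean predictive entropy = {value:.3f}")
```

Summarizing uncertainty per region in this way is one simple option for checking whether a model reports higher uncertainty in areas it has not seen during training.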
How to cite: Rau, K., Gläßle, T., Hennig, P., and Scholten, T.: How can we quantify, explain and apply the uncertainty of complex soil maps predicted with neural networks?, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-689, https://doi.org/10.5194/egusphere-egu23-689, 2023.