SC36 Machine Learning Algorithms for soil science data: R tutorial
Convener: Tomislav Hengl
Thu, 27 Apr, 19:00–20:00
There is increasing interest in using Machine Learning Algorithms (MLA) to analyse soil science data, e.g. to generate spatial predictions, to fit pedo-transfer functions and for soil data mining. MLA such as random forests, neural networks, Support Vector Machines and Deep Learning have already shown predictive potential for soil mapping (see e.g. Moran and Bui (2002), Henderson et al. (2005) and Ahmad et al. (2010)). MLA can also be used to fit pedo-transfer functions, e.g. to predict bulk density from a given combination of texture, soil organic carbon, diagnostic horizons etc., and to translate soil classes from one classification system to another. During the course we will use Open Source R packages such as ranger (A Fast Implementation of Random Forests), xgboost (Extreme Gradient Boosting), h2o (open-source software for big-data analysis) and randomForestSRC (Random Forests for Survival, Regression and Classification). Special focus will be put on running MLA analyses on big data, i.e. on setting up a fully parallelized system and on carefully planning data processing so that results can be generated within a realistic timeframe. The lecturers are experienced R developers and frequent users of Machine Learning Algorithms for processing soil and environmental data.