SC5.2 | Improving statistical evaluations in the geosciences
Co-organized by AS6/ESSI3/GM12/NH12/SSP5
Convener: Bernard Ludwig | Co-conveners: Isabel Greenberg (ECS), Anna Gunina (ECS)
Thu, 27 Apr, 10:45–12:30 (CEST)
Room -2.61/62
Almost all scientific studies rely to some extent on correct statistical analyses. Statistical software packages offer scientists many powerful tools (e.g., in data mining and exploratory statistics), but they also hide many pitfalls that may lead to erroneous or non-reproducible manuscripts. This problem has been known for a long time and has been addressed explicitly in some research fields other than the geosciences. This short course aims to address potential problems in geoscientific studies and to reduce the number of non-reproducible studies.

A. Fundamental issues in design of experiments and statistical analyses
The following fundamental issues will be addressed:
• Time spent on experimental design. Advantages and disadvantages of selected experimental designs. Missing randomization. Observational studies vs. controlled experiments
• Pseudo-replication vs. true replication and how to deal with it. Wrong model formulations
• “Obsession” with p values: Statistical significance and geoscientific relevance
• Statistical tests: conditions for the application of modelling and hypothesis testing
• Dealing with suspected outliers
• Logistic vs. linear regression
• Number of experimental treatments vs. power of tests. Number of replicates required for predictive modelling
• Use and misuse of correlation analyses
• Investigating and dealing with interactions between factors or predictors

B. Selected advanced issues in geoscientific studies
The following topics will be addressed:
• Validation or cross-validation instead of a sole focus on calibration
• Model types
• Use of contrasts instead of multiple mean testing
• Different experimental designs – completely randomized (CRD), randomized complete block (RCBD), Latin square (LSD), balanced incomplete block (BIBD), and split plot design
• RCBD with one treatment factor: analysis of variance and mixed effects model
• Blocked observational study with one predictor: multiple linear regression and mixed effects model
• CRD, RCBD, LSD, split plot design and BIBD: advantages, disadvantages, equations and modelling
• Analysing nested (multi-stratum) designs

Examples will be shown using the programming languages R and SAS.