Robust statistical methods and the credibility movement of psychological science.
Martina Sladekova, Andy P Field
Abstract
The general linear model (GLM) is the most frequently applied family of statistical models in psychology. Within the GLM, the effects under study are estimated using ordinary least squares (OLS) estimation. In certain situations, OLS produces parameter estimates that are unbiased and optimal (with least possible error) and hypothesis tests that retain the expected rate of false positives (Type I errors). This happens when (1) outliers and influential cases are absent, and (2) assumptions of linearity and additivity, spherical errors, and normal errors are met. This paper first provides a technical description of OLS and an overview of its statistical assumptions. We then discuss the methods commonly employed to detect and address violations of assumptions, and how the current application of these methods can compromise the reproducibility of findings by allowing too many data-driven decisions to be made as part of the data analytic pipeline. We briefly introduce several robust estimation methods (namely bootstrapping, heteroscedasticity-consistent standard errors, M-estimators, and trimming) that can improve the accuracy of parameter estimates and the power of statistical tests. We provide guidance on how these methods can be used to transparently preregister a sensitivity analysis, reducing the opportunity for problematic researcher degrees of freedom to enter the analytic pipeline.
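
To make two of the robust methods named above concrete, the following is a minimal sketch in Python rather than the authors' own implementation. It fits a one-predictor OLS model, computes a percentile bootstrap confidence interval for the slope by resampling cases, and computes a trimmed mean. The function names (`ols_fit`, `bootstrap_slope_ci`, `trimmed_mean`), the defaults (2000 resamples, 20% trimming, a fixed seed), and the toy data are all assumptions made for illustration and do not come from the paper.

```python
import random
import statistics


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for the slope, resampling cases with replacement."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # degenerate resample: slope is undefined
            continue
        ys = [y[i] for i in idx]
        slopes.append(ols_fit(xs, ys)[1])
    slopes.sort()
    lower = slopes[int((alpha / 2) * n_boot)]
    upper = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the given proportion of cases from each tail."""
    ordered = sorted(values)
    k = int(proportion * len(ordered))
    return statistics.fmean(ordered[k:len(ordered) - k])


if __name__ == "__main__":
    # Hypothetical toy data: a roughly linear trend with one extreme final case.
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 60.0]
    print("OLS (intercept, slope):", ols_fit(x, y))
    print("Bootstrap 95% CI for slope:", bootstrap_slope_ci(x, y))
    print("Mean of y:", statistics.fmean(y))
    print("20% trimmed mean of y:", trimmed_mean(y))
```

Fixing the number of resamples, the trimming proportion, and the seed in advance is the kind of decision that could be written into a preregistered sensitivity analysis, so that the robust estimates are reported alongside OLS rather than chosen after inspecting the data.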