Ida Marie Arcinas Alcantara

Western Washington University
Online - Zoom

Abstract

Statistical Properties and Applications of the PRESS Statistic

The most widely used statistic, \(R^2\), has a fundamental weakness in model building: it favors adding more predictors to the model because \(R^2\) can never decrease when a predictor is added. In effect, the additional predictors start fitting the noise in the data. Other criteria for selecting a regression model, such as \(R^2_{adj}\), AIC, SBC, and Mallows' \(C_p\), do not guarantee that the selected model will also make better predictions of future values. To guard against this, data scientists withhold a percentage of the data for validation purposes. The PRESS statistic does something similar by withholding each observation when calculating its own predicted value. We investigated the properties of the PRESS statistic and explored how it performs compared to other criteria in model selection. We also derived estimators of the parameters of interest in linear regression that are based on PRESS, while maintaining desirable statistical properties of estimators such as unbiasedness. A diagnostic statistic that looks at the impact of deleting one observation from the estimation of MSE is also presented.
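
As an illustration of the leave-one-out idea behind PRESS, the following is a minimal Python sketch and not the authors' implementation. It assumes ordinary least squares with a full-column-rank design matrix, and it relies on the standard identity that the deleted residual equals \(e_i / (1 - h_{ii})\), where \(e_i\) is the ordinary residual and \(h_{ii}\) is the leverage of observation \(i\). The function names `press_loo` and `press_hat` and the simulated data are hypothetical, chosen only for this example.

```python
import numpy as np


def press_loo(X, y):
    """PRESS computed by refitting the model n times, leaving one row out each time."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i  # boolean mask that drops observation i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += (y[i] - X[i] @ beta) ** 2  # squared deleted residual
    return total


def press_hat(X, y):
    """PRESS from a single fit, using deleted residual = e_i / (1 - h_ii)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverages h_ii are the diagonal of the hat matrix; with X = QR they
    # equal the row sums of Q**2 (assumes X has full column rank).
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q ** 2, axis=1)
    return np.sum((resid / (1.0 - h)) ** 2)


# Simulated data (illustrative only): intercept plus two predictors,
# where only the first predictor actually affects the response.
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=(n, 2))
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x[:, 0] + rng.normal(size=n)

print(press_loo(X, y), press_hat(X, y))
```

Both functions should return the same value up to floating-point error; the second avoids refitting the model \(n\) times, which matters when many candidate models are compared.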