ON VALIDATING REGRESSION MODELS WITH BOOTSTRAPS AND DATA SPLITTING TECHNIQUES
Model validity refers to the stability and reasonableness of the regression coefficients, the plausibility and usability of the regression function, and the ability to generalize inferences drawn from the regression analysis. Model validation is an important step in the modeling process and helps assess the reliability of models before they are used in decision making. This research therefore studies the regression model validation process using bootstrap and data splitting techniques. We review regression model validation by comparing the predictive accuracy indices of data splitting techniques and residual resampling bootstraps. Various validation statistics, such as the mean square error (MSE), Mallows' Cp, and R², were used as criteria for selecting the best model and the best selection procedure for each data set. The study shows that the bootstrap provides the most precise estimate of R², which reduces the risk of overfitted models compared with data splitting techniques.
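
For illustration only, the comparison between a residual resampling bootstrap and a data split might be sketched as follows. This is a minimal sketch on simulated data, not the procedure or the data sets used in the study. The function names (fit_ols, r_squared, split_r2, residual_bootstrap_r2), the 50/50 split, the number of bootstrap replicates, the optimism-corrected form of the bootstrap R², and the simulated model y = 1 + 2x + noise are all assumptions introduced here.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients; X is assumed to already hold an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def r_squared(X, y, beta):
    """R^2 of the predictions X @ beta against the responses y."""
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def split_r2(X, y, rng, train_frac=0.5):
    """Data splitting: fit on a random training part, score R^2 on the held-out part."""
    idx = rng.permutation(len(y))
    n_train = int(train_frac * len(y))
    train, test = idx[:n_train], idx[n_train:]
    beta = fit_ols(X[train], y[train])
    return r_squared(X[test], y[test], beta)


def residual_bootstrap_r2(X, y, rng, n_boot=200):
    """Residual resampling bootstrap with an optimism correction (assumed variant)."""
    beta_hat = fit_ols(X, y)
    fitted = X @ beta_hat
    resid = y - fitted
    resid = resid - resid.mean()  # centre the residuals before resampling
    apparent = r_squared(X, y, beta_hat)
    optimism = np.empty(n_boot)
    for b in range(n_boot):
        # Keep the design fixed and rebuild responses from resampled residuals.
        y_star = fitted + rng.choice(resid, size=len(y), replace=True)
        beta_star = fit_ols(X, y_star)
        # Optimism: fit on the bootstrap responses minus fit on the original ones.
        optimism[b] = r_squared(X, y_star, beta_star) - r_squared(X, y, beta_star)
    return apparent - optimism.mean()


# Simulated example data (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)
X = np.column_stack([np.ones_like(x), x])

boot_r2 = residual_bootstrap_r2(X, y, rng)
holdout_r2 = split_r2(X, y, rng)
print(f"bootstrap-corrected R^2: {boot_r2:.3f}")
print(f"data-splitting R^2:      {holdout_r2:.3f}")
```

Repeating both estimators over many random seeds and comparing the spread of the resulting R² values is one way to examine their relative precision. The sketch itself makes no claim about which estimator is more precise.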