Focused Relative Risk Information Criterion for Variable Selection in Linear Regression
Abstract
This paper motivates and develops a novel and focused approach to variable selection in linear regression models. For estimating the regression mean $\mu = \mathrm{E}(Y \mid x_0)$, for the covariate vector of a given individual, there is a list of competing estimators, say $\hat\mu_S$ for each submodel $S$. Exact expressions are found for the relative mean squared error risks, when compared to the widest model available, say $\mathrm{mse}_S/\mathrm{mse}_{\mathrm{wide}}$. The theory of confidence distributions is used for accurate assessments of these relative risks. This leads to certain Focused Relative Risk Information Criterion scores, and associated FRIC plots and FRIC tables, as well as to Confidence plots to exhibit the confidence the data give in the submodels. The machinery is extended to handle many focus parameters at the same time, with appropriate averaged FRIC scores. The particular case where all available covariate vectors have equal importance yields a new overall criterion for variable selection, balancing complexity and fit in a natural fashion. A connection to the Mallows criterion is demonstrated, leading also to natural modifications of the latter. The FRIC and AFRIC strategies are illustrated for real data.