The restricted consistency property of leave-nv-out cross-validation for high-dimensional variable selection

Abstract

Cross-validation (CV) methods are popular for selecting the tuning parameter in high-dimensional variable selection problems. We show that misalignment in CV is one possible reason for its over-selection behavior. To fix this issue, we propose a version of leave-nv-out cross-validation (CV(nv)) for selecting the optimal model from a restricted candidate model set for high-dimensional generalized linear models. By using the same candidate model sequence and a proper order of the construction sample size nc in each CV split, CV(nv) avoids potential hurdles in developing theoretical properties. CV(nv) is shown to enjoy the restricted model selection consistency property under mild conditions. Extensive simulations and real data analysis support the theoretical results and demonstrate the performance of CV(nv) in terms of both model selection and prediction.
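As a rough illustration of the idea, the following Python sketch runs leave-nv-out CV over a fixed candidate model sequence that is shared by all splits. It is a minimal sketch under stated assumptions, not the paper's procedure: it uses a Gaussian linear model (one simple instance of a generalized linear model), random splits, and squared-error validation loss. The function name `leave_nv_out_cv`, the toy data, and the choice nc = 40 are hypothetical; the abstract does not state the rule for choosing the order of nc.

```python
import numpy as np

def leave_nv_out_cv(X, y, candidate_models, n_c, n_splits=50, seed=0):
    """Pick one model from a fixed candidate sequence by leave-n_v-out CV.

    Hypothetical helper: each split fits on n_c construction points and
    validates on the remaining n_v = n - n_c points. The same candidate
    sequence is used in every split.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    if not 0 < n_c < n:
        raise ValueError("n_c must satisfy 0 < n_c < n")
    errors = np.zeros(len(candidate_models))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        construct, validate = perm[:n_c], perm[n_c:]
        for k, cols in enumerate(candidate_models):
            cols = np.asarray(cols, dtype=int)
            # Design matrices with an intercept column (assumption).
            A_c = np.column_stack([np.ones(n_c), X[construct][:, cols]])
            A_v = np.column_stack([np.ones(n - n_c), X[validate][:, cols]])
            # Least-squares fit on the construction sample only.
            beta, *_ = np.linalg.lstsq(A_c, y[construct], rcond=None)
            errors[k] += np.mean((y[validate] - A_v @ beta) ** 2)
    errors /= n_splits
    return int(np.argmin(errors)), errors

# Toy data (assumption): only columns 0 and 1 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.5 * rng.normal(size=200)

# A fixed, nested candidate sequence, e.g. taken once from a regularization path.
candidates = [[0], [0, 1], [0, 1, 2], [0, 1, 2, 3]]
best, errors = leave_nv_out_cv(X, y, candidates, n_c=40, n_splits=20)
print(candidates[best], errors)
```

Here nc is deliberately small relative to n, so most observations are used for validation. Because the candidate sequence is computed once and reused, every split compares the same models, which is the alignment the abstract refers to.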
