Stability Regularized Cross-Validation

Andrés Gómez

Stability Regularized Cross-Validation

Abstract

We revisit the problem of ensuring strong test set performance via cross-validation, and propose a nested k-fold cross-validation scheme that selects hyperparameters by minimizing a weighted sum of the usual cross-validation metric and an empirical model-stability measure. The weight on the stability term is itself chosen via a nested cross-validation procedure. This reduces the risk of strong validation set performance and poor test set performance due to instability. We benchmark our procedure on a suite of 13 real-world datasets, and find that, compared to k-fold cross-validation over the same hyperparameters, it improves the out-of-sample MSE for sparse ridge regression and CART by 4\% and 2\% respectively on average, but has no impact on XGBoost. It also reduces the user's out-of-sample disappointment, sometimes significantly. For instance, for sparse ridge regression, the nested k-fold cross-validation error is on average 0.9\% lower than the test set error, while the k-fold cross-validation error is 21.8\% lower than the test error. Thus, for unstable models such as sparse regression and CART, our approach improves test set performance and reduces out-of-sample disappointment.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Or compile a full topic from this idea

Discussion (0)

Sign in to join the discussion.

Loading comments…