Out-of-sample error estimate for robust M-estimators with convex penalty
Abstract
A generic out-of-sample error estimate is proposed for robust $M$-estimators regularized with a convex penalty in high-dimensional linear regression where $(X,y)$ is observed and $p,n$ are of the same order. If $\psi$ is the derivative of the robust data-fitting loss $\rho$, the estimate depends on the observed data only through the quantities $\hat\psi = \psi(y-X\hat\beta)$, $X^\top\hat\psi$ and the derivatives $(\partial/\partial y)\hat\psi$ and $(\partial/\partial y)X\hat\beta$ for fixed $X$. The out-of-sample error estimate enjoys a relative error of order $n^{-1/2}$ in a linear model with Gaussian covariates and independent noise, either non-asymptotically when $p/n\le\gamma$ or asymptotically in the high-dimensional asymptotic regime $p/n\to\gamma'\in(0,\infty)$. General differentiable loss functions $\rho$ are allowed provided that $\psi=\rho'$ is 1-Lipschitz. The validity of the out-of-sample error estimate holds either under a strong convexity assumption, or for the $\ell_1$-penalized Huber $M$-estimator if the number of corrupted observations and sparsity of the true $\beta$ are bounded from above by $s_* n$ for some small enough constant $s_*\in(0,1)$ independent of $n,p$. For the square loss and in the absence of corruption in the response, the results additionally yield $n^{-1/2}$-consistent estimates of the noise variance and of the generalization error. This generalizes, to arbitrary convex penalty, estimates that were previously known for the Lasso.
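As a concrete illustration of two of the observable quantities named in the abstract, the following is a minimal sketch, assuming the Huber loss with threshold `delta`, whose derivative is the clipping function and is therefore 1-Lipschitz. The function names, the simulated data, and the all-zero `beta_hat` are hypothetical stand-ins chosen for illustration; the sketch does not fit a penalized M-estimator and does not compute the derivatives with respect to $y$ or the error estimate itself.

```python
import numpy as np


def huber_psi(u, delta=1.0):
    """Derivative of the Huber loss: identity on [-delta, delta], clipped outside."""
    return np.clip(u, -delta, delta)


def residual_quantities(X, y, beta_hat, delta=1.0):
    """Return psi_hat = psi(y - X beta_hat) and X^T psi_hat."""
    residuals = y - X @ beta_hat
    psi_hat = huber_psi(residuals, delta)
    return psi_hat, X.T @ psi_hat


# Simulated data (assumption): Gaussian covariates, sparse signal, a few corrupted responses.
rng = np.random.default_rng(0)
n, p = 200, 100
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.0
y = X @ beta + rng.standard_normal(n)
y[:10] += 50.0  # corrupt the first ten observations

# Placeholder estimate (assumption): in practice beta_hat would come from a penalized M-estimator.
beta_hat = np.zeros(p)

psi_hat, score = residual_quantities(X, y, beta_hat)
```

Because the Huber derivative is bounded by `delta`, the corrupted responses contribute at most `delta` in absolute value to each entry of `psi_hat`, which is what makes the loss robust.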