Estimator selection in the Gaussian setting
Abstract
We consider the problem of estimating the mean f of a Gaussian vector Y with independent components of common unknown variance σ². Our estimation procedure is based on estimator selection. More precisely, we start with an arbitrary and possibly infinite collection of estimators of f based on Y and, with the same data Y, aim at selecting from this collection an estimator with the smallest Euclidean risk. No assumptions are made on the estimators, and their dependence on Y may be unknown. We establish a non-asymptotic risk bound for the selected estimator. As particular cases, our approach makes it possible to handle the problems of aggregation and model selection, as well as those of choosing a window and a kernel for estimating a regression function or tuning the parameter involved in a penalized criterion. We also derive oracle-type inequalities when the collection consists of linear estimators. For illustration, we carry out two simulation studies. One compares our procedure to cross-validation for choosing a tuning parameter. The other shows how to implement our approach to solve the problem of variable selection in practice.