Simultaneous estimation of the mean and the variance in heteroscedastic Gaussian regression
Abstract
Let $Y$ be a Gaussian vector of $\mathbb{R}^n$ with mean $s$ and diagonal covariance matrix $\Gamma$. Our aim is to estimate both $s$ and the entries $\sigma_i=\Gamma_{i,i}$, for $i=1,\dots,n$, on the basis of the observation of two independent copies of $Y$. Our approach is free of any prior assumption on $s$ but requires that we know some upper bound $\gamma$ on the ratio $\max_i\sigma_i/\min_i\sigma_i$. For example, the choice $\gamma=1$ corresponds to the homoscedastic case, where the components of $Y$ are assumed to have a common (unknown) variance. By contrast, the choice $\gamma>1$ corresponds to the heteroscedastic case, where the variances of the components of $Y$ are allowed to vary within some range. Our estimation strategy is based on model selection. We consider a family $\{S_m\times\Sigma_m,\ m\in\mathcal{M}\}$ of parameter sets where $S_m$ and $\Sigma_m$ are linear spaces. To each $m\in\mathcal{M}$, we associate a pair of estimators $(\hat{s}_m,\hat{\sigma}_m)$ of $(s,\sigma)$ with values in $S_m\times\Sigma_m$. We then design a model selection procedure that selects some $\hat{m}$ among $\mathcal{M}$ in such a way that the Kullback risk of $(\hat{s}_{\hat{m}},\hat{\sigma}_{\hat{m}})$ is as close as possible to the minimum of the Kullback risks among the family of estimators $\{(\hat{s}_m,\hat{\sigma}_m),\ m\in\mathcal{M}\}$. We also derive uniform rates of convergence for the estimator $(\hat{s}_{\hat{m}},\hat{\sigma}_{\hat{m}})$ over Hölderian balls. Finally, we carry out a simulation study to illustrate the performance of our estimators in practice.
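To make the setting concrete, the following Python sketch simulates two independent copies of a heteroscedastic Gaussian vector, builds a small family of estimator pairs, and computes the Kullback loss of each pair against the (simulated, hence known) truth. It illustrates only the benchmark that a selection procedure tries to approach, namely the smallest loss in the family; it does not implement the selection criterion itself. Several ingredients are assumptions made for illustration and are not taken from the text: the models are piecewise-constant vectors on regular partitions, the same partition is used for the mean and for the variance, the mean is fitted on the first copy and the variance on the squared residuals of the second copy, and the Kullback divergence is taken with the true distribution as its first argument. The function names `kullback`, `blockwise_mean` and `fit_pair` are hypothetical.

```python
import numpy as np


def kullback(s, sigma, t, tau):
    """Kullback divergence between N(s, diag(sigma)) and N(t, diag(tau)).

    sigma and tau hold variances (not standard deviations).
    """
    ratio = sigma / tau
    return 0.5 * np.sum((s - t) ** 2 / tau + ratio - 1.0 - np.log(ratio))


def blockwise_mean(v, n_blocks):
    """Project v onto vectors that are constant on n_blocks consecutive blocks."""
    out = np.empty_like(v, dtype=float)
    for idx in np.array_split(np.arange(v.size), n_blocks):
        out[idx] = v[idx].mean()
    return out


def fit_pair(y1, y2, n_blocks):
    """Illustrative estimator pair (an assumption, not the paper's construction):
    mean from the first copy, variance from squared residuals of the second."""
    s_hat = blockwise_mean(y1, n_blocks)
    sigma_hat = blockwise_mean((y2 - s_hat) ** 2, n_blocks)
    return s_hat, sigma_hat


rng = np.random.default_rng(0)
n = 512
x = np.linspace(0.0, 1.0, n)
s = np.sin(2.0 * np.pi * x)   # true mean (arbitrary choice)
sigma = 0.1 + 0.1 * x         # true variances, max/min ratio = 2

# Two independent copies of Y ~ N(s, diag(sigma)).
y1 = s + np.sqrt(sigma) * rng.standard_normal(n)
y2 = s + np.sqrt(sigma) * rng.standard_normal(n)

# Kullback loss of each pair in the family, indexed by the number of blocks.
losses = {}
for d in (1, 2, 4, 8, 16, 32, 64):
    s_hat, sigma_hat = fit_pair(y1, y2, d)
    losses[d] = kullback(s, sigma, s_hat, sigma_hat)

# Oracle model: the one with the smallest loss, computable only because
# the truth is known in a simulation.
oracle = min(losses, key=losses.get)
print(losses, oracle)
```

In this sketch the ratio of the largest to the smallest variance equals 2, so any $\gamma\geq 2$ would be a valid upper bound in the sense of the abstract. The oracle index depends on the unknown $(s,\sigma)$ and is therefore unavailable in practice, which is why a data-driven selection procedure is needed.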