Convex programming approach to robust estimation of a multivariate Gaussian model

Abstract

The multivariate Gaussian distribution is often used as a first approximation to the distribution of high-dimensional data. Determining the parameters of this distribution under various constraints is a widely studied problem in statistics, and is often treated as a prototype for testing new algorithms or theoretical frameworks. In this paper, we develop a nonasymptotic approach to the problem of estimating the parameters of a multivariate Gaussian distribution when the data are corrupted by outliers. We propose an estimator, efficiently computable by solving a convex program, that robustly estimates the population mean and the population covariance matrix even when the sample contains a significant proportion of outliers. Our estimator of the corruption matrix is provably rate optimal simultaneously for the entry-wise ℓ1-norm, the Frobenius norm and the mixed ℓ2/ℓ1 norm. Furthermore, this optimality is achieved by a penalized square-root-of-least-squares method with a universal tuning parameter (calibrating the strength of the penalization). These results are partly extended to the case where the dimension p is potentially larger than the sample size n, under the additional condition that the inverse covariance matrix is sparse.
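As a small illustration of the three norms named in the abstract, the sketch below evaluates them on a toy matrix in which a single row is nonzero. It assumes the mixed 2/1 norm means the sum of the Euclidean norms of the rows, and that a nonzero row stands for one corrupted observation; both conventions, along with the function names and the toy matrix, are assumptions made for this example and are not taken from the paper.

```python
import math

def entrywise_l1(M):
    """Entry-wise 1-norm: sum of absolute values of all entries."""
    return sum(abs(x) for row in M for x in row)

def frobenius(M):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in M for x in row))

def mixed_l2_l1(M):
    """Mixed 2/1 norm (assumed convention): sum of row-wise Euclidean norms."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in M)

# Hypothetical 3 x 2 corruption matrix: only the second row is nonzero,
# i.e. only one observation is treated as an outlier.
theta = [
    [0.0, 0.0],
    [3.0, 4.0],
    [0.0, 0.0],
]

print(entrywise_l1(theta), frobenius(theta), mixed_l2_l1(theta))
```

With a single nonzero row, the Frobenius norm and the mixed 2/1 norm coincide, while the entry-wise 1-norm is larger; the three norms separate further as more rows become nonzero.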
