Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time

Abstract

Robust covariance estimation is the following well-studied problem in high-dimensional statistics: given $N$ samples from a $d$-dimensional Gaussian $\mathcal{N}(0, \Sigma)$, but where an $\varepsilon$-fraction of the samples have been arbitrarily corrupted, output $\widehat{\Sigma}$ minimizing the total variation distance between $\mathcal{N}(0, \Sigma)$ and $\mathcal{N}(0, \widehat{\Sigma})$. This corresponds to learning $\Sigma$ in a natural affine-invariant variant of the Frobenius norm known as the Mahalanobis norm. Previous work of Cheng et al. demonstrated an algorithm that, given $N = \Omega(d^2 / \varepsilon^2)$ samples, achieved a near-optimal error of $O(\varepsilon \log 1/\varepsilon)$; moreover, their algorithm ran in time $\widetilde{O}(T(N, d) \log \kappa / \mathrm{poly}(\varepsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose, and $\kappa$ is the condition number of $\Sigma$. When $\varepsilon$ is relatively small, their polynomial dependence on $1/\varepsilon$ in the runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees, but which runs in time $\widetilde{O}(T(N, d) \log \kappa)$. In particular, our runtime has no dependence on $\varepsilon$. When $\Sigma$ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."
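
For reference, the Mahalanobis norm mentioned above is usually written as follows. The abstract does not spell out the definition, so the notation here (the subscript $\Sigma$ on the norm, the invertible matrix $A$) is an assumption based on the standard convention, not a quotation from the paper.

```latex
% Mahalanobis error of an estimate \widehat{\Sigma} relative to the true
% covariance \Sigma (assumed positive definite):
\[
  \|\widehat{\Sigma} - \Sigma\|_{\Sigma}
  \;=\;
  \bigl\| \Sigma^{-1/2}\, \widehat{\Sigma}\, \Sigma^{-1/2} - I \bigr\|_F .
\]
% Affine invariance: for any invertible d x d matrix A, replacing
% \Sigma by A \Sigma A^\top and \widehat{\Sigma} by A \widehat{\Sigma} A^\top
% leaves the error unchanged, because
\[
  \bigl\| \Sigma^{-1/2}\, \widehat{\Sigma}\, \Sigma^{-1/2} - I \bigr\|_F^2
  \;=\;
  \operatorname{tr}\!\bigl( (\Sigma^{-1} \widehat{\Sigma} - I)^2 \bigr),
\]
% and (A \Sigma A^\top)^{-1} A \widehat{\Sigma} A^\top
%   = A^{-\top} (\Sigma^{-1} \widehat{\Sigma}) A^{\top}
% is similar to \Sigma^{-1} \widehat{\Sigma}, so the trace is preserved.
```

Because the error is measured after whitening by $\Sigma$, it does not depend on the scale or conditioning of $\Sigma$ itself, which is why it is the natural counterpart of total variation distance between the two Gaussians.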
