Robust Mean Estimation on Highly Incomplete Data with Arbitrary Outliers

Omer Reingold

Abstract

We study the problem of robustly estimating the mean of a d-dimensional distribution given N examples, where most coordinates of every example may be missing and εN examples may be arbitrarily corrupted. Assuming each coordinate appears in a constant factor more than εN examples, we show algorithms that estimate the mean of the distribution with information-theoretically optimal, dimension-independent error guarantees in nearly linear time Õ(Nd). Our results extend recent work on computationally efficient robust estimation to a more widely applicable incomplete-data setting.
