Metric embedding with outliers
Abstract
We initiate the study of metric embeddings with outliers. Given some metric space (X, ρ) we wish to find a small set of outlier points K ⊂ X and either an isometric or a low-distortion embedding of (X ∖ K, ρ) into some target metric space. This is a natural problem that captures scenarios where a small fraction of points in the input corresponds to noise. For the case of isometric embeddings we derive polynomial-time approximation algorithms for minimizing the number of outliers when the target space is an ultrametric, a tree metric, or constant-dimensional Euclidean space. The approximation factors are 3, 4 and 2, respectively. For the case of embedding into an ultrametric or tree metric, we further improve the running time to O(n²) for an n-point input metric space, which is optimal. We complement these upper bounds by showing that outlier embedding into ultrametrics, trees, and d-dimensional Euclidean space for any d ≥ 2 is NP-hard in each case, and also NP-hard to approximate within a factor better than 2 assuming the Unique Games Conjecture. For the case of non-isometries we consider embeddings with small ℓ∞ distortion. We present polynomial-time bi-criteria approximation algorithms. Specifically, given some ε > 0, let k_ε denote the minimum number of outliers required to obtain an embedding with distortion ε. For the case of embedding into ultrametrics we obtain a polynomial-time algorithm which computes a set of at most 3k_ε outliers and an embedding of the remaining points into an ultrametric with distortion O(ε log n). For embedding a metric of unit diameter into constant-dimensional Euclidean space we present a polynomial-time algorithm which computes a set of at most 2k_ε outliers and an embedding of the remaining points with distortion O(ε).
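To make the isometric ultrametric case concrete, here is a minimal sketch of one standard way to obtain a factor of 3. It is not necessarily the algorithm of the paper, and it runs in O(n³) time rather than the O(n²) stated above. It relies on the well-known three-point condition: a metric is an ultrametric exactly when, in every triple of points, the two largest pairwise distances are equal. Any valid outlier set must contain at least one point of every violating triple, so deleting all three points of each violating triple found among the surviving points removes at most 3 times the optimum. The function names, the distance-matrix input format, and the tolerance `tol` are assumptions made for illustration.

```python
from itertools import combinations


def violates_ultrametric(d, x, y, z, tol=1e-12):
    """Return True if the triple (x, y, z) breaks the three-point condition.

    In an ultrametric the two largest of the three pairwise distances
    must be equal (up to the hypothetical tolerance `tol`).
    """
    _, second, largest = sorted((d[x][y], d[x][z], d[y][z]))
    return largest - second > tol


def ultrametric_outliers(d):
    """Greedy outlier set for a symmetric distance matrix `d` (list of lists).

    Scans all triples once; whenever a triple of surviving points violates
    the three-point condition, all three points are marked as outliers.
    The deleted triples are pairwise disjoint, and every feasible outlier
    set must hit each of them, which gives the factor-3 guarantee.
    """
    alive = set(range(len(d)))
    outliers = set()
    for x, y, z in combinations(range(len(d)), 3):
        if {x, y, z} <= alive and violates_ultrametric(d, x, y, z):
            alive -= {x, y, z}
            outliers |= {x, y, z}
    return outliers
```

Because the surviving set only shrinks during the scan, every triple of points that survives to the end was checked while all three were present and found consistent, so the remaining points form an ultrametric.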