Near-Optimal (Euclidean) Metric Compression
Abstract
The metric sketching problem is defined as follows. Given a metric on $n$ points, and $\epsilon > 0$, we wish to produce a small size data structure (sketch) that, given any pair of point indices, recovers the distance between the points up to a $1+\epsilon$ distortion. In this paper we consider metrics induced by $\ell_2$ and $\ell_1$ norms whose spread (the ratio of the diameter to the closest pair distance) is bounded by $\Phi > 0$. A well-known dimensionality reduction theorem due to Johnson and Lindenstrauss yields a sketch of size $O(\epsilon^{-2} \log(\Phi n) \, n \log n)$, i.e., $O(\epsilon^{-2} \log(\Phi n) \log n)$ bits per point. We show that this bound is not optimal, and can be substantially improved to $O(\epsilon^{-2} \log(1/\epsilon) \cdot \log n + \log\log \Phi)$ bits per point. Furthermore, we show that our bound is tight up to a factor of $\log(1/\epsilon)$. We also consider sketching of general metrics and provide a sketch of size $O(n \log(1/\epsilon) + \log\log \Phi)$ bits per point, which we show is optimal.
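To make the Johnson-Lindenstrauss baseline mentioned in the abstract concrete, here is a minimal Python sketch of a random-projection distance sketch for the Euclidean case. It illustrates only the baseline, not the paper's improved construction. The function names `jl_sketch` and `estimate_distance`, the target dimension `k`, and the `seed` parameter are illustrative assumptions; the rounding of coordinates to a fixed number of bits, which the bit-count in the abstract accounts for, is omitted here.

```python
import math
import random


def jl_sketch(points, k, seed=0):
    """Project each point onto k random Gaussian directions.

    Assumption: k is chosen by the caller, on the order of
    epsilon^-2 * log(n) for n points; no constant is fixed here.
    """
    rng = random.Random(seed)
    d = len(points[0])
    # k x d matrix with i.i.d. N(0, 1) entries.
    proj = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    # Scaling by 1/sqrt(k) preserves squared norms in expectation.
    scale = 1.0 / math.sqrt(k)
    return [
        [scale * sum(row[j] * p[j] for j in range(d)) for row in proj]
        for p in points
    ]


def estimate_distance(sketch, i, j):
    """Estimate the original distance between points i and j from the sketch."""
    return math.dist(sketch[i], sketch[j])
```

A caller would build the sketch once with `jl_sketch(points, k)` and then answer any pairwise query with `estimate_distance(sketch, i, j)`, without access to the original coordinates. Larger `k` gives estimates closer to the true distances at the cost of a larger sketch.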