Optimal compression of approximate inner products and dimension reduction
Abstract
Let $X$ be a set of $n$ points of norm at most 1 in the Euclidean space $\mathbb{R}^k$, and suppose $\varepsilon > 0$. An $\varepsilon$-distance sketch for $X$ is a data structure that, given any two points of $X$, enables one to recover the square of the (Euclidean) distance between them up to an additive error of $\varepsilon$. Let $f(n,k,\varepsilon)$ denote the minimum possible number of bits of such a sketch. Here we determine $f(n,k,\varepsilon)$ up to a constant factor for all $n \geq k \geq 1$ and all $\varepsilon \geq \frac{1}{n^{0.49}}$. Our proof is algorithmic, and provides an efficient algorithm for computing a sketch of size $O(f(n,k,\varepsilon)/n)$ for each point, so that the square of the distance between any two points can be computed from their sketches up to an additive error of $\varepsilon$ in time linear in the length of the sketches. We also discuss the case of smaller $\varepsilon > 2/\sqrt{n}$ and obtain some new results about dimension reduction in this range. In particular, we show that for any such $\varepsilon$ and any $k \leq t = \frac{\log(2+\varepsilon^2 n)}{\varepsilon^2}$ there are configurations of $n$ points in $\mathbb{R}^k$ that cannot be embedded in $\mathbb{R}^{\ell}$ for $\ell < ck$, with $c$ a small absolute positive constant, without distorting some inner products (and distances) by more than $\varepsilon$. On the positive side, we provide a randomized polynomial time algorithm for a bipartite variant of the Johnson-Lindenstrauss lemma in which scalar products are approximated up to an additive error of at most $\varepsilon$. This variant allows a reduction of the dimension down to $O\left(\frac{\log(2+\varepsilon^2 n)}{\varepsilon^2}\right)$, where $n$ is the number of points.
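To make the additive-error guarantee concrete, here is a minimal Python sketch that projects points of norm at most 1 with a plain Gaussian random matrix and measures the largest error over all pairwise inner products. It is an illustration under stated assumptions, not the algorithm of the paper: the Gaussian projection is a standard stand-in for the bipartite variant, and the constant `c`, the helper names, and the parameters `n`, `k`, `eps` are hypothetical choices made only for this example.

```python
import math
import random


def target_dim(n, eps, c=64.0):
    """Return ceil(c * log(2 + eps^2 * n) / eps^2).

    The shape of the formula follows the abstract; the constant c is a
    hypothetical, deliberately generous choice for this toy example.
    """
    return math.ceil(c * math.log(2.0 + eps * eps * n) / (eps * eps))


def random_point(k, rng):
    """Sample a point of norm at most 1 in R^k (random direction, random radius)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(k)]
    norm = math.sqrt(sum(a * a for a in v))
    scale = rng.random() / norm
    return [a * scale for a in v]


def dot(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project(points, m, rng):
    """Map each point to R^m using an m x k matrix with i.i.d. N(0, 1/m) entries."""
    k = len(points[0])
    s = 1.0 / math.sqrt(m)
    rows = [[rng.gauss(0.0, 1.0) * s for _ in range(k)] for _ in range(m)]
    return [[dot(row, p) for row in rows] for p in points]


def max_inner_product_error(points, images):
    """Largest additive error over all pairs (i, j), including i == j."""
    n = len(points)
    return max(
        abs(dot(points[i], points[j]) - dot(images[i], images[j]))
        for i in range(n)
        for j in range(i, n)
    )


rng = random.Random(0)          # fixed seed so the run is reproducible
n, k, eps = 20, 600, 0.5        # eps > 2 / sqrt(n), as in the range discussed
points = [random_point(k, rng) for _ in range(n)]
m = target_dim(n, eps)          # reduced dimension, smaller than k here
images = project(points, m, rng)
err = max_inner_product_error(points, images)
print(f"reduced dimension m = {m}, max inner-product error = {err:.4f}")
```

Squared distances can be recovered from the same quantities via $\|x-y\|^2 = \|x\|^2 + \|y\|^2 - 2\langle x, y\rangle$, so an additive bound on inner products (including each point with itself) yields an additive bound on squared distances, up to a constant factor.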