Bulk Johnson-Lindenstrauss Lemmas
Abstract
For a set $X$ of $N$ points in $\mathbb{R}^D$, the Johnson-Lindenstrauss lemma provides random linear maps that approximately preserve all pairwise distances in $X$ -- up to multiplicative error $(1\pm\varepsilon)$ with high probability -- using a target dimension of $O(\varepsilon^{-2}\log(N))$. Certain known point sets actually require a target dimension this large -- any smaller dimension forces at least one distance to be stretched or compressed too much. What happens to the remaining distances? If we only allow a fraction $\eta$ of the distances to be distorted beyond tolerance $(1\pm\varepsilon)$, we show a target dimension of $O(\varepsilon^{-2}\log(4e/\eta)\log(N)/R)$ is sufficient for the remaining distances. With the stable rank of a matrix $A$ defined as $\|A\|_F^2/\|A\|^2$, the parameter $R$ is the minimal stable rank over certain $\log(N)$ sized subsets of $X-X$ or their unit normalized versions, involving each point of $X$ exactly once. The linear maps may be taken as random matrices with i.i.d. zero-mean unit-variance sub-gaussian entries. When the data is sampled i.i.d. as a given random vector $\xi$, refined statements are provided; the most improvement happens when $\xi$ or the unit normalized $\widehat{\xi-\xi'}$ is isotropic, with $\xi'$ an independent copy of $\xi$, and includes the case of i.i.d. coordinates.
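The two quantities the abstract turns on, the fraction of pairwise distances distorted beyond $(1\pm\varepsilon)$ and the stable rank $\|A\|_F^2/\|A\|^2$, can be explored numerically. The sketch below is only an illustration under stated assumptions, not the paper's construction: it uses Gaussian entries as one instance of zero-mean unit-variance sub-gaussian entries, scales the map by $1/\sqrt{k}$, and estimates the spectral norm by power iteration. The function names (`random_map`, `distorted_fraction`, `stable_rank`) and the test data are hypothetical, and the code does not select the particular subsets of $X-X$ that define $R$.

```python
import math
import random
from itertools import combinations


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def apply_map(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def random_map(k, D, rng):
    """k x D matrix with i.i.d. N(0, 1) entries, scaled by 1/sqrt(k).

    Assumption: Gaussian entries stand in for general sub-gaussian ones.
    """
    s = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * s for _ in range(D)] for _ in range(k)]


def distorted_fraction(X, k, eps, rng):
    """Fraction of pairwise distances falling outside (1 +/- eps) after projection."""
    Phi = random_map(k, len(X[0]), rng)
    Y = [apply_map(Phi, x) for x in X]  # project every point once
    bad = total = 0
    for i, j in combinations(range(len(X)), 2):
        orig = norm([a - b for a, b in zip(X[i], X[j])])
        proj = norm([a - b for a, b in zip(Y[i], Y[j])])
        total += 1
        if not (1 - eps) * orig <= proj <= (1 + eps) * orig:
            bad += 1
    return bad / total


def stable_rank(A, iters=200):
    """Stable rank ||A||_F^2 / ||A||^2 of a nonzero matrix (list of rows).

    The spectral norm is estimated by power iteration on A^T A from a fixed
    uniform start vector; this is a sketch, not a robust eigen-solver.
    """
    fro_sq = sum(x * x for row in A for x in row)
    n = len(A[0])
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        Av = apply_map(A, v)
        w = [sum(row[c] * y for row, y in zip(A, Av)) for c in range(n)]
        nw = norm(w)
        v = [x / nw for x in w]
    spec_sq = sum(y * y for y in apply_map(A, v))
    return fro_sq / spec_sq


# Hypothetical data: 40 points with i.i.d. Gaussian coordinates in dimension 50.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(40)]

for k in (4, 16, 64):
    print(k, distorted_fraction(X, k, eps=0.25, rng=rng))

# Stable rank of a small family of difference vectors drawn from X - X,
# pairing points so that each point is used exactly once.
diffs = [[a - b for a, b in zip(X[2 * i], X[2 * i + 1])] for i in range(5)]
print(stable_rank(diffs))
```

Printing the distorted fraction for several target dimensions $k$ gives a rough feel for how many distances fall outside tolerance at each $k$; the output depends on the random seed and is not a check of the stated bound.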