Spectral Approaches to Nearest Neighbor Search

Abstract

We study spectral algorithms for the high-dimensional Nearest Neighbor Search (NNS) problem. In particular, we consider a semi-random setting where a dataset P ⊂ R^d is chosen arbitrarily from an unknown subspace of low dimension k ≪ d, and then perturbed by fully d-dimensional Gaussian noise. We design spectral NNS algorithms whose query time depends polynomially on d and log n (where n = |P|) for large ranges of k, d and n. Our algorithms use a repeated computation of the top PCA vector/subspace, and are effective even when the random-noise magnitude is much larger than the interpoint distances in P. Our motivation is that in practice, a number of spectral NNS algorithms outperform the random-projection methods that seem otherwise theoretically optimal on worst-case datasets. In this paper we aim to provide theoretical justification for this disparity.
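
The semi-random setting described above can be illustrated with a small NumPy sketch. This is not the paper's algorithm: the abstract only states that the algorithms repeatedly compute the top PCA vector/subspace, while the sketch below performs a single PCA step and then a linear scan in the projected space. The function names, the Gaussian latent points (standing in for the "arbitrary" dataset), and all parameter values are assumptions made for illustration, and the noise level used in the example is deliberately small.

```python
import numpy as np


def make_semi_random_dataset(n, d, k, sigma, rng):
    """Sample n points from a random k-dim subspace of R^d, then add Gaussian noise.

    Hypothetical helper: the latent points are Gaussian here only for
    convenience; in the setting of the abstract they may be arbitrary.
    """
    # Orthonormal basis (d x k) of a random k-dimensional subspace.
    basis, _ = np.linalg.qr(rng.standard_normal((d, k)))
    latent = rng.standard_normal((n, k))
    clean = latent @ basis.T
    # Fully d-dimensional Gaussian perturbation with per-coordinate std sigma.
    noisy = clean + sigma * rng.standard_normal((n, d))
    return clean, noisy, basis


def top_pca_subspace(points, k):
    """Return a d x k orthonormal basis of the top-k (uncentered) PCA subspace."""
    # Rows of vt are right singular vectors, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(points, full_matrices=False)
    return vt[:k].T


def nearest_neighbor_in_subspace(points, query, subspace):
    """Linear-scan nearest neighbor after projecting onto the given subspace."""
    proj_points = points @ subspace
    proj_query = query @ subspace
    dists = np.linalg.norm(proj_points - proj_query, axis=1)
    return int(np.argmin(dists))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean, noisy, basis = make_semi_random_dataset(
        n=500, d=50, k=3, sigma=0.05, rng=rng
    )
    estimated = top_pca_subspace(noisy, k=3)
    # Cosines of the principal angles between the true and estimated subspaces.
    cosines = np.linalg.svd(basis.T @ estimated, compute_uv=False)
    print("smallest principal-angle cosine:", cosines.min())
    print("neighbor of point 7:", nearest_neighbor_in_subspace(noisy, noisy[7], estimated))
```

Projecting onto the estimated subspace keeps the k signal directions and discards most of the d-dimensional noise, which is the intuition behind spectral approaches. The regime the abstract emphasizes, where the noise magnitude exceeds the interpoint distances, is harder than this example, and a single PCA step should not be expected to suffice there.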
