Approximate k-flat Nearest Neighbor Search

Abstract

Let $k$ be a nonnegative integer. In the approximate $k$-flat nearest neighbor ($k$-ANN) problem, we are given a set $P \subset \mathbb{R}^d$ of $n$ points in $d$-dimensional space and a fixed approximation factor $c > 1$. Our goal is to preprocess $P$ so that we can efficiently answer approximate $k$-flat nearest neighbor queries: given a $k$-flat $F$, find a point in $P$ whose distance to $F$ is within a factor $c$ of the distance between $F$ and the closest point in $P$. The case $k = 0$ corresponds to the well-studied approximate nearest neighbor problem, for which a plethora of results are known, both in low and high dimensions. The case $k = 1$ is called approximate line nearest neighbor. In this case, we are aware of only one provably efficient data structure, due to Andoni, Indyk, Krauthgamer, and Nguyen. For $k \geq 2$, we know of no previous results. We present the first efficient data structure that can handle approximate nearest neighbor queries for arbitrary $k$. We use a data structure for $0$-ANN queries as a black box, and the performance depends on the parameters of the $0$-ANN solution: suppose we have a $0$-ANN structure with query time $O(n^{\rho})$ and space requirement $O(n^{1+\sigma})$, for $\rho, \sigma > 0$. Then we can answer $k$-ANN queries in time $O(n^{k/(k+1-\rho) + t})$ and space $O(n^{1+\sigma k/(k+1-\rho)} + n \log^{O(1/t)} n)$. Here, $t > 0$ is an arbitrary constant and the $O$-notation hides exponential factors in $k$, $1/t$, and $c$ and polynomials in $d$. Our new data structures also give an improvement in the space requirement over the previous result for $1$-ANN: we can achieve near-linear space and sublinear query time, a further step towards practical applications where space constitutes the bottleneck.
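To make the problem statement concrete, the following is a minimal Python sketch of the exact linear-scan baseline that any k-ANN structure is meant to beat. It is not the data structure from the paper. It assumes a k-flat is represented by an anchor point and k direction vectors, and all function names (`orthonormalize`, `dist_to_flat`, `brute_force_flat_nn`, `is_c_approximate`) are hypothetical, chosen only for illustration.

```python
import math


def orthonormalize(directions, eps=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of `directions`.

    Near-zero residuals (linearly dependent inputs) are dropped.
    """
    basis = []
    for v in directions:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


def dist_to_flat(p, anchor, basis):
    """Euclidean distance from point p to the flat anchor + span(basis).

    `basis` must be orthonormal. An empty basis gives a 0-flat (a point),
    so this reduces to the ordinary point-to-point distance.
    """
    r = [pi - ai for pi, ai in zip(p, anchor)]
    for b in basis:
        coeff = sum(ri * bi for ri, bi in zip(r, b))
        r = [ri - coeff * bi for ri, bi in zip(r, b)]
    return math.sqrt(sum(ri * ri for ri in r))


def brute_force_flat_nn(points, anchor, directions):
    """Exact nearest neighbor of a k-flat by scanning all n points."""
    basis = orthonormalize(directions)
    return min(points, key=lambda p: dist_to_flat(p, anchor, basis))


def is_c_approximate(candidate, points, anchor, directions, c):
    """Check the k-ANN guarantee: dist(candidate, F) <= c * min dist(p, F)."""
    basis = orthonormalize(directions)
    best = min(dist_to_flat(p, anchor, basis) for p in points)
    return dist_to_flat(candidate, anchor, basis) <= c * best


# Illustrative data (not from the paper): a line query (k = 1) in R^3.
points = [(5.0, 1.0, 0.0), (0.0, 3.0, 4.0), (2.0, 0.0, 1.5)]
anchor = (0.0, 0.0, 0.0)
directions = [(1.0, 0.0, 0.0)]  # the x-axis
nearest = brute_force_flat_nn(points, anchor, directions)
```

Here the query is the x-axis, so each point's distance to the flat depends only on its last two coordinates. With c = 2, any point at distance at most twice the minimum would be an acceptable answer; the linear scan takes time proportional to n per query, which is the cost a sublinear-time structure aims to avoid.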
