K-Nearest Neighbor Approximation Via the Friend-of-a-Friend Principle

Abstract

Suppose $V$ is an $n$-element set where, for each $x \in V$, the elements of $V \setminus \{x\}$ are ranked by their similarity to $x$. The $K$-nearest neighbor graph is a directed graph including an arc from each $x$ to the $K$ points of $V \setminus \{x\}$ most similar to $x$. Constructive approximation to this graph using far fewer than $n^2$ comparisons is important for the analysis of large high-dimensional data sets. K-Nearest Neighbor Descent is a parameter-free heuristic in which a sequence of graph approximations is constructed, with second neighbors in one approximation proposed as neighbors in the next. Run times in a test case fit an $O(n K^2 \log n)$ pattern. This bound is rigorously justified for a similar algorithm, using range queries, when applied to a homogeneous Poisson process in suitable dimension. However, the basic algorithm fails to achieve subquadratic complexity on sets whose similarity rankings arise from a "generic" linear order on the $\binom{n}{2}$ inter-point distances in a metric space.
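
The friend-of-a-friend step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `knn_descent`, the random initialization, the treatment of arcs as undirected when collecting friends, and the `max_rounds` safety cap are all assumptions made for illustration, and the sketch recomputes distances freely rather than counting comparisons.

```python
import random


def knn_descent(points, k, dist, max_rounds=20, seed=0):
    """Approximate K-nearest-neighbor lists by friend-of-a-friend refinement.

    Illustrative sketch only; names and defaults are hypothetical.
    Requires 1 <= k <= len(points) - 1.
    """
    rng = random.Random(seed)
    n = len(points)
    # Assumption: start from K random out-neighbors for every point.
    nbrs = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    for _ in range(max_rounds):
        # Assumption: ignore arc direction, so in-neighbors are friends too.
        friends = [set(row) for row in nbrs]
        for i, row in enumerate(nbrs):
            for j in row:
                friends[j].add(i)
        changed = False
        for i in range(n):
            # Candidates: current neighbors plus friends of friends.
            cand = set(nbrs[i])
            for j in friends[i]:
                cand |= friends[j]
            cand.discard(i)
            # Keep the K candidates most similar to point i.
            best = sorted(cand, key=lambda j: dist(points[i], points[j]))[:k]
            if set(best) != set(nbrs[i]):
                changed = True
            nbrs[i] = best
        # Stop once a full round proposes no better neighbor for any point.
        if not changed:
            break
    return nbrs
```

For example, `knn_descent(list(range(30)), 2, lambda a, b: abs(a - b))` returns, for each of 30 points on a line, a list of two indices sorted by increasing distance. Whether these lists match the true nearest neighbors depends on the data and on the random start.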
