Efficient Construction of Neighborhood Graphs by the Multiple Sorting Method

Abstract

Neighborhood graphs are gaining popularity as a concise data representation in machine learning. However, naive graph construction by pairwise distance calculation takes O(n^2) time for n data points, which is prohibitively slow for millions of data points. For strings of equal length, the multiple sorting method (Uno, 2008) can construct an ε-neighbor graph in O(n+m) time, where m is the number of ε-neighbor pairs in the data. To bring this remarkably efficient algorithm to continuous domains such as images, signals and texts, we employ a random projection method to convert vectors to strings. Theoretical results are presented to elucidate the trade-off between approximation quality and computation time. Empirical results show the efficiency of our method in comparison to fast nearest neighbor alternatives.
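The two ingredients named above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sign random projection as the vector-to-string conversion, and it replaces the radix sorting of Uno's method with dictionary bucketing, so it does not reproduce the O(n+m) bound. The function names and the parameters `n_bits`, `n_blocks` and `max_dist` are hypothetical. The idea it shows is the block pigeonhole argument: if two strings of equal length differ in at most `max_dist` positions, they agree exactly on at least `n_blocks - max_dist` of their blocks, so it suffices to group strings on each choice of that many blocks and verify pairs inside each group.

```python
import random
from collections import defaultdict
from itertools import combinations


def to_bit_strings(vectors, n_bits, seed=0):
    """Assumed conversion: one bit per random hyperplane (sign of the projection)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    return [
        "".join(
            "1" if sum(p * x for p, x in zip(plane, v)) >= 0 else "0"
            for plane in planes
        )
        for v in vectors
    ]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def neighbor_pairs(strings, max_dist, n_blocks):
    """Return all index pairs (i, j), i < j, with Hamming distance <= max_dist."""
    if not 0 <= max_dist < n_blocks:
        raise ValueError("need 0 <= max_dist < n_blocks")
    length = len(strings[0])
    # Block b covers positions bounds[b] .. bounds[b + 1] - 1.
    bounds = [length * b // n_blocks for b in range(n_blocks + 1)]
    pairs = set()
    # Try every choice of (n_blocks - max_dist) blocks that must match exactly.
    for kept in combinations(range(n_blocks), n_blocks - max_dist):
        buckets = defaultdict(list)
        for idx, s in enumerate(strings):
            key = tuple(s[bounds[b]:bounds[b + 1]] for b in kept)
            buckets[key].append(idx)
        # Only strings sharing the kept blocks are compared in full.
        for members in buckets.values():
            for i, j in combinations(members, 2):
                if hamming(strings[i], strings[j]) <= max_dist:
                    pairs.add((i, j))
    return pairs
```

Calling `neighbor_pairs(to_bit_strings(vectors, n_bits=32), max_dist=2, n_blocks=4)` would return candidate neighbor pairs in the projected space. The set is used here only to drop pairs found under more than one block choice; this is a simplification made for brevity, not a description of how the original method handles duplicates.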
