A Framework for Similarity Search with Space-Time Tradeoffs using Locality-Sensitive Filtering

Abstract

We present a framework for similarity search based on Locality-Sensitive Filtering (LSF), generalizing the Indyk-Motwani (STOC 1998) Locality-Sensitive Hashing (LSH) framework to support space-time tradeoffs. Given a family of filters, defined as a distribution over pairs of subsets of space with certain locality-sensitivity properties, we can solve the approximate near neighbor problem in d-dimensional space for an n-point data set with query time $dn^{q+o(1)}$, update time $dn^{u+o(1)}$, and space usage $dn + n^{1+u+o(1)}$. The space-time tradeoff is tied to the tradeoff between query time and update time, controlled by the exponents q, u that are determined by the filter family. Locality-sensitive filtering was introduced by Becker et al. (SODA 2016) together with a framework yielding a single, balanced, tradeoff between query time and space, further relying on the assumption of an efficient oracle for the filter evaluation algorithm. We extend the LSF framework to support space-time tradeoffs and through a combination of existing techniques we remove the oracle assumption. Building on a filter family for the unit sphere by Laarhoven (arXiv 2015) we use a kernel embedding technique by Rahimi & Recht (NIPS 2007) to show a solution to the $(r, cr)$-near neighbor problem in $\ell_s^d$-space for $0 < s \le 2$ with query and update exponents $q = \frac{c^s(1+\lambda)^2}{(c^s+\lambda)^2}$ and $u = \frac{c^s(1-\lambda)^2}{(c^s+\lambda)^2}$ where $\lambda \in [-1, 1]$ is a tradeoff parameter. This result improves upon the space-time tradeoff of Kapralov (PODS 2015) and is shown to be optimal in the case of a balanced tradeoff. Finally, we show a lower bound for the space-time tradeoff on the unit sphere that matches Laarhoven's and our own upper bound in the case of random data.
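
To make the tradeoff concrete, the following minimal sketch evaluates the two exponents for a given approximation factor c, norm parameter s, and tradeoff parameter λ, reading the formulas as q = c^s(1+λ)^2 / (c^s+λ)^2 and u = c^s(1-λ)^2 / (c^s+λ)^2. The function name `lsf_exponents` and the requirement c > 1 are assumptions made for illustration; they are not taken from the paper. The sketch only evaluates the stated bounds and does not implement the filtering data structure.

```python
def lsf_exponents(c: float, s: float, lam: float) -> tuple[float, float]:
    """Return (q, u), the query and update exponents stated in the abstract.

    Assumptions (not from the paper): c > 1 is the approximation factor,
    and the function name is hypothetical.
    """
    if c <= 1:
        raise ValueError("assumed approximation factor c > 1")
    if not 0 < s <= 2:
        raise ValueError("s must satisfy 0 < s <= 2")
    if not -1 <= lam <= 1:
        raise ValueError("lam must lie in [-1, 1]")

    cs = c ** s                      # c^s appears in both exponents
    denom = (cs + lam) ** 2          # strictly positive since c^s > 1 >= -lam
    q = cs * (1 + lam) ** 2 / denom  # query exponent
    u = cs * (1 - lam) ** 2 / denom  # update exponent
    return q, u


if __name__ == "__main__":
    # Sweep the tradeoff parameter for c = 2 in Euclidean space (s = 2).
    for lam in (-1.0, -0.5, 0.0, 0.5, 1.0):
        q, u = lsf_exponents(2.0, 2.0, lam)
        print(f"lambda={lam:+.1f}  q={q:.4f}  u={u:.4f}")
```

Under this reading, λ = 0 gives the balanced case q = u = 1/c^s, λ = 1 drives the update exponent u to zero (so space is near-linear in n), and λ = -1 drives the query exponent q to zero.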
