On the LSH Distortion of Ulam and Cayley Similarities
Abstract
Locality-sensitive hashing (LSH) has found widespread use as a fundamental primitive, particularly to accelerate nearest neighbor search. An LSH scheme for a similarity function S: X × X → [0,1] is a distribution over hash functions on X with the property that the probability of collision of any two elements x, y ∈ X is exactly equal to S(x,y). However, not all similarity functions admit exact LSH schemes. The notion of LSH distortion measures how multiplicatively close a similarity function is to having an LSH scheme. In this work, we study the LSH distortion of the Ulam and Cayley similarities, which are popular similarity measures on permutations of n elements. We show that the Ulam similarity admits a sublinear LSH distortion of O(n / √(log n)); we also prove a lower bound of Ω(n^0.12) on the best LSH distortion achievable. On the other hand, we show that the LSH distortion of the Cayley similarity is Θ(n).
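To make the definition of an exact LSH scheme concrete, the following minimal sketch checks the collision property on a toy case. It is an illustration only and is not taken from the paper: it uses the Hamming similarity on permutations (the fraction of positions where two permutations agree), which is an assumption chosen because it has a very simple exact scheme, namely "pick a uniformly random position i and hash a permutation to its i-th entry". The function names are hypothetical.

```python
from fractions import Fraction


def hamming_similarity(x, y):
    """Fraction of positions at which permutations x and y agree."""
    n = len(x)
    return Fraction(sum(a == b for a, b in zip(x, y)), n)


def collision_probability(x, y):
    """Exact collision probability of x and y under the toy scheme.

    The scheme (an assumption for illustration) is the uniform
    distribution over the n hash functions h_i(p) = p[i].
    Enumerating all n functions gives the probability exactly.
    """
    n = len(x)
    hashes = [lambda p, i=i: p[i] for i in range(n)]
    collisions = sum(h(x) == h(y) for h in hashes)
    return Fraction(collisions, n)


# Two permutations of {0, ..., 4} that differ by one transposition.
x = (0, 1, 2, 3, 4)
y = (0, 2, 1, 3, 4)

# The defining property of an exact LSH scheme: Pr[h(x) = h(y)] = S(x, y).
assert collision_probability(x, y) == hamming_similarity(x, y)
```

For a similarity with no exact scheme, the equality in the final assertion cannot hold for every pair, and the LSH distortion quantifies the smallest multiplicative gap that some scheme can achieve.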