SLING: A Near-Optimal Index Structure for SimRank
Abstract
SimRank is a similarity measure for graph nodes that has numerous applications in practice. Scalable SimRank computation has been the subject of extensive research for more than a decade, and yet, none of the existing solutions can efficiently derive SimRank scores on large graphs with provable accuracy guarantees. In particular, the state-of-the-art solution requires up to a few seconds to compute a SimRank score in million-node graphs, and does not offer any worst-case assurance in terms of the query error. This paper presents SLING, an efficient index structure for SimRank computation. SLING guarantees that each SimRank score returned has at most ε additive error, and it answers any single-pair and single-source SimRank queries in O(1/ε) and O(n/ε) time, respectively. These time complexities are near-optimal, and are significantly better than the asymptotic bounds of the most recent approach. Furthermore, SLING requires only O(n/ε) space (which is also near-optimal in an asymptotic sense) and O(m/ε + n log(n/δ)/ε²) pre-computation time, where δ is the failure probability of the preprocessing algorithm. We experimentally evaluate SLING with a variety of real-world graphs with up to several millions of nodes. Our results demonstrate that SLING is up to 10000 times (resp. 110 times) faster than competing methods for single-pair (resp. single-source) SimRank queries, at the cost of higher space overheads.