Localization in Seeded PageRank
Abstract
Seeded PageRank is an important network analysis tool for identifying and studying regions near a given set of nodes, which are called seeds. The seeded PageRank vector is the stationary distribution of a random walk that randomly resets at the seed nodes. Intuitively, this vector is concentrated near the given seeds, but it is mathematically non-zero for all nodes in a connected graph. We study this concentration, or localization, and show a sublinear upper bound on the number of entries required to approximate seeded PageRank on all graphs with a natural type of skewed-degree sequence, similar to those that arise in many real-world networks. Experiments with both real-world and synthetic graphs provide further evidence that the degree sequence of a graph has a major influence on the localization behavior of seeded PageRank. Moreover, we establish that this localization is non-trivial by showing that complete-bipartite graphs produce seeded PageRank vectors that cannot be approximated with a sublinear number of non-zeros.
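To make the random-walk-with-reset description concrete, here is a minimal Python sketch, not the paper's method. It assumes an undirected graph stored as an adjacency dictionary with no isolated nodes, a reset (teleportation) parameter alpha = 0.85, and accuracy measured in the 1-norm; the function names seeded_pagerank and nonzeros_needed, the parameter values, and the small path graph are all hypothetical choices made for illustration.

```python
def seeded_pagerank(adj, seeds, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for x = alpha * P x + (1 - alpha) * s.

    adj   : dict mapping each node to a list of its neighbors
            (assumed undirected, every node has at least one neighbor)
    seeds : nodes the walk resets to, chosen uniformly (an assumption)
    alpha : probability of following an edge instead of resetting
    """
    nodes = list(adj)
    # Reset distribution: uniform over the seed set, zero elsewhere.
    s = {v: 0.0 for v in nodes}
    for v in seeds:
        s[v] = 1.0 / len(seeds)
    x = dict(s)
    for _ in range(max_iter):
        # With probability 1 - alpha the walk jumps back to a seed.
        nxt = {v: (1.0 - alpha) * s[v] for v in nodes}
        # With probability alpha it moves to a uniformly random neighbor.
        for u in nodes:
            share = alpha * x[u] / len(adj[u])
            for w in adj[u]:
                nxt[w] += share
        delta = sum(abs(nxt[v] - x[v]) for v in nodes)
        x = nxt
        if delta < tol:
            break
    return x


def nonzeros_needed(x, eps):
    """Number of largest entries to keep so the dropped mass is <= eps."""
    vals = sorted(x.values(), reverse=True)
    total = sum(vals)
    kept = 0.0
    for k, val in enumerate(vals, start=1):
        kept += val
        if total - kept <= eps:
            return k
    return len(vals)


# Hypothetical example: a path graph 0 - 1 - 2 - 3 - 4 - 5, seeded at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
x = seeded_pagerank(path, seeds=[0])
k = nonzeros_needed(x, eps=0.01)
```

Every entry of x is positive because the path is connected, yet most of the mass can sit on a few nodes; nonzeros_needed counts how many entries must be kept for a given accuracy, which is the quantity the localization bounds concern.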