Stochastic Block Models and Reconstruction
Abstract
The planted partition model (also known as the stochastic blockmodel) is a classical cluster-exhibiting random graph model that has been extensively studied in statistics, physics, and computer science. In its simplest form, the planted partition model is a model for random graphs on n nodes with two equal-sized clusters, with a between-class edge probability of q and a within-class edge probability of p. Although most of the literature on this model has focused on the case of increasing degrees (i.e. pn, qn → ∞ as n → ∞), the sparse case p, q = O(1/n) is interesting both from a mathematical and an applied point of view. A striking conjecture of Decelle, Krzakala, Moore and Zdeborová, based on deep, non-rigorous ideas from statistical physics, gave a precise prediction for the algorithmic threshold of clustering in the sparse planted partition model. In particular, if p = a/n and q = b/n, then Decelle et al. conjectured that it is possible to cluster in a way correlated with the true partition if (a - b)^2 > 2(a + b), and impossible if (a - b)^2 < 2(a + b). By comparison, the best-known rigorous result is that of Coja-Oghlan, who showed that clustering is possible if (a - b)^2 > C(a + b) for some sufficiently large C. We prove half of the prediction of Decelle et al., showing that it is indeed impossible to cluster if (a - b)^2 < 2(a + b). Furthermore, we show that it is impossible even to estimate the model parameters from the graph when (a - b)^2 < 2(a + b); on the other hand, we provide a simple and efficient algorithm for estimating a and b when (a - b)^2 > 2(a + b). Following Decelle et al., our work establishes a rigorous connection between the clustering problem, spin-glass models on the Bethe lattice, and the so-called reconstruction problem. This connection points to fascinating applications and open problems.
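To make the model and the threshold concrete, the following is a minimal Python sketch, not code from the paper. It samples a two-cluster graph with within-class edge probability p = a/n and between-class edge probability q = b/n, and evaluates the condition (a - b)^2 > 2(a + b). The function names `above_threshold` and `sample_planted_partition`, the ±1 label encoding, and the quadratic-time sampling loop are all assumptions made for illustration; the paper's estimator for a and b is not reproduced here.

```python
import random


def above_threshold(a: float, b: float) -> bool:
    """Return True when (a - b)^2 > 2(a + b), the regime in which the
    abstract says a and b can be estimated and clustering is conjectured
    to be possible."""
    return (a - b) ** 2 > 2 * (a + b)


def sample_planted_partition(n: int, a: float, b: float, seed=None):
    """Sample a sparse planted partition graph on n nodes (illustrative).

    Returns (labels, edges): labels[i] is +1 or -1 with exactly n/2 of
    each, and edges is a list of pairs (u, v) with u < v.
    """
    if n % 2 != 0:
        raise ValueError("n must be even to get two equal-sized clusters")
    p, q = a / n, b / n
    if not (0 <= p <= 1 and 0 <= q <= 1):
        raise ValueError("a/n and b/n must be valid probabilities")

    rng = random.Random(seed)
    labels = [1] * (n // 2) + [-1] * (n // 2)
    rng.shuffle(labels)  # hide the partition behind a random ordering

    edges = []
    # Naive O(n^2) loop over node pairs; fine for small n.
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                edges.append((u, v))
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_planted_partition(n=1000, a=5.0, b=1.0, seed=0)
    print("edges sampled:", len(edges))
    print("above threshold:", above_threshold(5.0, 1.0))
```

With a = 5 and b = 1 the condition reads 16 > 12, so this parameter choice lies above the threshold; with a = 3 and b = 2 it reads 1 > 10, which fails, so that choice lies in the regime where the paper proves clustering is impossible.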