Graph Clustering with Surprise: Complexity and Exact Solutions
Abstract
Clustering graphs by comparing the number of links within clusters to the expected value of this quantity in a random graph has gained considerable attention and popularity in the last decade. Recently, Aldecoa and Marin proposed a related but slightly different approach leading to the quality measure surprise, and reported good behavior on synthetic and real-world benchmarks. We show that the problem of finding a clustering with optimum surprise is NP-hard. Moreover, a bicriterial view of the problem makes it possible to compute optimum solutions for small instances by solving a small number of integer linear programs, and leads to a polynomial-time algorithm on trees.
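The abstract names the surprise measure without stating it. As a rough illustration only, the sketch below evaluates surprise for a given clustering of a simple undirected graph, using the commonly cited hypergeometric-tail form: the probability that at least as many intra-cluster edges would appear if the graph's edges were drawn uniformly from all vertex pairs. The function name `surprise`, its input format, and the negative-logarithm convention (larger is better) are assumptions made for this example; the paper may use a different sign or scaling.

```python
from math import comb, log

def surprise(n_nodes, edges, clusters):
    """Negative log of the hypergeometric tail probability (assumed convention).

    n_nodes  -- number of vertices
    edges    -- list of (u, v) pairs of a simple undirected graph
    clusters -- list of disjoint vertex sets covering all vertices
    """
    # Map each vertex to the index of its cluster.
    label = {v: k for k, cluster in enumerate(clusters) for v in cluster}

    total_pairs = comb(n_nodes, 2)                         # all vertex pairs
    intra_pairs = sum(comb(len(c), 2) for c in clusters)   # pairs inside clusters
    m = len(edges)                                         # number of edges
    intra_edges = sum(1 for u, v in edges if label[u] == label[v])

    # Number of ways to place m edges so that at least intra_edges of them
    # fall on intra-cluster pairs (exact integer arithmetic).
    tail = sum(
        comb(intra_pairs, j) * comb(total_pairs - intra_pairs, m - j)
        for j in range(intra_edges, min(intra_pairs, m) + 1)
    )
    # Work with logarithms of the integers to avoid float underflow.
    return log(comb(total_pairs, m)) - log(tail)

# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = surprise(6, edges, [{0, 1, 2}, {3, 4, 5}])
trivial = surprise(6, edges, [{0, 1, 2, 3, 4, 5}])
print(good > trivial)
```

Under this convention, the clustering that separates the two triangles scores higher than the trivial single-cluster solution, which scores zero. Evaluating one clustering is cheap; the hardness result in the abstract concerns finding the optimum over all clusterings.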