Large-Scale Graphs Community Detection using Spark GraphFrames
Abstract
With the emergence of social networks, online platforms dedicated to different use cases, and sensor networks, the emergence of large-scale graph community detection has become a steady field of research with real-world applications. Community detection algorithms have numerous practical applications, particularly due to their scalability with data size. Nonetheless, a notable drawback of community detection algorithms is their computational intensity~Apostol2014, resulting in decreasing performance as data size increases. For this purpose, new frameworks that employ distributed systems such as Apache Hadoop and Apache Spark which can seamlessly handle large-scale graphs must be developed. In this paper, we propose a novel framework for community detection algorithms, i.e., K-Cliques, Louvain, and Fast Greedy, developed using Apache Spark GraphFrames. We test their performance and scalability on two real-world datasets. The experimental results prove the feasibility of developing graph mining algorithms using Apache Spark GraphFrames.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.