CPU vs. GPU for Community Detection: Performance Insights from GVE-Louvain and -Louvain
Abstract
Community detection involves identifying natural divisions in networks, a crucial task for many large-scale applications. This report presents GVE-Louvain, one of the most efficient multicore implementations of the Louvain algorithm, a high-quality method for community detection. Running on a dual 16-core Intel Xeon Gold 6226R server, GVE-Louvain outperforms Vite, Grappolo, NetworKit Louvain, and cuGraph Louvain (on an NVIDIA A100 GPU) by factors of 50x, 22x, 20x, and 5.8x, respectively, achieving a processing rate of 560M edges per second on a 3.8B-edge graph. Additionally, it scales efficiently, improving performance by 1.6x for every thread doubling. The paper also presents -Louvain, a GPU-based implementation. When evaluated on an NVIDIA A100 GPU, -Louvain performs only on par with GVE-Louvain, largely due to reduced workload and parallelism in later algorithmic passes. These results suggest that CPUs, with their flexibility in handling irregular workloads, may be better suited for community detection tasks.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.