Here Be Livestreams: Trade-offs in Creating Temporal Maps of Reddit
Abstract
We present a method for mapping Reddit communities that accounts for temporal shifts, using quantitative and qualitative analyses of clustering techniques to produce high-quality, stable, and meaningful maps for researchers, journalists, and casual Reddit users. Building on previous work using community embeddings, we find that a single month of Reddit comments suffices to create snapshot embeddings that maintain quality while supporting insight into how Reddit communities change over time. Comparing different clusterings of community embeddings using quantitative measures of quality and temporal stability, we describe properties of the models and what they tell us about the underlying Reddit data. Moreover, qualitative analysis of the resulting clusters illuminates which properties of clusterings are useful for analyzing Reddit communities. Although clusterings of subreddits have been used in many earlier works, we believe this is the first study to qualitatively analyze, at a Reddit-wide scale, how these clusterings are perceived by social media researchers. Finally, we demonstrate how the temporal snapshots might be used in an exploratory study. We identify particularly stable communities during 2021-2022, such as the Reddit Public Access Network, as well as emerging communities, such as one focused on NFT trading. This work informed the development of a web tool for exploring Reddit, now available to the public at RedditMap.social.