Domain-based user embedding for competing events on social media
Abstract
Social division and polarization have become significant societal issues. Social media analysis offers computational social science a way to study the mechanisms behind these phenomena, and effective user embedding methods are essential for such analysis. Traditionally, researchers have used predefined network-based user features (e.g., network size, degree, and centrality measures). However, because such measures may not capture the complex characteristics of social media users, we developed a method for embedding users based on a URL domain co-occurrence network. This approach effectively represents social media users involved in competing events such as political campaigns and public health crises. We assessed the method's performance on binary classification tasks using Twitter datasets covering topics associated with the COVID-19 infodemic, such as QAnon, Biden, and Ivermectin. User embeddings generated directly from the retweet network and/or based on language performed below expectations, whereas our domain-based embeddings outperformed those methods while reducing computation time. Domain-based embedding therefore offers an accessible and effective method for characterizing social media users in competing events.
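The abstract does not say how the URL domain co-occurrence network is turned into user vectors, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that two domains co-occur when the same user shares both, that domain vectors come from a truncated SVD of the log-scaled co-occurrence counts, and that a user vector is the mean of the vectors of the domains that user shared. The toy data and the names `user_domains`, `build_cooccurrence`, `embed_domains`, and `embed_users` are hypothetical.

```python
from itertools import combinations

import numpy as np

# Hypothetical toy data: each user maps to the domains of URLs they shared.
user_domains = {
    "user_a": ["cdc.gov", "who.int", "nytimes.com"],
    "user_b": ["cdc.gov", "who.int"],
    "user_c": ["example-blog.com", "example-video.com"],
    "user_d": ["example-blog.com", "example-video.com", "nytimes.com"],
}


def build_cooccurrence(user_domains):
    """Count, for each pair of domains, how many users shared both (assumed definition)."""
    domains = sorted({d for ds in user_domains.values() for d in ds})
    index = {d: i for i, d in enumerate(domains)}
    counts = np.zeros((len(domains), len(domains)))
    for ds in user_domains.values():
        for a, b in combinations(sorted(set(ds)), 2):
            counts[index[a], index[b]] += 1
            counts[index[b], index[a]] += 1
    return domains, index, counts


def embed_domains(counts, dim):
    """Assumed embedding step: truncated SVD of the log-scaled co-occurrence matrix."""
    u, s, _ = np.linalg.svd(np.log1p(counts))
    return u[:, :dim] * np.sqrt(s[:dim])


def embed_users(user_domains, index, domain_vecs):
    """Assumed aggregation: a user vector is the mean of that user's domain vectors.

    Every user must have shared at least one domain; an empty list would yield NaN.
    """
    return {
        user: domain_vecs[[index[d] for d in ds]].mean(axis=0)
        for user, ds in user_domains.items()
    }


domains, index, counts = build_cooccurrence(user_domains)
domain_vecs = embed_domains(counts, dim=2)
user_vecs = embed_users(user_domains, index, domain_vecs)
```

The resulting `user_vecs` could then be passed to any standard binary classifier, which is the kind of evaluation the abstract describes. The SVD and mean-pooling choices here are placeholders for whatever embedding and aggregation steps the paper actually uses.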