Depression Risk Assessment in Social Media via Large Language Models

Abstract

Depression is one of the most prevalent and debilitating mental health conditions worldwide, frequently underdiagnosed and undertreated. The proliferation of social media platforms provides a rich source of naturalistic linguistic signals for the automated monitoring of psychological well-being. In this work, we propose a system based on Large Language Models (LLMs) for depression risk assessment in Reddit posts, through multi-label classification of eight depression-associated emotions and the computation of a weighted severity index. The method is evaluated in a zero-shot setting on the annotated DepressionEmo dataset (~6,000 posts) and applied in-the-wild to 469,692 comments collected from four subreddits over the period 2024-2025. Our best model, gemma3:27b, achieves micro-F1 = 0.75 and macro-F1 = 0.70, results competitive with purpose-built fine-tuned models (BART: micro-F1 = 0.80, macro-F1 = 0.76). The in-the-wild analysis reveals consistent and temporally stable risk profiles across communities, with marked differences between r/depression and r/anxiety. Our findings demonstrate the feasibility of a cost-effective, scalable approach for large-scale psychological monitoring.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…