Account-History Features for Social Bot Detection in the Era of Large Language Models

Abstract

Bot detection on social platforms has historically relied on a mix of account-metadata features and features extracted from the text of posts and profile fields. The arrival of capable language models complicates the latter. A bot operator can run every post through GPT-4 or Claude and produce text whose surface statistics are difficult to distinguish from those of human writing, which weakens the predictive value of content-derived features. This paper asks how much of the detection problem can be solved by features that an attacker cannot easily manipulate at low cost: the age of the account, follower and friend counts and their ratios, profile completeness, and the structural properties of the handle. On a publicly redistributed corpus of 2,432 Twitter accounts with manually verified labels (43.0% bots), a random forest using only these account-history features achieves ROC-AUC of 0.977 in five-fold cross-validation, against 0.830 for a content-only baseline and 0.981 for the fusion model. The behavioral-versus-content gap is large and statistically significant by DeLong's test (z = 9.36, p < 0.001). We then evaluate two adversarial settings. In the first, we rewrite the text of bot tweets to match human surface statistics for URLs, hashtags, mentions, and casing; the content classifier's ROC-AUC degrades from 0.842 to 0.785 while the behavioral classifier is essentially unchanged. In the second, more aggressive setting we directly perturb the content feature values toward the human distribution; the content classifier falls below chance (AUC 0.466) while behavioral performance is invariant. We replicate the score distribution qualitatively on a 100-account sample of TwiBot-20. We conclude that operational bot detection should not treat content features as the primary signal; account-history features carry most of the load already and are not eroded by adversarial text rewriting.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…