Indicators for characterising online hate speech and its automatic detection

Abstract

We examined four case studies of hate speech on Italian-language Twitter from 2019 to 2020, comparing the classification of 3,600 tweets by expert pedagogists with the automatic classification produced by machine learning algorithms. The pedagogists used a novel classification scheme based on seven indicators that characterize hate: the content is public, it affects a target group, it contains hate speech in explicit verbal form, it will not redeem, it has intention to harm, it can have a possible violent response, and it incites hatred and violence. The case studies concern four target groups: Jews, Muslims, Roma, and immigrants. We find that not all types of hateful content are equally detectable by the machine learning algorithms we considered. In particular, the algorithms perform better at identifying tweets that incite hatred and violence and tweets that can have a possible violent response.
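A per-indicator comparison between expert labels and algorithm output could be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the abstract does not say which algorithms or evaluation metric were used, so the F1 score, the indicator identifiers, and the helper names `labels` and `f1_per_indicator` are all assumptions introduced here.

```python
# Hypothetical identifiers paraphrasing the seven indicators in the abstract.
INDICATORS = [
    "public",
    "targets_group",
    "explicit_verbal_form",
    "will_not_redeem",
    "intent_to_harm",
    "possible_violent_response",
    "incites_hatred_violence",
]


def labels(*positive):
    """Build one tweet's annotation: True for each named indicator, else False."""
    unknown = set(positive) - set(INDICATORS)
    if unknown:
        raise ValueError(f"unknown indicators: {sorted(unknown)}")
    return {name: name in positive for name in INDICATORS}


def f1_per_indicator(expert, predicted):
    """Compare expert and predicted annotations, one F1 score per indicator.

    `expert` and `predicted` are equal-length lists of dicts as built by
    `labels`. F1 is an assumed metric chosen for illustration.
    """
    if len(expert) != len(predicted):
        raise ValueError("expert and predicted must have the same length")
    scores = {}
    for name in INDICATORS:
        pairs = [(e[name], p[name]) for e, p in zip(expert, predicted)]
        tp = sum(1 for e, p in pairs if e and p)        # both flag the indicator
        fp = sum(1 for e, p in pairs if not e and p)    # only the algorithm does
        fn = sum(1 for e, p in pairs if e and not p)    # only the expert does
        denom = 2 * tp + fp + fn
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores


# Toy data, invented purely to exercise the function.
expert = [labels(*INDICATORS), labels()]
predicted = [labels(*INDICATORS), labels("incites_hatred_violence")]
scores = f1_per_indicator(expert, predicted)
```

Scoring each indicator separately, rather than a single hateful/not-hateful label, is what would make it possible to see that some indicators are easier to detect automatically than others.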
