Enhance Multimodal Model Performance with Data Augmentation: Facebook Hateful Meme Challenge Solution

Abstract

Hateful content detection is an area where deep learning can and should make a significant difference. Facebook's Hateful Memes Challenge works toward that potential by asking contestants to detect hateful speech in multimodal memes using deep learning algorithms. In this paper, we use two pre-trained multimodal models, ViLBERT and Visual BERT. We improved the models' performance by adding training data generated through data augmentation. Enlarging the training set gave a boost of more than 2% in AUROC with the Visual BERT model. Our approach achieved an AUROC of 0.7439 and an accuracy of 0.7037 on the challenge's test set, which we consider substantial progress.
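
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how AUROC and accuracy can be computed from binary labels and model scores. It is not the authors' evaluation code: the function names `auroc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def auroc(labels, scores):
    """Pairwise AUROC: the fraction of (positive, negative) pairs in which
    the positive example gets the higher score. Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples whose thresholded score matches the label.
    The 0.5 threshold is an illustrative assumption."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (hypothetical, not from the challenge): 1 = hateful, 0 = not hateful.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.1]

toy_auroc = auroc(toy_labels, toy_scores)
toy_accuracy = accuracy(toy_labels, toy_scores)
```

AUROC depends only on how the scores rank positives against negatives, whereas accuracy depends on the chosen threshold, which is why the two numbers can differ for the same model.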
