The Effect of Class Imbalance and Order on Crowdsourced Relevance Judgments
Abstract
In this paper we study the effect on crowd worker efficiency and effectiveness of the dominance of one class in the data they process. We aim at understanding if there is any positive or negative bias in workers seeing many negative examples in the identification of positive labels. To test our hypothesis, we design an experiment where crowd workers are asked to judge the relevance of documents presented in different orders. Our findings indicate that there is a significant improvement in the quality of relevance judgements when presenting relevant results before the non-relevant ones.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.