Optimal level set estimation for non-parametric tournament and crowdsourcing problems
Abstract
Motivated by crowdsourcing, we consider a problem where we partially observe the correctness of the answers of n experts on d questions. In this paper, we assume that both the experts and the questions can be ordered, namely that the matrix M containing the probability that expert i answers question j correctly is bi-isotonic up to a permutation of its rows and columns. When n = d, this also encompasses the strongly stochastic transitive (SST) model from the tournament literature. Here, we focus on the relevant problem of distinguishing small entries of M from large ones, which is key in crowdsourcing for efficient allocation of workers to questions. More precisely, we aim at recovering one (or several) level sets p of the matrix up to a precision h, namely recovering, respectively, the sets of positions (i,j) in M such that M_ij > p + h and M_ij < p - h. We consider, as a loss measure, the number of misclassified entries. As our main result, we construct an efficient polynomial-time algorithm that turns out to be minimax optimal for this classification problem. This contrasts sharply with the existing literature on the SST model where, for the stronger reconstruction loss, statistical-computational gaps have been conjectured. More generally, this sheds light on the nature of statistical-computational gaps for permutation models.
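To make the loss concrete, the following is a minimal sketch of how the number of misclassified entries could be counted for a single level p and precision h. It is an illustration only and not the paper's algorithm: the function name `level_set_loss`, the +1/-1 label encoding, the example matrix, and the choice to ignore entries falling inside the band [p - h, p + h] are all assumptions made here.

```python
def level_set_loss(M, labels, p, h):
    """Count misclassified entries of M for level p and precision h.

    Assumed encoding (hypothetical): labels[i][j] is +1 when entry (i, j)
    is declared above the level and -1 when it is declared below.
    Entries with p - h <= M[i][j] <= p + h are never counted as errors.
    """
    errors = 0
    for row_m, row_l in zip(M, labels):
        for m, lab in zip(row_m, row_l):
            if m > p + h and lab != 1:
                errors += 1  # clearly large entry declared small
            elif m < p - h and lab != -1:
                errors += 1  # clearly small entry declared large
    return errors


# Hypothetical 3x3 bi-isotonic matrix: non-increasing along rows and columns
# (rows and columns are assumed to be already sorted).
M = [
    [0.9, 0.8, 0.6],
    [0.8, 0.5, 0.3],
    [0.6, 0.3, 0.1],
]
p, h = 0.5, 0.15

good_labels = [
    [1, 1, 1],
    [1, 1, -1],
    [-1, -1, -1],
]
# Same labeling, except the largest entry is wrongly declared small.
bad_labels = [
    [-1, 1, 1],
    [1, 1, -1],
    [-1, -1, -1],
]

good_loss = level_set_loss(M, good_labels, p, h)
bad_loss = level_set_loss(M, bad_labels, p, h)
```

In this toy example the entries equal to 0.5 and 0.6 lie inside the band, so either label is accepted for them; only entries clearly above p + h or clearly below p - h can contribute to the count.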