A Channel Coding Perspective of Collaborative Filtering

Abstract

We consider the problem of collaborative filtering from a channel coding perspective. We model the underlying rating matrix as a finite alphabet matrix with block constant structure. The observations are obtained from this underlying matrix through a discrete memoryless channel with a noisy part representing noisy user behavior and an erasure part representing missing data. Moreover, the clusters over which the underlying matrix is constant are unknown. We establish a sharp threshold result for this model: if the largest cluster size is smaller than C1 (mn) (where the rating matrix is of size m × n), then the underlying matrix cannot be recovered with any estimator, but if the smallest cluster size is larger than C2 (mn), then we show a polynomial time estimator with diminishing probability of error. In the case of uniform cluster size, not only the order of the threshold, but also the constant is identified.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…