Probability Based Clustering for Document and User Properties
Abstract
Information Retrieval systems can be improved by exploiting context information such as user and document features. This article presents a model based on overlapping probabilistic or fuzzy clusters for such features. The model is applied within a fusion method which linearly combines several retrieval systems. The fusion is based on weights for the different retrieval systems which are learned by exploiting relevance feedback information. This calculation can be improved by maintaining a model for each document and user cluster. That way, the optimal retrieval system for each document or user type can be identified and applied. The extension presented in this article allows overlapping, probabilistic clusters of features to further refine the process.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.