Max-sum diversity via convex programming
Abstract
Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: given a ground set, constraints, and a function f(·) that measures the diversity of a subset, the task is to select a feasible subset S such that f(S) is maximized. A prominent diversification measure in this context is the sum-dispersion function f(S) = Σ_{x,y ∈ S} d(x,y), the sum of the pairwise distances in S. The corresponding diversity maximization problem is known as max-sum or sum-sum diversification. Many recent results deal with the design of constant-factor approximation algorithms for diversification problems involving the sum-dispersion function under a matroid constraint. In this paper, we present a PTAS for the max-sum diversification problem under a matroid constraint for distances d(·,·) of negative type. Examples of distances of negative type include metric distances stemming from the ℓ2 and ℓ1 norms, as well as the cosine (or spherical) distance and the Jaccard distance, which are popular similarity metrics in web and image search.
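To make the objective concrete, the following is a minimal Python sketch of the sum-dispersion function and of max-sum diversification under a cardinality constraint |S| = k, which is the special case of a uniform matroid. It is not the PTAS presented in the paper: the search is exhaustive and only practical for tiny inputs. The names `sum_dispersion`, `best_subset_of_size_k` and `l1` are hypothetical, and the sketch assumes the sum runs over unordered pairs (summing over ordered pairs would double every value without changing the maximizer).

```python
from itertools import combinations


def sum_dispersion(subset, dist):
    """Sum of d(x, y) over all unordered pairs {x, y} in subset (assumed convention)."""
    return sum(dist(x, y) for x, y in combinations(subset, 2))


def best_subset_of_size_k(ground_set, k, dist):
    """Exhaustive search over all k-subsets; illustration only, not the paper's method."""
    return max(combinations(ground_set, k), key=lambda s: sum_dispersion(s, dist))


def l1(x, y):
    """Distance induced by the l1 norm, one of the negative-type examples named above."""
    return sum(abs(a - b) for a, b in zip(x, y))


points = [(0, 0), (1, 0), (0, 1), (5, 5)]
best = best_subset_of_size_k(points, 2, l1)
print(best, sum_dispersion(best, l1))
```

With these four points and k = 2, the search selects the pair (0, 0) and (5, 5), the two points farthest apart in ℓ1 distance.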