Better Streaming Algorithms for the Maximum Coverage Problem
Abstract
We study the classic NP-Hard problem of finding the maximum k-set coverage in the data stream model: given a set system of m sets that are subsets of a universe \1,…,n \, find the k sets that cover the most number of distinct elements. The problem can be approximated up to a factor 1-1/e in polynomial time. In the streaming-set model, the sets and their elements are revealed online. The main goal of our work is to design algorithms, with approximation guarantees as close as possible to 1-1/e, that use sublinear space o(mn). Our main results are: Two (1-1/e-ε) approximation algorithms: One uses O(ε-1) passes and O(ε-2 k) space whereas the other uses only a single pass but O(ε-2 m) space. We show that any approximation factor better than (1-(1-1/k)k) in constant passes requires (m) space for constant k even if the algorithm is allowed unbounded processing time. We also demonstrate a single-pass, (1-ε) approximation algorithm using O(ε-2 m · (k,ε-1)) space. We also study the maximum k-vertex coverage problem in the dynamic graph stream model. In this model, the stream consists of edge insertions and deletions of a graph on N vertices. The goal is to find k vertices that cover the most number of distinct edges. We show that any constant approximation in constant passes requires (N) space for constant k whereas O(ε-2N) space is sufficient for a (1-ε) approximation and arbitrary k in a single pass. For regular graphs, we show that O(ε-3k) space is sufficient for a (1-ε) approximation in a single pass. We generalize this to a (-ε) approximation when the ratio between the minimum and maximum degree is bounded below by .
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.