Maximum Coverage in the Data Stream Model: Parameterized and Generalized
Abstract
We present algorithms for the Max-Cover and Max-Unique-Cover problems in the data stream model. The input to both problems is a collection of $m$ subsets of a universe of size $n$ and a value $k \in [m]$. In Max-Cover, the problem is to find a collection of at most $k$ sets such that the number of elements covered by at least one set is maximized. In Max-Unique-Cover, the problem is to find a collection of at most $k$ sets such that the number of elements covered by exactly one set is maximized. Our goal is to design single-pass algorithms that use space that is sublinear in the input size. Our main algorithmic results are: If the sets have size at most $d$, there exist single-pass algorithms using $O(d^{d+1} k^d)$ space that solve both problems exactly. This is optimal up to polylogarithmic factors for constant $d$. If each element appears in at most $r$ sets, we present single-pass algorithms using $O(k^2 r/\epsilon^3)$ space that return a $(1+\epsilon)$-approximation in the case of Max-Cover. We also present a single-pass algorithm using slightly more memory, i.e., $O(k^3 r/\epsilon^4)$ space, that $(1+\epsilon)$-approximates Max-Unique-Cover. In contrast to the above results, when $d$ and $r$ are arbitrary, any constant-pass $(1+\epsilon)$-approximation algorithm for either problem requires $\Omega(\epsilon^{-2} m)$ space, but a single-pass $O(\epsilon^{-2} m k)$ space algorithm exists. In fact, any constant-pass algorithm with an approximation better than $e/(e-1)$ and $e^{1-1/k}$ for Max-Cover and Max-Unique-Cover, respectively, requires $\Omega(m/k^2)$ space when $d$ and $r$ are unrestricted. En route, we also obtain an algorithm for a parameterized version of the streaming Set-Cover problem.
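To make the two objectives concrete, the following is a minimal offline sketch, not one of the paper's streaming algorithms. It evaluates both objectives by exhaustive search on a tiny instance; the function names (`coverage`, `unique_coverage`, `best_collection`) and the example sets are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

def coverage(chosen):
    """Max-Cover objective: elements covered by at least one chosen set."""
    return len(set().union(*chosen))

def unique_coverage(chosen):
    """Max-Unique-Cover objective: elements covered by exactly one chosen set."""
    counts = Counter(x for s in chosen for x in s)
    return sum(1 for c in counts.values() if c == 1)

def best_collection(sets, k, objective):
    """Exhaustive (exponential-time) search over collections of at most k sets.

    This is an illustrative baseline only; it stores the whole input and is
    not space-efficient in the streaming sense.
    """
    best_value, best_choice = 0, ()
    for size in range(1, k + 1):
        for choice in combinations(range(len(sets)), size):
            value = objective([sets[i] for i in choice])
            if value > best_value:  # strict: keep the first optimum found
                best_value, best_choice = value, choice
    return best_value, best_choice

# Hypothetical instance: m = 3 sets over a universe of size n = 6, with k = 3.
sets = [{1, 2, 3}, {3, 4, 5}, {5, 6, 1}]
print(best_collection(sets, 3, coverage))         # (6, (0, 1, 2))
print(best_collection(sets, 3, unique_coverage))  # (4, (0, 1))
```

On this instance, taking all three sets maximizes coverage, but it lowers unique coverage because elements 1, 3, and 5 are each covered twice. This is why the two problems can have different optimal collections.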