Approximate Convex Hull of Data Streams

Abstract

Given a finite set of points P ⊆ R^d, we would like to find a small subset S ⊆ P such that the convex hull of S approximately contains P. More formally, every point in P is within distance ε of the convex hull of S. Such a subset S is called an ε-hull. Computing an ε-hull is an important problem in computational geometry, machine learning, and approximation algorithms. In many real-world applications, the set P is too large to fit in memory. We consider the streaming model, in which the algorithm receives the points of P sequentially and strives to use a minimal amount of memory. Existing streaming algorithms for computing an ε-hull require O(ε^(-(d-1)/2)) space, which is optimal for a worst-case input. However, this bound ignores the structure of the data. The minimal size of an ε-hull of P, which we denote by OPT, can be much smaller. A natural question is whether a streaming algorithm can compute an ε-hull using only O(OPT) space. We begin with lower bounds showing that no single-pass streaming algorithm can compute an ε-hull with O(OPT) space. We instead propose three relaxations of the problem for which we can compute ε-hulls using space near-linear in the optimal size. Our first algorithm, for points in R^2 that arrive in random order, uses O(log n · OPT) space. Our second algorithm, for points in R^2, makes O(log(1/ε)) passes before outputting the ε-hull and requires O(OPT) space. Our third algorithm, for points in R^d for any fixed dimension d, outputs an ε-hull for all but a δ-fraction of directions and requires O(OPT · log OPT) space.
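The definition of an ε-hull above can be made concrete with a small offline check. The following is a minimal sketch, assuming points in R^2 given as (x, y) tuples and a non-empty S; it is not one of the streaming algorithms described in the abstract, and the function names (convex_hull, dist_to_hull, is_eps_hull) are hypothetical helpers introduced only for illustration. It verifies that every point of P lies within distance ε of the convex hull of S.

```python
from math import hypot

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive if o->a->b turns left."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def dist_to_segment(p, a, b):
    """Euclidean distance from point p to the segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    denom = dx * dx + dy * dy
    if denom == 0:
        t = 0.0
    else:
        t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / denom
        t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def dist_to_hull(p, hull):
    """Distance from p to the convex polygon `hull` (0 if p is inside)."""
    n = len(hull)
    if n == 0:
        raise ValueError("S must be non-empty")
    if n == 1:
        return hypot(p[0] - hull[0][0], p[1] - hull[0][1])
    edges = [(hull[i], hull[(i + 1) % n]) for i in range(n)]
    # Inside test: p is on the left of (or on) every counter-clockwise edge.
    if n >= 3 and all(cross(a, b, p) >= 0 for a, b in edges):
        return 0.0
    return min(dist_to_segment(p, a, b) for a, b in edges)

def is_eps_hull(S, P, eps):
    """True if every point of P is within distance eps of conv(S)."""
    hull = convex_hull(S)
    return all(dist_to_hull(p, hull) <= eps for p in P)

# Illustrative data (made up): four corners plus one point slightly outside.
S = [(0, 0), (4, 0), (4, 4), (0, 4)]
P = S + [(2, 4.5), (2, 2)]
print(is_eps_hull(S, P, 0.5), is_eps_hull(S, P, 0.4))
```

In this made-up example the point (2, 4.5) lies at distance 0.5 from the square spanned by S, so S is a 0.5-hull of P but not a 0.4-hull. The check stores all of P, so it illustrates only the definition, not the memory constraints of the streaming model.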
