Approximating Large Frequency Moments with Pick-and-Drop Sampling

Abstract

Given data stream $D = \{p_1, p_2, \dots, p_m\}$ of size $m$ of numbers from $\{1, \dots, n\}$, the frequency of $i$ is defined as $f_i = |\{j : p_j = i\}|$. The $k$-th frequency moment of $D$ is defined as $F_k = \sum_{i=1}^{n} f_i^k$. We consider the problem of approximating frequency moments in insertion-only streams for $k \ge 3$. For any constant $c$ we show an $O(n^{1-2/k} \log(n) \log^{(c)}(n))$ upper bound on the space complexity of the problem. Here $\log^{(c)}(n)$ is the iterative $\log$ function. To simplify the presentation, we make the following assumptions: $n$ and $m$ are polynomially far; approximation error $\epsilon$ and parameter $k$ are constants. We observe a natural bijection between streams and special matrices. Our main technical contribution is a non-uniform sampling method on matrices. We call our method a pick-and-drop sampling; it samples a heavy element (i.e., element $i$ with frequency $\Omega(F_k)$) with probability $\Omega(1/n^{1-2/k})$ and gives approximation $\tilde{f}_i \ge (1-\epsilon) f_i$. In addition, the estimations never exceed the real values, that is $\tilde{f}_j \le f_j$ for all $j$. As a result, we reduce the space complexity of finding a heavy element to $O(n^{1-2/k} \log(n))$ bits. We apply our method of recursive sketches and resolve the problem with $O(n^{1-2/k} \log(n) \log^{(c)}(n))$ bits.
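
To make the definitions concrete, the following is a minimal Python sketch that computes the frequencies $f_i$ and the moment $F_k$ exactly. It is only an illustrative baseline: it stores one counter per distinct element, so it does not achieve the sublinear space bound above, and it is not the pick-and-drop sampling method. The function names `frequencies` and `frequency_moment` and the toy stream are assumptions introduced here for illustration.

```python
from collections import Counter


def frequencies(stream):
    """Return a mapping i -> f_i, where f_i = |{j : p_j = i}|."""
    return Counter(stream)


def frequency_moment(stream, k):
    """Exact k-th frequency moment F_k = sum_i f_i^k.

    Uses one counter per distinct element, so space grows with the
    number of distinct elements (unlike a streaming approximation).
    """
    return sum(f ** k for f in frequencies(stream).values())


# Toy insertion-only stream over {1, 2, 3} with m = 6 (hypothetical data).
stream = [1, 2, 1, 3, 1, 2]
freqs = frequencies(stream)          # f_1 = 3, f_2 = 2, f_3 = 1
F3 = frequency_moment(stream, 3)     # 3^3 + 2^3 + 1^3 = 36
```

In this toy stream, element 1 contributes $3^3 = 27$ of $F_3 = 36$, which illustrates why a single high-frequency element can dominate a large moment.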
