Estimating small moments of data stream in nearly optimal space-time

Abstract

For each $p \in (0,2]$, we present a randomized algorithm that returns an $\epsilon$-approximation of the $p$th frequency moment of a data stream, $F_p = \sum_{i=1}^{n} f_i^p$. The algorithm requires space $O(\epsilon^{-2} \log(mM) (\log n))$ and processes each stream update in time $O((\log n)(\log \epsilon^{-1}))$. It is nearly optimal in terms of space (lower bound $\Omega(\epsilon^{-2} \log(mM))$) as well as time, and it is the first algorithm with these properties. The technique separates heavy hitters from the remaining items in the stream using an appropriate threshold and estimates the contributions of the heavy hitters and of the light elements to $F_p$ separately. A key component is the design of an unbiased estimator for $f_i^p$ whose data structure has low update time and low variance.
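To make the heavy/light decomposition concrete, here is a minimal sketch that computes $F_p$ exactly and splits it at a threshold. It is not the algorithm from the paper: it stores every frequency, so it uses linear space and has no sketching or randomized estimator. The function names, the use of $|f_i|$ to handle negative updates, and the threshold of half the total moment are all assumptions made for illustration.

```python
from collections import Counter


def apply_updates(updates):
    """Accumulate (item, delta) stream updates into exact frequencies."""
    freqs = Counter()
    for item, delta in updates:
        freqs[item] += delta
    return freqs


def frequency_moment(freqs, p):
    """Exact p-th frequency moment: the sum of |f_i|^p over all items."""
    return sum(abs(f) ** p for f in freqs.values())


def split_moment(freqs, p, threshold):
    """Split F_p into heavy and light parts.

    An item counts as heavy when |f_i|^p >= threshold. The threshold rule
    is a hypothetical choice for this sketch, not the paper's.
    """
    heavy = {i: f for i, f in freqs.items() if abs(f) ** p >= threshold}
    light = {i: f for i, f in freqs.items() if i not in heavy}
    return frequency_moment(heavy, p), frequency_moment(light, p)


# Toy stream with one deletion; final frequencies are a=8, b=0, c=1, d=2.
updates = [("a", 5), ("b", 1), ("c", 1), ("a", 3), ("b", -1), ("d", 2)]
freqs = apply_updates(updates)

p = 1.5
total = frequency_moment(freqs, p)
heavy_part, light_part = split_moment(freqs, p, threshold=0.5 * total)
print(total, heavy_part, light_part)
```

On this toy stream only item `a` crosses the threshold, and the heavy and light parts add back up to the full moment. The streaming algorithm described in the abstract instead estimates each part in small space without storing the full frequency vector.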
