On Estimating the First Frequency Moment of Data Streams
Abstract
Estimating the first moment of a data stream, defined as $F_1 = \sum_{i \in \{1, 2, \ldots, n\}} |f_i|$, to within a relative error of $1 \pm \epsilon$ with high probability is a basic and influential problem in data stream processing. A tight space bound of $O(\epsilon^{-2} \log(mM))$ is known from the work of [Kane-Nelson-Woodruff-SODA10]. However, all known algorithms for this problem require per-update stream processing time of $\Omega(\epsilon^{-2})$, with the only exception being the algorithm of [Ganguly-Cormode-RANDOM07], which requires per-update processing time of $O(\log^2(mM) (\log n))$, albeit with sub-optimal space $O(\epsilon^{-3} \log^2(mM))$. In this paper, we present an algorithm for estimating $F_1$ that achieves near-optimality in both space and update processing time. The space requirement is $O(\epsilon^{-2} (\log n + (\log \epsilon^{-1}) \log(mM)))$ and the per-update processing time is $O((\log n) \log(\epsilon^{-1}))$.
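To make the quantity being estimated concrete, the following minimal sketch computes $F_1$ exactly from a stream of updates. It is not the algorithm of this paper: it is a naive baseline that stores every frequency and therefore uses space linear in the number of distinct items, which is exactly what small-space estimators avoid. The update format `(item, delta)`, meaning that `delta` is added to the frequency of `item`, and the function name `exact_f1` are assumptions made for illustration.

```python
from collections import defaultdict


def exact_f1(updates):
    """Return F1 = sum_i |f_i| for a stream of (item, delta) updates.

    Assumed model (not stated in the abstract): each update (item, delta)
    adds the signed integer delta to the frequency f_item.
    """
    freq = defaultdict(int)
    for item, delta in updates:
        freq[item] += delta
    # F1 sums the absolute values of the final frequencies, so
    # insertions and deletions of the same item cancel before the sum.
    return sum(abs(f) for f in freq.values())


# Hypothetical stream: final frequencies are f_1 = 2, f_2 = 0, f_3 = 5.
stream = [(1, 3), (2, -2), (1, -1), (3, 5), (2, 2)]
print(exact_f1(stream))  # prints 7
```

An estimator in the sense of the abstract would return a value within a factor of $1 \pm \epsilon$ of this exact result while storing far less than one counter per item.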