Estimating small frequency moments of data stream: a characteristic function approach

Abstract

A data stream is viewed as a sequence of M updates of the form (index, i, v) to an n-dimensional integer frequency vector f, where each update changes f_i to f_i + v and v is an integer assumed to lie in {-m, ..., m}. The pth frequency moment F_p is defined as F_p = Σ_{i=1}^{n} |f_i|^p. We consider the problem of estimating F_p to within a multiplicative approximation factor of 1 ± ε, for p ∈ [0,2]. Several estimators have been proposed for this problem, including Indyk's median estimator [indy:focs00], Li's geometric means estimator [pinglib:2006], and an -based estimator [gc:random07]. The first two estimators require space Õ(ε^{-2}), where the Õ notation hides polylogarithmic factors in ε^{-1}, m, n and M. Recently, Kane, Nelson and Woodruff [knw:soda10] presented a novel, space-optimal estimator called the log-cosine estimator. In this paper, we present an elementary analysis of the log-cosine estimator in a stand-alone setting; the analysis in [knw:soda10] is more complicated.
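
To make the update model and the definition of F_p concrete, here is a minimal Python sketch that applies a list of (i, v) updates and computes F_p exactly. It is only a reference baseline, not any of the estimators named above: it stores every touched coordinate, so its space grows with the number of distinct indices rather than with ε^{-2}. The function names `apply_updates` and `frequency_moment` are hypothetical, indices are taken to run from 1 to n, and F_0 is assumed to follow the usual convention of counting the nonzero coordinates.

```python
from collections import defaultdict


def apply_updates(updates, n):
    """Apply (i, v) updates to an n-dimensional frequency vector.

    The vector is stored sparsely as {index: frequency}; indices are
    assumed to be 1-based, matching the sum over i = 1..n.
    """
    f = defaultdict(int)
    for i, v in updates:
        if not 1 <= i <= n:
            raise ValueError(f"index {i} outside 1..{n}")
        f[i] += v  # each update changes f_i to f_i + v (v may be negative)
    return f


def frequency_moment(f, p):
    """Exact F_p = sum over i of |f_i|^p, for p in [0, 2].

    Assumption: for p = 0, zero coordinates contribute 0, so F_0 is the
    number of nonzero coordinates.
    """
    if not 0 <= p <= 2:
        raise ValueError("p is expected to lie in [0, 2]")
    nonzero = [abs(x) for x in f.values() if x != 0]
    if p == 0:
        return len(nonzero)
    return sum(x ** p for x in nonzero)


# Illustrative stream over n = 3 coordinates, with insertions and deletions.
updates = [(1, 3), (3, -5), (1, -1), (2, 4), (3, 5)]
f = apply_updates(updates, n=3)   # f_1 = 2, f_2 = 4, f_3 = 0

f0 = frequency_moment(f, 0)       # 2 nonzero coordinates
f1 = frequency_moment(f, 1)       # |2| + |4| = 6
f2 = frequency_moment(f, 2)       # 4 + 16 = 20
```

Note that the third coordinate returns to zero after a deletion, which is why the absolute value and the treatment of zero entries matter in this model.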
