Approximate counting with a floating-point counter

Abstract

Memory becomes a limiting factor in contemporary applications, such as analyses of the Webgraph and molecular sequences, when many objects need to be counted simultaneously. Robert Morris [Communications of the ACM, 21:840-842, 1978] proposed a probabilistic technique for approximate counting that is extremely space-efficient. The basic idea is to increment a counter containing the value $X$ with probability $2^{-X}$. As a result, the counter contains an approximation of $\lg n$ after $n$ probabilistic updates, stored in $\lg\lg n$ bits. Here we revisit the original idea of Morris, and introduce a binary floating-point counter that uses a $d$-bit significand in conjunction with a binary exponent. The counter yields a simple formula for an unbiased estimate of $n$ with a standard deviation of about $0.6 \cdot n 2^{-d/2}$, and uses $d + \lg\lg n$ bits. We analyze the floating-point counter's performance in a general framework that applies to any probabilistic counter, and derive practical formulas to assess its accuracy.
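The counting scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the update rule of the floating-point counter, so the sketch assumes one plausible reading: the counter X is split into an exponent t (the high bits) and a d-bit significand u (the low d bits), and X is incremented with probability 2^-t. Under that assumed rule, summing the inverse increment probabilities of the states already passed gives the unbiased estimate (2^d + u)·2^t − 2^d. The function names `increment`, `estimate`, and `approximate_count` are hypothetical and chosen only for this illustration.

```python
import random


def increment(x, d, rng=random):
    """One probabilistic update of the counter x.

    Assumed rule (not stated in the abstract): with exponent t = x >> d,
    advance x with probability 2**-t.
    """
    t = x >> d
    if rng.random() < 2.0 ** -t:
        return x + 1
    return x


def estimate(x, d):
    """Estimate of the number of updates that produced counter value x.

    Equals the sum of 1 / (increment probability) over states 0..x-1,
    which simplifies to (2**d + u) * 2**t - 2**d.
    """
    m = 1 << d               # 2**d, the number of significand values
    t = x >> d               # exponent: high bits of x
    u = x & (m - 1)          # significand: low d bits of x
    return (m + u) * (1 << t) - m


def approximate_count(n, d, seed=0):
    """Apply n probabilistic updates and return the final counter value."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    x = 0
    for _ in range(n):
        x = increment(x, d, rng)
    return x


if __name__ == "__main__":
    n, d = 100_000, 8
    x = approximate_count(n, d)
    print("counter value:", x, "estimate of n:", estimate(x, d))
```

With d = 0 the sketch reduces to the Morris rule quoted in the abstract (increment with probability 2^-X, estimate 2^X − 1). For larger d, the first 2^d events are counted exactly because the exponent is still zero, and each additional significand bit trades one bit of memory for a smaller spread of the estimate.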
