Tight Bounds for Low-Error Frequency Moment Estimation and the Power of Multiple Passes
Abstract
Estimating the second frequency moment $F_2$ of a data stream up to a $(1 \pm \varepsilon)$ factor is a central problem in the streaming literature. For errors $\varepsilon > \Omega(1/\sqrt{n})$, the tight bound $\Theta(\log(\varepsilon^2 n)/\varepsilon^2)$ was recently established by Braverman and Zamir. In this work, we complete the picture by resolving the remaining regime of small error, $\varepsilon < 1/\sqrt{n}$, showing that the optimal space complexity is $\Theta\left(\min\left(n, \frac{1}{\varepsilon^2}\right) \cdot \left(1 + \left|\log(\varepsilon^2 n)\right|\right)\right)$ bits for all $\varepsilon \geq 1/n^2$, assuming a sufficiently large universe. This closes the gap between the best known $\Omega(n)$ lower bound and the straightforward $O(n \log n)$ upper bound in that range, and shows that essentially storing the entire stream is necessary for high-precision estimation. To derive this bound, we fully characterize the two-party communication complexity of estimating the size of a set intersection up to an arbitrary additive error $\varepsilon n$. In particular, we prove a tight $\Omega(n \log n)$ lower bound for one-way communication protocols when $\varepsilon < n^{-1/2-\Omega(1)}$, in contrast to classical $O(n)$-bit protocols that use two-way communication. Motivated by this separation, we present a two-pass streaming algorithm that computes the exact histogram of a stream with high probability using only $O(n \log\log n)$ bits of space, in contrast to the $\Omega(n \log n)$ bits required in one pass even to approximate $F_2$ with small error. This yields the first asymptotic separation between one-pass and $O(1)$-passes space complexity for small frequency moment estimation.
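For readers new to the problem, the following is a minimal sketch of the quantity being estimated and of the straightforward baseline the abstract refers to: store the full histogram of the stream and compute $F_2$ exactly as the sum of squared item frequencies. It is an illustration only, not the paper's algorithm. The function names `exact_f2` and `within_relative_error` are hypothetical, and the sketch assumes that $n$ denotes the stream length and that item identifiers fit in $O(\log n)$ bits, so that the histogram takes roughly $O(n \log n)$ bits.

```python
from collections import Counter


def exact_f2(stream):
    """Return F2, the sum of squared item frequencies.

    Baseline only: this keeps one counter per distinct item, i.e. it
    stores the whole histogram rather than a small sketch.
    """
    histogram = Counter(stream)
    return sum(count * count for count in histogram.values())


def within_relative_error(estimate, truth, eps):
    """Check the (1 +/- eps) multiplicative guarantee for an estimate."""
    return (1 - eps) * truth <= estimate <= (1 + eps) * truth


if __name__ == "__main__":
    stream = ["a", "b", "a", "c", "a"]
    f2 = exact_f2(stream)  # frequencies are 3, 1, 1
    print(f2)
    print(within_relative_error(11.5, f2, 0.05))
```

The lower bound stated in the abstract says that, for sufficiently small $\varepsilon$, a one-pass algorithm cannot use asymptotically less space than this kind of exact storage.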