Harmonic Decomposition in Data Sketches
Abstract
In the turnstile streaming model, a dynamic vector $x = (x_1, \ldots, x_n) \in \mathbb{Z}^n$ is updated by a stream of entry-wise increments/decrements. Let $f : \mathbb{Z} \to \mathbb{R}_+$ be a symmetric function with $f(0) = 0$. The $f$-moment of $x$ is defined to be $f(x) := \sum_{v \in [n]} f(x_v)$. We revisit the problem of constructing a universal sketch that can estimate many different $f$-moments. Previous constructions of universal sketches rely on the technique of sampling with respect to the $L_0$-mass (uniform samples) or $L_2$-mass ($L_2$-heavy-hitters), whose universality comes from being able to evaluate the function $f$ over the samples. In this work we take a new approach to constructing a universal sketch that does not use any explicit samples but relies on the harmonic structure of the target function $f$. The new sketch (SymmetricPoissonTower) embraces hash collisions instead of avoiding them, which saves multiple $\log n$ factors in space, e.g., when estimating all $L_p$-moments ($f(z) = |z|^p$, $p \in [0,2]$). For many nearly periodic functions, the new sketch is exponentially more efficient than sampling-based methods. We conjecture that the SymmetricPoissonTower sketch is the universal sketch that can estimate every tractable function $f$.
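To make the definitions concrete, the following minimal Python sketch computes an $f$-moment exactly under turnstile updates. It is only an illustration of the quantity being estimated: it stores the whole vector, so it is not a sketch in the streaming sense and is not the SymmetricPoissonTower construction. The helper names `apply_stream`, `f_moment`, and `lp`, the example stream, and the convention $0^0 = 0$ (used so that $f(0) = 0$ holds for $p = 0$) are assumptions made for this example.

```python
from collections import defaultdict


def apply_stream(updates):
    """Apply turnstile updates (index, delta) and return the nonzero entries of x."""
    x = defaultdict(int)
    for v, delta in updates:
        x[v] += delta          # entry-wise increment or decrement
        if x[v] == 0:
            del x[v]           # keep only nonzero coordinates
    return dict(x)


def f_moment(x, f):
    """Exact f-moment: sum of f(x_v). Zero entries are omitted since f(0) = 0."""
    return sum(f(val) for val in x.values())


def lp(p):
    """Return f(z) = |z|^p, with the assumed convention f(0) = 0 even when p = 0."""
    return lambda z: 0.0 if z == 0 else float(abs(z)) ** p


# Hypothetical stream: coordinate 0 is cancelled, coordinate 1 ends at -4, coordinate 2 at 5.
stream = [(0, 3), (1, -2), (2, 5), (0, -3), (1, -2)]
x = apply_stream(stream)

for p in (0, 1, 2):
    print(f"L_{p}-moment:", f_moment(x, lp(p)))
```

Because each $f$ here is symmetric, the sign of an entry does not affect the result, and the cancelled coordinate contributes nothing. A universal sketch aims to approximate all of these moments from a single small summary rather than from the stored vector.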