Continuous monitoring of $\ell_p$ norms in data streams
Abstract
In insertion-only streaming, one sees a sequence of indices $a_1, a_2, \ldots, a_m \in [n]$. The stream defines a sequence of $m$ frequency vectors $x^{(1)}, \ldots, x^{(m)} \in \mathbb{R}^n$ with $(x^{(t)})_i = |\{j : j \in [t],\ a_j = i\}|$. That is, $x^{(t)}$ is the frequency vector after seeing the first $t$ items in the stream. Much work in the streaming literature focuses on estimating some function $f(x^{(m)})$. Many applications, though, require obtaining estimates at time $t$ of $f(x^{(t)})$, for every $t \in [m]$. Naively, this guarantee is obtained by devising an algorithm with failure probability $\ll 1/m$, then performing a union bound over all stream updates to guarantee that all $m$ estimates are simultaneously accurate with good probability. When $f(x)$ is some $\ell_p$ norm of $x$, recent works have shown that this union bound is wasteful and that better space complexity is possible for the continuous monitoring problem, with the strongest known results being for $p = 2$ [HTY14, BCIW16, BCINWW17]. In this work, we improve the state of the art for all $0 < p < 2$, which we obtain via a novel analysis of Indyk's $p$-stable sketch [Indyk06].
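To make the continuous monitoring setting concrete, the following is a minimal illustrative sketch for the special case $p = 1$, where the $p$-stable distribution is the standard Cauchy. It maintains the linear sketch $y = S x^{(t)}$ and reports the median of $|y_r|$ as an estimate of $\|x^{(t)}\|_1$ after every update. The function names `make_cauchy_sketch` and `monitor_l1`, the row count `k`, and the 0-based item indices are assumptions made for illustration. Storing the whole matrix `S` is a simplification that ignores the space-efficient derandomization a real streaming algorithm would need, and the code does not reproduce the paper's analysis or its bounds.

```python
import math
import random
import statistics


def make_cauchy_sketch(n, k, seed=0):
    """Return a k x n matrix of i.i.d. standard Cauchy entries (1-stable).

    Assumption: the matrix is stored explicitly for clarity, which costs
    O(n * k) memory and is not how a space-efficient algorithm would do it.
    """
    rng = random.Random(seed)
    return [
        [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
        for _ in range(k)
    ]


def monitor_l1(stream, n, k=400, seed=0):
    """Estimate ||x^(t)||_1 after every update of an insertion-only stream.

    stream: iterable of item indices in range(n) (0-based here, [n] in the text).
    Returns a list whose entry t-1 is the estimate after the first t items.
    """
    S = make_cauchy_sketch(n, k, seed)
    y = [0.0] * k  # y = S x^(t), updated incrementally
    estimates = []
    for a in stream:
        # Inserting item a adds column a of S to the sketch.
        for r in range(k):
            y[r] += S[r][a]
        # Each y_r is distributed as ||x||_1 times a standard Cauchy, and the
        # median of |Cauchy| is 1, so the median of |y_r| estimates ||x||_1.
        estimates.append(statistics.median([abs(v) for v in y]))
    return estimates


if __name__ == "__main__":
    rng = random.Random(1)
    n = 20
    stream = [rng.randrange(n) for _ in range(200)]
    est = monitor_l1(stream, n)
    # In an insertion-only stream the true l1 norm after t items is exactly t.
    for t in (1, 10, 100, 200):
        print(t, round(est[t - 1], 2))
```

Because every estimate is read from the same sketch, the errors at different times are correlated rather than independent. The naive approach treats each of the $m$ time steps separately and pays for a union bound over all of them, which is the cost that the continuous monitoring results aim to avoid.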