In-Order Sliding-Window Aggregation in Worst-Case Constant Time
Abstract
Sliding-window aggregation is a widely used approach for extracting insights from the most recent portion of a data stream. The aggregations of interest can usually be expressed as binary operators that are associative but not necessarily commutative or invertible. Non-invertible operators, however, are difficult to support efficiently. In a 2017 conference paper, we introduced DABA, the first algorithm for sliding-window aggregation with worst-case constant time. Before DABA, if a window had size n, the best published algorithms would require O(log n) aggregation steps per window operation. For strictly in-order streams, this bound could be improved to O(1) aggregation steps on average, but it was not known how to achieve an O(1) bound in the worst case, which is critical for latency-sensitive applications. This article is an extended version of our 2017 paper. Besides describing DABA in more detail, this article introduces a new variant, DABA Lite, which achieves the same time bounds in less memory. Whereas DABA requires space for storing 2n partial aggregates, DABA Lite only requires space for n+2 partial aggregates. Our experiments on synthetic and real data support the theoretical findings.
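To make the distinction between average-case and worst-case O(1) concrete, the following is a minimal sketch of the well-known two-stacks technique for in-order (FIFO) windows. It is not DABA or DABA Lite, and it is not taken from the paper; the class name `TwoStacksWindow` and its method names are hypothetical, chosen only for illustration. It needs only an associative operator and an identity element, so it works for operators that are neither invertible (such as max) nor commutative (such as string concatenation).

```python
from typing import Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class TwoStacksWindow(Generic[T]):
    """Illustrative FIFO window aggregator (hypothetical name, not DABA).

    Uses an amortized O(1) number of combine calls per operation,
    but a single evict can cost O(n) when the stacks are flipped.
    """

    def __init__(self, combine: Callable[[T, T], T], identity: T) -> None:
        self.combine = combine
        self.identity = identity
        # Each entry is (value, partial aggregate).
        # back: newest on top; aggregate covers the bottom of back up to this entry.
        # front: oldest on top; aggregate covers this entry down to the bottom of front.
        self.front: List[Tuple[T, T]] = []
        self.back: List[Tuple[T, T]] = []

    def _top_agg(self, stack: List[Tuple[T, T]]) -> T:
        return stack[-1][1] if stack else self.identity

    def insert(self, value: T) -> None:
        """Append the newest element: exactly one combine call."""
        agg = self.combine(self._top_agg(self.back), value)
        self.back.append((value, agg))

    def evict(self) -> None:
        """Remove the oldest element; flips the stacks if front is empty."""
        if not self.front:
            if not self.back:
                raise IndexError("evict from empty window")
            # The flip: O(n) combine calls in one operation (the latency spike).
            while self.back:
                value, _ = self.back.pop()
                agg = self.combine(value, self._top_agg(self.front))
                self.front.append((value, agg))
        self.front.pop()

    def query(self) -> T:
        """Aggregate of the whole window, oldest to newest: one combine call."""
        return self.combine(self._top_agg(self.front), self._top_agg(self.back))


# Usage with a non-invertible operator (max).
window = TwoStacksWindow(max, float("-inf"))
for x in (3, 1, 4):
    window.insert(x)
window.evict()          # removes 3, triggering a flip
print(window.query())   # aggregate of the remaining window [1, 4]
```

In this sketch, `insert` and `query` each make one combine call, while `evict` occasionally moves every element from one stack to the other. Each element is moved at most once, which gives the O(1) average, but the single expensive flip is exactly the kind of latency spike that a worst-case O(1) bound rules out.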