Weighted Matching in a Poly-Streaming Model
Abstract
We introduce the poly-streaming model, a generalization of streaming models of computation in which k processors process k data streams containing a total of N items. The algorithm is allowed O(f(k)· M1) space, where M1 is either o(N) or the space bound for a sequential streaming algorithm. Processors may communicate as needed. Algorithms are assessed by the number of passes, per-item processing time, total runtime, space usage, communication cost, and solution quality. We design a single-pass algorithm in this model for approximating the maximum weight matching (MWM) problem. Given k edge streams and a parameter > 0, the algorithm computes a (2+ε)-approximate MWM. We analyze its performance in a shared-memory parallel setting: for any constant > 0, it runs in time O(L+n), where n is the number of vertices and L is the maximum stream length. It supports O(1) per-edge processing time using O(k· n) space. We further generalize the design to hierarchical architectures, in which k processors are partitioned into r groups, each with its own shared local memory. The total intergroup communication is O(r · n) bits, while all other performance guarantees are preserved. We evaluate the algorithm on a shared-memory system using graphs with trillions of edges. It achieves substantial speedups as k increases and produces matchings with weights significantly exceeding the theoretical guarantee. On our largest test graph, it reduces runtime by nearly two orders of magnitude and memory usage by five orders of magnitude compared to an offline algorithm.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.