Engineering Shared-Memory Parallel Shuffling to Generate Random Permutations In-Place
Abstract
Shuffling is the process of rearranging a sequence of elements into a random order such that any permutation occurs with equal probability. It is an important building block in a plethora of techniques used in virtually all scientific areas. Consequently considerable work has been devoted to the design and implementation of shuffling algorithms. We engineer, -- to the best of our knowledge -- for the first time, a practically fast, parallel shuffling algorithm with n n parallel depth that requires only poly-logarithmic auxiliary memory. Our reference implementations in Rust are freely available, easy to include in other projects, and can process large data sets approaching the size of the system's memory. In an empirical evaluation, we compare our implementations with a number of existing solutions on various computer architectures. Our algorithms consistently achieve the highest through-put on all machines. Further, we demonstrate that the runtime of our parallel algorithm is comparable to the time that other algorithms may take to acquire the memory from the operating system to copy the input.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.