Resource Oblivious Sorting on Multicores
Abstract
We present a deterministic sorting algorithm, SPMS (Sample, Partition, and Merge Sort), that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts n elements in O(n log n) time cache-obliviously with an optimal number of cache misses. The parallel complexity (or critical path length) of the algorithm is O(log n · log log n), which improves on previous bounds for optimal cache-oblivious sorting. The algorithm also has low false sharing costs. When scheduled by a work-stealing scheduler in a multicore computing environment with a global shared memory and p cores, each having a cache of size M organized in blocks of size B, the costs of the additional cache misses and false sharing misses due to this parallel execution are bounded by the cost of O(S · M/B) and O(S · B) cache misses respectively, where S is the number of steals performed during the execution. Finally, SPMS is resource oblivious in that the dependence on machine parameters appears only in the analysis of its performance, and not within the algorithm itself.
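To get a feel for the two overhead bounds, the following minimal sketch evaluates the quantities inside O(S · M/B) and O(S · B) for given machine parameters. It is an illustration only: the function name `steal_overhead_terms` is hypothetical, the constants hidden by the O-notation are assumed to be 1, and the sample values of S, M, and B are made up rather than taken from the paper.

```python
def steal_overhead_terms(steals, cache_size, block_size):
    """Return (S * M / B, S * B), the terms inside the two O(.) bounds.

    Assumption: hidden constants are taken to be 1, so the results are
    orders of magnitude, not predicted miss counts.
    """
    if steals < 0 or cache_size <= 0 or block_size <= 0:
        raise ValueError("steals must be >= 0; cache_size and block_size must be > 0")
    extra_cache_misses = steals * cache_size / block_size  # S * M / B
    false_sharing_misses = steals * block_size             # S * B
    return extra_cache_misses, false_sharing_misses


# Hypothetical machine: M = 32768 words per cache, B = 8 words per block,
# and an execution that incurs S = 1000 steals.
cache_term, sharing_term = steal_overhead_terms(1000, 32768, 8)
print(cache_term, sharing_term)
```

Both terms grow linearly in S, so under these bounds the overhead of parallel execution is governed by how many steals the scheduler performs, not directly by n.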