Hybrid Static/Dynamic Schedules for Tiled Polyhedral Programs

Abstract

Polyhedral compilers perform optimizations such as tiling and parallelization; when doing both, they usually generate code that executes "barrier-synchronized wavefronts" of tiles. We present a system to express and generate code for hybrid schedules, where some constraints are automatically satisfied through the structure of the code, and the remainder are dynamically enforced at run-time with data flow mechanisms. We prove bounds on the added overheads that are better, by at least one polynomial degree, than those of previous techniques. We propose a generic mechanism to implement the needed synchronization, and show it can be easily realized for a variety of targets: OpenMP, Pthreads, GPU (CUDA or OpenCL) code, languages like X10, Habanero, Cilk, as well as data flow platforms like DAGuE, and OpenStream and MPI. We also provide a simple concrete implementation that works without the need of any sophisticated run-time mechanism. Our experiments show our simple implementation to be competitive or better than the wavefront-synchronized code generated by other systems. We also show how the proposed mechanism can achieve 24% to 70% reduction in energy.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…