Asymptotically Optimal Scheduling of Multiple Parallelizable Job Classes
Abstract
Modern computing workloads are often composed of parallelizable jobs. A parallelizable job can be completed more quickly when run on additional servers. However, each job can only use a limited number of servers, known as its parallelizability level, which is determined by the type of computation the job performs and how it is implemented. Workloads generally consist of multiple job classes, where jobs from different classes have different parallelizability levels and follow different job size (service requirement) distributions. This paper considers scheduling parallelizable jobs belonging to an arbitrary number of job classes. Given a limited number of servers, we must allocate servers across a stream of arriving jobs to minimize mean response time -- the average time from when a job arrives to the system until it completes. We find that in lighter-load scaling regimes (i.e., Sub-Halfin-Whitt), the optimal allocation policy is Least-Parallelizable-First (LPF), which prioritizes jobs from the least parallelizable job classes regardless of their size distributions. By contrast, we find that in the heavier-load regimes (i.e., Super-NDS), the optimal allocation policy prioritizes jobs with the Shortest Expected Remaining Processing Time (SERPT). We also develop policies that are asymptotically optimal when the scaling regime is not known a priori.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.