Load Balancing Policies in Heterogeneous Systems: Non-Monotone Stability and Heavy-Traffic Optimality
Abstract
We consider a discrete-time queueing system with n heterogeneous parallel single-server queues. Jobs arrive at a central dispatcher and must be assigned immediately to one of the queues. We develop a unified framework for a broad family of load-balancing policies, including Join the Shortest Queue (JSQ), Join the Shortest Expected Delay (JSED), and Power-of-d Choices (Pod). In this framework, the dispatcher updates queue-length information periodically, possibly at arbitrarily long intervals, and dispatches jobs based on the sampled permutation of scaled queue lengths and the servers' service rates. Leveraging this structure, we derive a closed-form, easily verifiable sufficient condition for stability. We further show that, for general policies, stability above the induced threshold need not be monotone in the arrival rate, and we obtain an exact characterization under a persistent bottleneck dominance condition. When the stability condition holds strictly, we prove state-space collapse and heavy-traffic delay optimality. We also show that the steady-state queue-length vector converges in distribution to a deterministic vector scaled by an exponential random variable in heavy traffic. Methodologically, we extend Lyapunov-drift and transform techniques to a cycle-based analysis with multi-step updates. Our results connect the policy-induced dispatch fractions and sampled permutations to stability, delay, and distributional performance, providing guidance for designing scalable load-balancing schemes with limited queue-length information.
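To make the three named policies concrete, the following is a minimal sketch of how each one picks a queue in a discrete-time system. It is an illustration under stated assumptions, not the paper's framework: the function names, the Bernoulli arrival and service model, and the JSED score `(q[i] + 1) / mu[i]` (a common textbook form) are all assumptions introduced here. The sketch reads fresh queue lengths in every slot, so it does not model the periodic, possibly stale updates that the abstract describes.

```python
import random

def jsq(q, mu, rng=None):
    """Join the Shortest Queue: index with the fewest jobs (ties go to the lowest index)."""
    return min(range(len(q)), key=lambda i: q[i])

def jsed(q, mu, rng=None):
    """Join the Shortest Expected Delay.

    Assumed score: (q[i] + 1) / mu[i], the expected time to clear the queue
    including the new job when server i has service rate mu[i].
    """
    return min(range(len(q)), key=lambda i: (q[i] + 1) / mu[i])

def power_of_d(q, mu, rng, d=2):
    """Power-of-d Choices: sample d distinct queues uniformly, join the shortest of them."""
    sampled = rng.sample(range(len(q)), d)
    return min(sampled, key=lambda i: q[i])

def simulate(policy, lam, mu, slots, seed=0):
    """Toy discrete-time simulation (hypothetical model, for illustration only).

    Each slot: one job arrives with probability lam and is dispatched
    immediately; then each non-empty server i completes one job with
    probability mu[i].
    """
    rng = random.Random(seed)
    q = [0] * len(mu)
    for _ in range(slots):
        if rng.random() < lam:
            q[policy(q, mu, rng)] += 1
        for i, rate in enumerate(mu):
            if q[i] > 0 and rng.random() < rate:
                q[i] -= 1
    return q

if __name__ == "__main__":
    mu = [0.2, 0.9]  # heterogeneous service rates (assumed values)
    print("JSQ picks", jsq([1, 2], mu))    # shortest queue, ignores rates
    print("JSED picks", jsed([1, 2], mu))  # accounts for the faster server
    print("final queues under JSED:", simulate(jsed, 0.8, mu, 10_000))
```

With queue lengths `[1, 2]` and rates `[0.2, 0.9]`, JSQ selects the shorter but slower queue 0, while JSED selects queue 1 because its assumed expected delay, 3/0.9, is smaller than 2/0.2. This is the kind of difference between rate-aware and rate-blind dispatching that matters in heterogeneous systems.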