BASE: Burst-Adaptive Autoscaling via Stacked Ensembles for SLO Assurance and Cost Efficiency

Abstract

Autoscaling automatically adjusts the resources allocated to applications, without human intervention, to maintain runtime Quality of Service (QoS) while reducing costs. However, user-facing cloud applications serve dynamic workloads that are often highly variable and contain bursts, which makes it hard for autoscaling to keep QoS within Service-Level Objectives (SLOs). Conservative strategies risk over-provisioning, while aggressive ones may cause SLO violations, so effective autoscaling is difficult to design. This paper introduces BASE, a burst-adaptive autoscaling framework that leverages a stacked ensemble of machine learning models to mitigate SLO violations and reduce costs for containerized services and applications operating under time-varying workloads. BASE incorporates a novel prediction-based burst detection mechanism that distinguishes predictable workload spikes from genuinely unpredictable bursts. When a burst is detected, BASE deliberately overestimates it and allocates resources accordingly to absorb the rapid growth in resource demand. During non-burst periods, BASE employs reinforcement learning to correct potential inaccuracies in resource estimation, enabling more precise resource allocation. Experiments across ten real-world workloads demonstrate BASE's effectiveness: it achieves a significant reduction in SLO violations at lower resource cost than other prominent methods.
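The abstract does not specify how burst detection is implemented, so the following is only a minimal sketch of one plausible reading of "prediction-based" detection, not BASE's actual algorithm. It assumes that a burst is flagged when the observed load exceeds the forecast by more than a threshold derived from past forecast errors, and that a fixed multiplier stands in for the overestimation step. The names `decide`, `Decision`, `k`, and `headroom` are hypothetical, and the stacked ensemble and reinforcement learning components are deliberately left out.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Sequence


@dataclass
class Decision:
    is_burst: bool
    provision_for: float  # load level (e.g. requests/s) to provision capacity for


def decide(
    forecast: float,
    observed: float,
    past_errors: Sequence[float],
    k: float = 3.0,         # hypothetical sensitivity: standard deviations above the mean error
    headroom: float = 1.5,  # hypothetical overestimation factor applied during bursts
) -> Decision:
    """Classify the current interval and pick a load level to provision for.

    A spike the forecaster anticipated leaves a small residual and is treated
    as predictable. A residual far outside the historical error distribution
    is treated as a burst, and the observed load is overestimated.
    """
    if len(past_errors) < 2:
        raise ValueError("need at least two past forecast errors")

    # Threshold from the history of (observed - forecast) errors.
    threshold = mean(past_errors) + k * stdev(past_errors)
    residual = observed - forecast

    if residual > threshold:
        # Assumed burst handling: provision above the observed load.
        return Decision(True, observed * headroom)

    # Non-burst: cover whichever is larger, the forecast or the observation.
    return Decision(False, max(forecast, observed))


if __name__ == "__main__":
    errors = [-5.0, 0.0, 5.0, -3.0, 3.0]
    # A large but forecast spike is not a burst.
    print(decide(forecast=500.0, observed=505.0, past_errors=errors))
    # A jump the forecaster missed is a burst.
    print(decide(forecast=100.0, observed=200.0, past_errors=errors))
```

Under these assumptions, a spike counts as "predictable" because the forecaster saw it coming, not because it is small: a load of 505 against a forecast of 500 is left alone, while a load of 200 against a forecast of 100 triggers the overestimation path.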
