Hierarchical Adversarial Bandits for Online Configuration Optimization

Abstract

Motivated by Online Configuration Optimization in large, dynamic parameter spaces, this work studies the nonstochastic multi-armed bandit (MAB) problem in metric action spaces with oblivious Lipschitz adversaries. We propose ABoB (Adversarial Bandit over Bandits), a hierarchical framework that decomposes the configuration space into clusters to accelerate learning and adapt to changing environments. We evaluate ABoB using standard algorithms such as EXP3 and Tsallis-INF on a real-world production storage system, demonstrating significant performance gains of up to 50\% compared to state-of-the-art "flat" bandit algorithms. Extensive simulations further confirm that ABoB effectively exploits metric structures, achieving up to 91\% improvement in adversarial metric scenarios while significantly reducing computational running time. Theoretical analysis grounds this empirical success: we prove that ABoB maintains a worst-case "safety net" bound of O(kT), matching traditional methods, where T is the number of rounds and k is the number of arms, while capable of accelerating learning to O(k1/4T) under favorable Lipschitz conditions. This combination of operational efficiency and theoretical soundness makes ABoB a practical solution for automated system tuning.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…