Online Learning for Approximately-Convex Functions with Long-term Adversarial Constraints
Abstract
We study an online learning problem with long-term budget constraints in the adversarial setting. In this problem, at each round t, the learner selects an action from a convex decision set, after which the adversary reveals a cost function f_t and a resource consumption function g_t. The cost and consumption functions are assumed to be α-approximately convex, a broad class that generalizes convexity and encompasses many common non-convex optimization problems, including DR-submodular maximization, Online Vertex Cover, and Regularized Phase Retrieval. The goal is to design an online algorithm that minimizes cumulative cost over a horizon of length T while approximately satisfying a long-term budget constraint of B_T. We propose an efficient first-order online algorithm that guarantees O(√T) α-regret against the optimal fixed feasible benchmark while consuming at most O(B_T log T) + Õ(√T) resources in both full-information and bandit feedback settings. In the bandit feedback setting, our approach yields an efficient solution for the Adversarial Bandits with Knapsacks problem with improved guarantees. We also prove matching lower bounds, demonstrating the tightness of our results. Finally, we characterize the class of α-approximately convex functions and show that our results apply to a broad family of problems.
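The round-by-round protocol described above can be sketched as a short simulation. This is only an illustration under stated assumptions: the decision set is taken to be the interval [0, 1], the cost is a convex quadratic, consumption is linear, and feedback is full-information (gradients are revealed). The learner is a generic primal-dual gradient heuristic used as a placeholder; it is not the algorithm proposed in the paper, and the names `project` and `run_protocol` are hypothetical.

```python
from typing import Callable, List, Tuple

Fn = Callable[[float], float]


def project(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """Euclidean projection onto [lo, hi], the (assumed) convex decision set."""
    return min(hi, max(lo, x))


def run_protocol(
    rounds: List[Tuple[Fn, Fn, Fn, Fn]], budget: float, step: float = 0.1
) -> Tuple[float, float]:
    """Play the online game; each round supplies (f_t, grad f_t, g_t, grad g_t).

    The update rule is a placeholder primal-dual heuristic (an assumption),
    not the paper's method.
    """
    T = len(rounds)
    x, lam = 0.5, 0.0               # initial action and dual variable
    total_cost = total_use = 0.0
    for f, df, g, dg in rounds:
        played = x                  # learner commits before f_t, g_t are revealed
        total_cost += f(played)     # adversary reveals f_t: cost is incurred
        total_use += g(played)      # adversary reveals g_t: resources are consumed
        # Primal step: gradient descent on f_t + lam * g_t, then project.
        x = project(played - step * (df(played) + lam * dg(played)))
        # Dual step: raise lam when consumption exceeds the per-round budget share.
        lam = max(0.0, lam + step * (g(played) - budget / T))
    return total_cost, total_use


# Toy instance (assumed): the cost pulls the action toward 1, consumption equals the action.
T = 200
rounds = [
    (lambda x: (x - 1.0) ** 2, lambda x: 2.0 * (x - 1.0), lambda x: x, lambda x: 1.0)
] * T
cost, used = run_protocol(rounds, budget=0.25 * T)
print(f"cumulative cost: {cost:.2f}, resources used: {used:.2f}")
```

Comparing `cost` with the cumulative cost of the best fixed action that respects the budget gives the regret, and comparing `used` with the budget shows how far the long-term constraint is overshot; the bounds in the abstract concern these two quantities.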