Toward Optimal Regret in Robust Pricing: Decoupling Corruption and Time
Abstract
We design the first regret guarantees for robust dynamic pricing that decouple the dependence on the corruption \( C \) and the time horizon \( T \). In dynamic pricing, a seller with an unlimited supply of a good interacts with a stream of buyers over \( T \) rounds, with the goal of maximizing revenue. At each round \( t \), the seller posts a price \( p_t \), and the buyer purchases the good only if their unknown valuation \( v \) is at least this price. The seller observes only the binary feedback \( \mathbb{I}\{p_t \le v\} \), indicating whether a sale occurred. In the robust pricing setting, a malicious adversary is allowed to corrupt this feedback in at most \( C \) rounds. Even if the learner knows the corruption \( C \), the best known regret bound is O(C T) by Gupta et al. [2025]. This leaves open the problem of decoupling the dependence on \( C \) and \( T \). In this work, we resolve this open problem. In particular, we develop a robust variant of binary search that achieves regret O(C+ T) when the corruption \( C \) is known and O(C+2 T) when the corruption is unknown.
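To make the feedback model concrete, here is a minimal Python sketch of the setting described in the abstract. It is not the paper's algorithm: it pairs the binary sale indicator with a deliberately naive baseline that repeats every binary-search query 2C+1 times and takes a majority vote. The function names, the fixed set of corrupted rounds, and the assumption that the valuation lies in [0, 1] are all illustrative choices, not taken from the paper.

```python
from __future__ import annotations

from typing import Callable


def make_feedback(v: float, corrupt_rounds: set[int]) -> Callable[[float], bool]:
    """Build a feedback oracle for a buyer with fixed valuation v.

    The oracle returns the sale indicator (p <= v), flipped in the rounds
    listed in corrupt_rounds (a hypothetical, non-adaptive adversary).
    """
    t = 0  # round counter, advanced on every posted price

    def feedback(p: float) -> bool:
        nonlocal t
        sale = p <= v              # true indicator of a sale at price p
        if t in corrupt_rounds:    # adversary corrupts this round
            sale = not sale
        t += 1
        return sale

    return feedback


def majority_binary_search(
    feedback: Callable[[float], bool], C: int, steps: int
) -> tuple[float, float]:
    """Naive baseline: ask each query 2C+1 times and trust the majority.

    With at most C corrupted rounds in total, at most C of the 2C+1
    answers in any batch are flipped, so the majority is always correct.
    Assumes the valuation lies in [0, 1].
    """
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        votes = sum(feedback(mid) for _ in range(2 * C + 1))
        if votes > C:   # majority reports a sale, so v >= mid
            lo = mid
        else:           # majority reports no sale, so v < mid
            hi = mid
    return lo, hi


if __name__ == "__main__":
    C = 3
    oracle = make_feedback(v=0.618, corrupt_rounds={0, 1, 2})
    lo, hi = majority_binary_search(oracle, C=C, steps=10)
    print(f"valuation bracketed in [{lo}, {hi}] using {10 * (2 * C + 1)} rounds")
```

This baseline spends 2C+1 rounds on every search step, so its cost multiplies the corruption budget by the number of steps. That multiplicative coupling between corruption and search length is the kind of dependence the abstract says the robust binary search is designed to avoid.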