Dueling over Multiple Pieces of Dessert
Abstract
We study the dynamics of repeated fair division between two players, Alice and Bob, where Alice partitions a cake into two subsets and Bob chooses his preferred one over T rounds. Alice aims to minimize her regret relative to the Stackelberg value -- the maximum utility she could achieve if she knew Bob's private valuation. We show that if Alice uses arbitrary measurable partitions, achieving strongly sublinear regret is impossible; she suffers a regret of (T2 T) regret even against a myopic Bob. However, when Alice uses at most k cuts, the learning landscape becomes tractable. We analyze Alice's performance based on her knowledge of Bob's strategic sophistication (his regret budget). When Bob's learning rate is public, we establish a hierarchy of polynomial regret bounds determined by k and Bob's regret budget. In contrast, when this learning rate is private, Alice can universally guarantee O(T T) regret, but any attempt to secure a polynomial rate O(Tβ) (for β < 1) leaves her vulnerable to incurring strictly linear regret against some Bob. Finally, as a corollary of our online learning dynamics, we characterize the randomized query complexity of finding approximate Stackelberg allocations with a constant number of cuts in the Robertson-Webb model.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.