Learning in Random Utility Models Via Online Decision Problems
Abstract
This paper examines the Random Utility Model (RUM) in repeated stochastic choice settings where decision-makers lack full information about payoffs. We propose a gradient-based learning algorithm that embeds RUM into an online decision-making framework. Our analysis establishes Hannan consistency for a broad class of RUMs, meaning the average regret relative to the best fixed action in hindsight vanishes over time. We also show that our algorithm is equivalent to the Follow-The-Regularized-Leader (FTRL) method, offering an economically grounded approach to online optimization. Applications include modeling recency bias and characterizing coarse correlated equilibria in normal-form games
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.