Maximum a posteriori learning in demand competition games

Abstract

We consider an inventory competition game between two firms and ask the following question: if players know neither the opponent's action nor the opponent's utility function, can they learn to play the Nash policy in a repeated game by observing their own sales? We prove that they can: using Maximum A Posteriori (MAP) estimation, the players' actions and beliefs converge to the Nash equilibrium.
