Convergence of Stochastic First-Order Algorithms in Bertrand Competition Under Incomplete Information
Abstract
Autonomous pricing agents are widely deployed in online marketplaces, making algorithmic pricing a prominent application of multi-agent learning. Experimental studies often report collusive outcomes, but these findings typically rely on Q-learning in complete-information environments and lack rigorous convergence guarantees. In this paper, we study the stochastic learning dynamics of Regularized Robbins-Monro (RRM) algorithms in a Bayesian Bertrand competition with private costs. We show that this setting violates standard stability conditions, including monotonicity and the Minty variational inequality, rendering classical convergence results for gradient-based learning inapplicable. Despite this, we prove that Euclidean RRM algorithms converge almost surely to the unique, efficient Bayes-Nash equilibrium within a finite-dimensional approximation of the strategy space. By analyzing symmetric piecewise-linear pricing strategies in a duopoly, we explicitly construct a global Lyapunov function for the projected primal dynamics and establish global asymptotic stability of the equilibrium. Our analysis yields rigorous convergence guarantees for stochastic first-order learning algorithms in Bayesian Bertrand competition and provides a principled counterpoint to widespread claims of algorithmic collusion.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.