Playing games with knowledge: AI-Induced delusions need game theoretic interventions
Abstract
Conversational AI has a fundamental flaw as a knowledge interface: sycophantic chatbots induce epistemic entrenchment and delusional belief spirals even in rational agents. We propose the problem does not stem from the AI model, rooted instead in a systemic consequence of the paradigm shift from user-driven knowledge search to users and agents engaged in strategic, repeated-play communication. We formalize the problem as a Crawford-Sobel cheap talk game, where costless user signals induce a pooling equilibrium. Agents optimized for user satisfaction produce sycophantic strategies that provide identical reinforcement across user types with opposite epistemic incentives: exploratory ``Growth-seekers'' (θG) and confirmatory ``Validation-seekers'' (θV). Under repeated play, this identification failure creates a coordination trap -- analogous to a Prisoner's Dilemma -- where locally rational feedback loops drive users toward pathologically certain false beliefs. We propose an inference-time mechanism design intervention called an Epistemic Mediator that breaks this pooling equilibrium by introducing a costly signal (epistemic friction), forcing type revelation based on users' asymmetric cognitive costs for processing resistance. A key contribution is Belief Versioning, a git-inspired epistemic meta-memory system that stores healthy beliefs and rollbacks when validation-seeking resistance is detected. In simulation, this intervention achieves a separating equilibrium achieving a 48× differential in spiral rates while passing a learning preservation criterion), evidence that epistemic safety in AI is fundamentally a problem of strategic information environment design rather than simple model alignment.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.