Preference Optimization Drives Monoculture in LLM Prediction Markets
Abstract
Prediction markets rest on the independence of participant errors. As LLM agents become active traders on platforms like Kalshi and Polymarket, we ask: does this independence hold when the crowd is composed of LLMs? We find it does not. LLM agents fine-tuned with Direct Preference Optimization (DPO) share a convergent output distribution, producing pairwise error correlations of ρ = 0.70 and reducing ten agents to an effective number of independent forecasters of N_eff ≈ 1.4. This is not a scaling problem: N_eff remains flat from N = 5 to N = 40, and the 10-agent market (67.6%) fails to match a single standalone agent (70.2%). Two controlled ablations isolate preference optimization as the causal driver, replicated across labs and scales (Δρ = +0.24 to +0.46 on identical-SFT controls at 8B and 70B). Among mitigations tested, cross-model diversity achieves the largest correlation reduction (ρ from 0.68 to 0.40). As LLMs become more aligned, markets built from them become more monocultural.
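The abstract does not say how the effective number of independent forecasters is computed. A minimal sketch, assuming the standard equicorrelation formula N / (1 + (N - 1)ρ) for the variance of an average of N equally correlated errors (an assumption for illustration, not confirmed as the paper's definition), shows why the figure barely moves as agents are added. The function name `n_eff` is hypothetical.

```python
def n_eff(n_agents: int, rho: float) -> float:
    """Effective number of independent forecasters under equicorrelation.

    Assumes every pair of agents has the same error correlation `rho`
    and equal error variance. This is a textbook formula used here for
    illustration; the paper may define its quantity differently.
    """
    if n_agents < 1:
        raise ValueError("n_agents must be at least 1")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    denominator = 1.0 + (n_agents - 1) * rho
    if denominator <= 0.0:
        raise ValueError("rho is too negative for this many agents")
    return n_agents / denominator


if __name__ == "__main__":
    # Pairwise correlation reported in the abstract.
    for n in (5, 10, 40):
        print(f"N={n:>2}, rho=0.70 -> n_eff={n_eff(n, 0.70):.2f}")
    # Correlation reported for the cross-model diversity mitigation.
    print(f"N=10, rho=0.40 -> n_eff={n_eff(10, 0.40):.2f}")
```

Under this assumed formula, ρ = 0.70 gives roughly 1.32, 1.37, and 1.41 effective forecasters at N = 5, 10, and 40, which matches the reported value of about 1.4 and its flatness across market sizes. The value is bounded above by 1/ρ ≈ 1.43 however many agents are added. Lowering ρ to 0.40 raises the 10-agent figure to roughly 2.17.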