Do LLMs Track Public Opinion? A Multi-Model Study of Favorability Predictions in the 2024 U.S. Presidential Election
Abstract
We investigate whether Large Language Models (LLMs) can track public opinion as measured by exit polls during the 2024 U.S. presidential election cycle. Our analysis focuses on headline favorability (e.g., "Favorable" vs. "Unfavorable") of presidential candidates across multiple LLMs queried daily throughout the election season. Using the publicly available llm-election-data-2024 dataset, we evaluate predictions from nine LLM configurations against a curated set of five high-quality polls from major organizations including Reuters, CNN, Gallup, Quinnipiac, and ABC. We find systematic directional miscalibration. For Kamala Harris, all models overpredict favorability by 10-40% relative to polls. For Donald Trump, biases are smaller (5-10%) and poll-dependent, with substantially lower cross-model variation. These deviations persist under temporal smoothing and are not corrected by internet-augmented retrieval. We conclude that off-the-shelf LLMs do not reliably track polls when queried in a straightforward manner and discuss implications for election forecasting.