Probing the Stochastic Machine: Engaging with LLMs in Statistics Curricula Through Veridical Data Science
Abstract
Large language models (LLMs) are interactive stochastic systems whose most consequential behaviors are still only partially understood. This discussion argues that statistics curricula should treat LLMs not only as tools but also as objects of inquiry: students can probe variability, bias, and prompt sensitivity by designing small experiments and analyzing distributions of outputs. Building on the Veridical Data Science framework and the Predictability-Computability-Stability (PCS) principles, it outlines how to organize critical LLM engagement across educational levels and proposes four curricular examples, ranging from introductory "ask it twice" activities to graduate-level PCS stability audits of LLM-based analysis workflows.
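As an illustration of what an introductory "ask it twice" activity might look like in practice, the sketch below repeats one prompt many times and tabulates the distribution of answers. It is only a minimal sketch under stated assumptions: `mock_llm` is a hypothetical stand-in that samples from a fixed set of answers with made-up weights, not a real model API, and `modal_agreement` is an illustrative summary chosen here, not a metric taken from the paper.

```python
import random
from collections import Counter


def mock_llm(prompt, rng):
    """Hypothetical stand-in for an LLM call.

    Assumption: a real activity would replace this with a call to an
    actual model; here we sample from a fixed answer set with made-up
    weights so the example runs offline. The prompt is ignored.
    """
    answers = ["yes", "no", "it depends"]
    weights = [0.6, 0.3, 0.1]
    return rng.choices(answers, weights=weights, k=1)[0]


def ask_repeatedly(prompt, n, seed=0):
    """Send the same prompt n times and count each distinct answer."""
    rng = random.Random(seed)
    return Counter(mock_llm(prompt, rng) for _ in range(n))


def modal_agreement(counts):
    """Share of responses that match the most common answer."""
    total = sum(counts.values())
    return max(counts.values()) / total


if __name__ == "__main__":
    n = 200
    counts = ask_repeatedly("Is a p-value of 0.04 strong evidence?", n)
    for answer, k in counts.most_common():
        print(f"{answer:12s} {k / n:.2f}")
    print("modal agreement:", round(modal_agreement(counts), 2))
```

Students could rerun the same loop with a reworded prompt and compare the two answer distributions, which turns prompt sensitivity into a familiar two-sample comparison.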