Sequential stratified inference for the mean

Abstract

We develop conservative tests for the mean of a bounded population under stratified sampling and apply them to risk-limiting post-election audits. The tests are "anytime valid" under sequential sampling, allowing optional stopping in each stratum. Our core method expresses a global hypothesis about the population mean as a union of intersection hypotheses describing within-stratum means. It tests each intersection hypothesis using independent test supermartingales (TSMs) combined across strata by multiplication. A P-value for each intersection hypothesis is the reciprocal of that test statistic, and the largest P-value in the union is a P-value for the global hypothesis. This approach has two primary moving parts: the rule selecting which stratum to draw from next given the sample so far, and the form of the TSM within each stratum. These rules may vary over intersection hypotheses. We construct the test with the smallest expected stopping time and present a few strategies for approximating that optimum. In instances that arise in auditing and other applications, its expected sample size is substantially smaller than that of previous methods.
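The union-of-intersections construction described above can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several pieces are assumptions made only for illustration: two strata with values in [0, 1], a global null of the form "weighted mean is at most mu0", a fixed-bet product TSM within each stratum, a fixed sample in place of a stratum selection rule, and a finite grid over within-stratum null means standing in for the full union. The grid maximum only approximates the true maximum, so this sketch is not guaranteed to be conservative. The function names `betting_tsm`, `intersection_p`, and `global_p` are hypothetical.

```python
import math

def betting_tsm(xs, eta, lam=0.5):
    """Product TSM for the null 'stratum mean <= eta' with data in [0, 1].

    Assumption: a fixed bet lam in [0, 1], which keeps every factor
    1 + lam * (x - eta) nonnegative when x and eta lie in [0, 1].
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return math.prod(1.0 + lam * (x - eta) for x in xs)

def intersection_p(samples, etas, lam=0.5):
    """P-value for one intersection hypothesis.

    The within-stratum TSMs are multiplied across strata, and the
    P-value is the reciprocal of that product, capped at 1.
    """
    m = math.prod(betting_tsm(xs, eta, lam) for xs, eta in zip(samples, etas))
    return 1.0 if m <= 1.0 else 1.0 / m

def global_p(samples, weights, mu0, grid=101, lam=0.5):
    """Largest intersection P-value over a grid of (eta1, eta2).

    Assumption: two strata, with grid points on the boundary
    w1 * eta1 + w2 * eta2 = mu0 of the global null.
    """
    w1, w2 = weights
    p_max, found = 0.0, False
    for i in range(grid):
        eta1 = i / (grid - 1)
        eta2 = (mu0 - w1 * eta1) / w2
        if -1e-12 <= eta2 <= 1.0 + 1e-12:
            eta2 = min(1.0, max(0.0, eta2))  # clip rounding error
            found = True
            p_max = max(p_max, intersection_p(samples, (eta1, eta2), lam))
    # With no feasible grid point, fall back to the uninformative P-value.
    return p_max if found else 1.0

# Hypothetical data: both stratum samples sit well above the null mean.
strong = ([0.9] * 50, [0.8] * 50)
print(global_p(strong, (0.5, 0.5), 0.5))
```

In a sequential setting, the same calculation would be repeated as each new observation arrives, and sampling could stop the first time the global P-value falls below the risk limit. The choice of bet and the order in which strata are sampled are the two moving parts that the abstract says can be tuned per intersection hypothesis; here both are held fixed for simplicity.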
