Under-coverage in high-statistics counting experiments with finite MC samples

Abstract

We consider the problem of setting confidence intervals on a parameter of interest from the maximum-likelihood fit of a physics model to a binned data set with a large number of bins, large event counts per bin, and systematic uncertainties modeled as nuisance parameters. We use the profile-likelihood ratio for statistical inference and focus on the case in which the model is determined from Monte Carlo simulated samples of finite size. We start by presenting a toy model in which widely used asymptotic approximations of the profile-likelihood ratio, commonly expected to hold in the high-statistics regime, manifestly break down even when the numbers of events per bin in both the data and the simulated samples seem large enough to warrant their validity. We then move to the general setting and show that statistical uncertainties in the Monte Carlo predictions always bias the coverage of confidence intervals constructed in the asymptotic approximation in the same direction: they lead to systematic under-coverage.
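To make the notion of coverage concrete, the following is a minimal sketch of how it can be estimated with pseudo-experiments. It is not the toy model of the paper: it assumes a single bin, a single scale parameter mu with true value 1, and a prediction taken from a Monte Carlo sample that is k times larger than the data, and it deliberately leaves out any nuisance parameter for the Monte Carlo statistical uncertainty. The names coverage, s_true, k and n_toys are hypothetical and chosen for illustration only.

```python
import numpy as np


def coverage(s_true=10_000.0, k=1.0, n_toys=20_000, seed=0):
    """Fraction of pseudo-experiments whose nominal 68.3% interval contains mu = 1.

    Assumed setup (illustrative only): one bin, data n ~ Poisson(s_true),
    raw MC count m ~ Poisson(k * s_true), prediction s_hat = m / k.
    The MC statistical uncertainty is ignored when building the interval.
    """
    rng = np.random.default_rng(seed)
    n = rng.poisson(s_true, size=n_toys)      # observed data counts
    m = rng.poisson(k * s_true, size=n_toys)  # raw Monte Carlo counts
    s_hat = m / k                             # MC-based prediction at mu = 1

    # -2 ln of the Poisson likelihood ratio between mu = 1 and the best fit.
    # For a pure scale parameter the best fit satisfies mu_hat * s_hat = n.
    q = 2.0 * (s_hat - n + n * np.log(n / s_hat))

    # Asymptotic rule: mu = 1 lies inside the 68.3% interval when q <= 1.
    return float(np.mean(q <= 1.0))


if __name__ == "__main__":
    for k in (1.0, 10.0, 1000.0):
        print(f"k = {k:g}: estimated coverage = {coverage(k=k):.3f}")
```

In this simplified setup the variance of the difference between data and prediction is inflated by a factor 1 + 1/k relative to what the interval assumes, so the estimated coverage should fall below the nominal 68.3% for small k and approach it as k grows. The paper's claim is stronger than this sketch, since it concerns fits with many bins and with systematic uncertainties modeled as nuisance parameters.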
