Two-Sample Inference in Highly Dispersed Negative Binomial Models

Derek Bean

Two-Sample Inference in Highly Dispersed Negative Binomial Models

Abstract

Two-sample inference for the difference of population means typically relies upon a Central Limit Theorem approximation. When data are drawn from a Negative Binomial distribution, previous work of Shilane et al. (2010) showed that a Normal approximation is often unreliable in one-sample inference and proposed alternative techniques. We seek to extend these methods to the problem of two-sample inference on the difference of sample means in highly dispersed Negative Binomial models. We demonstrate that the Normal approximation is considerably more robust in the two-sample setting and may often be applied even when it is not appropriate for either sample individually. We also provide an intuitive extension of Bernstein's Inequality to the two-sample case. A simple mixture of these two methods may improve coverage in borderline and small sample settings. We subsequently investigate the coverage quality of confidence intervals and sample size considerations in a wide variety of simulation studies. Overall, we demonstrate that the Normal approximation is considerably more robust in approximating the distribution of the mean difference than the corresponding one-sample theory would suggest.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Or compile a full topic from this idea

Discussion (0)

Sign in to join the discussion.

Loading comments…