On the asymptotic behavior of the contaminated sample mean

Abstract

An observation from a distribution with cumulative distribution function F and finite variance is said to be contaminated according to the inflated variance model if it has a large probability of coming from the original target distribution F, but a small probability of coming from a contaminating distribution that has the same mean and shape as F but a larger variance. It is well known that in the presence of data contamination, the ordinary sample mean loses many of its good properties, making it preferable to use more robust estimators. From a didactic point of view, it is insightful to see to what extent an intuitive estimator such as the sample mean becomes less favorable in a contaminated setting. In this paper, we investigate under which conditions the sample mean, based on a finite number of independent observations of F that are contaminated according to the inflated variance model, is a valid estimator for the mean of F. In particular, we examine to what extent this estimator is weakly consistent for the mean of F and asymptotically normal. As classical central limit theory is generally inadequate for handling asymptotic normality in this setting, we invoke the more general approximate central limit theory developed by Berckmoes, Lowen, and Van Casteren (2013). Our theoretical results are illustrated by a specific example and a simulation study.
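To make the inflated variance model concrete, the following is a minimal simulation sketch, not the paper's own simulation study. It assumes that F is Gaussian and that every observation is contaminated with the same probability. The names `eps` (contamination probability), `inflation` (factor by which the standard deviation is inflated), and all function names are hypothetical choices made for illustration. Under these assumptions each observation has variance (1 - eps) * sigma^2 + eps * (inflation * sigma)^2, so the sample mean stays unbiased for the mean of F while its variance grows.

```python
import random
import statistics


def contaminated_sample(n, mu=0.0, sigma=1.0, eps=0.05, inflation=5.0, rng=None):
    """Draw n observations under an inflated variance model (Gaussian F assumed)."""
    rng = rng or random.Random()
    sample = []
    for _ in range(n):
        # With probability eps the observation comes from the contaminating
        # distribution: same mean and shape, standard deviation scaled up.
        scale = sigma * inflation if rng.random() < eps else sigma
        sample.append(rng.gauss(mu, scale))
    return sample


def theoretical_variance(n, sigma=1.0, eps=0.05, inflation=5.0):
    """Variance of the sample mean of n independent contaminated observations."""
    per_observation = (1 - eps) * sigma**2 + eps * (inflation * sigma) ** 2
    return per_observation / n


def sample_mean_variance(n, reps=2000, seed=0, **kwargs):
    """Monte Carlo estimate of the variance of the contaminated sample mean."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(contaminated_sample(n, rng=rng, **kwargs))
        for _ in range(reps)
    ]
    return statistics.variance(means)


if __name__ == "__main__":
    n = 50
    print("empirical variance of the mean:  ", sample_mean_variance(n))
    print("theoretical variance of the mean:", theoretical_variance(n))
    print("uncontaminated variance of the mean:", theoretical_variance(n, eps=0.0))
```

With a fixed `eps`, the contaminated observations are still independent and identically distributed with finite variance, so this sketch only shows the loss of efficiency. It does not reproduce settings where the contamination varies across observations, which is where a more general central limit theory would be needed.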

