When is Yobs missing and Ymis observed?

Abstract

In statistical modelling of incomplete data, missingness is encoded as a relation between datasets Y and response patterns R. The partition of Y into observed and missing components is often denoted Yobs and Ymis. We point out a mathematical defect in this notation, which arises because two different mathematical relationships between Y and R are not distinguished: the triple (Yobs, Ymis, R), in which Yobs values are always observed and Ymis values are always missing, and the overlaying of a missingness pattern onto the marginal distribution for Y, denoted (Yobs, Ymis). Under the latter, Yobs and Ymis each denote mixtures of observable and unobservable data. This overlaying of the missingness pattern onto Y creates a link between the mathematics and the meta-mathematics that violates the stochastic relationship encoded in (Y, R). In addition, the theory requires comparing partitions of Y according to different missingness patterns simultaneously. A simple remedy for these problems is to use four symbols instead of two, and to make the dependence on the missingness pattern explicit. We explain these and related issues.
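
The distinction drawn in the abstract can be illustrated with a minimal sketch. The function name `partition` and the variables `r_realised` and `r_fixed` are hypothetical choices made for this illustration and are not the paper's notation; the only assumption taken from the abstract is that the partition of Y depends on which missingness pattern is applied. Partitioning one data vector by its realised pattern gives an "observed" part that really was observed, whereas overlaying a different fixed pattern onto the same vector yields an "observed" part that includes a value which was in fact missing.

```python
def partition(y, r):
    """Split a data vector y by a response pattern r (1 = observed, 0 = missing).

    Hypothetical helper for illustration only: the pattern r is an explicit
    argument, so the dependence of the partition on r is visible.
    """
    if len(y) != len(r):
        raise ValueError("y and r must have the same length")
    observed = tuple(v for v, flag in zip(y, r) if flag == 1)
    missing = tuple(v for v, flag in zip(y, r) if flag == 0)
    return observed, missing


# A realised pair (y, r): the second component was not actually observed.
y = (1.2, 3.4, 5.6)
r_realised = (1, 0, 1)

# Partition by the realised pattern: every "observed" value was observed.
obs_real, mis_real = partition(y, r_realised)

# Overlay a different, fixed pattern onto the same y.
r_fixed = (1, 1, 0)
obs_fixed, mis_fixed = partition(y, r_fixed)

# Under r_fixed the "observed" part contains 3.4, which was missing under
# r_realised, so it mixes observable and unobservable data.
print(obs_real, mis_real)
print(obs_fixed, mis_fixed)
```

Passing the pattern as an argument is one way to keep several partitions of the same Y side by side, which a single pair of symbols Yobs and Ymis cannot express.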
