Density estimation from batched broken random samples
Abstract
The broken random sample problem was first introduced by DeGroot, Feder, and Goel (1971, Ann. Math. Statist.): in each observation (batch), a random sample of $M$ i.i.d. point pairs $((X_i, Y_i))_{i=1}^{M}$ is drawn from a joint distribution with density $p(x,y)$, but we can observe only the unordered multisets $(X_i)_{i=1}^{M}$ and $(Y_i)_{i=1}^{M}$ separately; that is, the pairing information is lost. For large $M$, inferring $p$ from a single observation has been shown to be essentially impossible. In this paper, we propose a parametric method based on a pseudo-log-likelihood to estimate $p$ from $N$ i.i.d. broken sample batches, and we prove a fast convergence rate in $N$ for our estimator that is uniform in $M$, under mild assumptions.
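To make the observation model concrete, the following is a minimal sketch of how $N$ broken sample batches of size $M$ could be simulated. It is only an illustration of the data format, not the estimator proposed in the paper. The bivariate Gaussian choice of $p$, the correlation parameter `rho`, and the function names `draw_broken_batch` and `draw_broken_batches` are assumptions introduced here for illustration.

```python
import random


def draw_broken_batch(m, rho, rng):
    """Draw one broken sample batch of size m.

    Assumption (not from the paper): the pairs (X_i, Y_i) are i.i.d.
    standard bivariate Gaussian with correlation rho.
    The pairing is discarded by sorting each coordinate list on its own;
    a sorted list is one canonical way to store an unordered multiset.
    """
    xs, ys = [], []
    for _ in range(m):
        x = rng.gauss(0.0, 1.0)
        # y has unit variance and correlation rho with x
        y = rho * x + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    return sorted(xs), sorted(ys)


def draw_broken_batches(n, m, rho, seed=0):
    """Draw n independent broken sample batches, each of size m."""
    rng = random.Random(seed)
    return [draw_broken_batch(m, rho, rng) for _ in range(n)]


# Example: N = 5 batches, each holding M = 3 x-values and 3 y-values
batches = draw_broken_batches(n=5, m=3, rho=0.8)
for xs, ys in batches:
    print(xs, ys)
```

Each batch exposes only the two sorted coordinate lists, so nothing in the output reveals which $x$ was originally paired with which $y$.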