Active Distribution Learning from Indirect Samples

Abstract

This paper studies the problem of learning the probability distribution pX of a discrete random variable X from indirect, sequential samples. At each time step, we choose one of K possible functions g1, …, gK and observe the corresponding sample gi(X). The goal is to estimate pX using as few such samples as possible. This problem has several real-world applications, including inference under non-precise information and privacy-preserving statistical estimation. We establish necessary and sufficient conditions on the functions g1, …, gK under which asymptotically consistent estimation is possible. We also derive lower bounds on the estimation error as a function of the total number of samples and show that they are order-wise achievable. Leveraging these results, we propose an iterative algorithm that i) chooses the function to observe at each step based on past observations; and ii) combines the obtained samples to estimate pX. The performance of this algorithm is investigated numerically under various scenarios and is shown to outperform baseline approaches.
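To make the observation model concrete, the following is a minimal sketch and not the paper's algorithm. It assumes X takes values in {0, …, n-1}, encodes each gi as an integer array, queries the functions equally often (a non-adaptive baseline), and recovers pX by ordinary least squares. The helper names `indicator_matrix` and `estimate_px`, the three example functions, and the rank check are all illustrative assumptions. In this sketch, full column rank of the stacked matrix is what makes the linear system uniquely solvable; the abstract does not state the paper's exact conditions.

```python
import numpy as np

def indicator_matrix(g, n_outputs):
    """0/1 matrix A with A[y, x] = 1 when g[x] == y, so that A @ p is the law of g(X)."""
    A = np.zeros((n_outputs, len(g)))
    A[g, np.arange(len(g))] = 1.0
    return A

def estimate_px(p_true, gs, n_samples, rng):
    """Hypothetical non-adaptive baseline: split the budget evenly across the
    functions, then solve the stacked linear system by least squares."""
    n = len(p_true)
    m = n_samples // len(gs)                    # samples per function
    mats, freqs = [], []
    for g in gs:
        xs = rng.choice(n, size=m, p=p_true)    # fresh draws of X (never observed directly)
        ys = g[xs]                              # the indirect samples g(X)
        k = int(g.max()) + 1
        freqs.append(np.bincount(ys, minlength=k) / m)
        mats.append(indicator_matrix(g, k))
    A = np.vstack(mats)
    q = np.concatenate(freqs)
    p_hat, *_ = np.linalg.lstsq(A, q, rcond=None)
    p_hat = np.clip(p_hat, 0.0, None)           # project back to a valid distribution
    return p_hat / p_hat.sum()

# Illustrative setup (assumed, not from the paper): X on {0, 1, 2, 3}.
g1 = np.array([0, 0, 1, 1])
g2 = np.array([0, 1, 0, 1])
g3 = np.array([0, 1, 1, 0])
p_true = np.array([0.1, 0.2, 0.3, 0.4])

# With only g1 and g2 the stacked matrix is rank-deficient; adding g3 fixes that.
rank_two = np.linalg.matrix_rank(np.vstack([indicator_matrix(g, 2) for g in (g1, g2)]))
rank_three = np.linalg.matrix_rank(np.vstack([indicator_matrix(g, 2) for g in (g1, g2, g3)]))

rng = np.random.default_rng(0)
p_hat = estimate_px(p_true, [g1, g2, g3], n_samples=30000, rng=rng)
print(rank_two, rank_three, p_hat)
```

In this example, g1 and g2 alone reveal only two binary marginals of X, which cannot pin down a distribution over four values. An adaptive scheme of the kind the abstract describes would replace the even split with a choice of function that depends on past observations.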
