Parity Queries for Binary Classification

Abstract

Consider a query-based data acquisition problem that aims to recover the values of $k$ binary variables from parity (XOR) measurements of chosen subsets of the variables. Assume the response model where only a randomly selected subset of the measurements is received. We propose a method for designing a sequence of queries so that the variables can be identified with high probability using as few ($n$) measurements as possible. We define the query difficulty $d$ as the average size of the query subsets and the sample complexity $n$ as the minimum number of measurements required to attain a given recovery accuracy. We obtain fundamental trade-offs between recovery accuracy, query difficulty, and sample complexity. In particular, the necessary and sufficient sample complexity required for recovering all $k$ variables with high probability is $n = c_0 \max\{k, (k \log k)/d\}$ and the sample complexity for recovering a fixed proportion $(1-\delta)k$ of the variables for $\delta = o(1)$ is $n = c_1 \max\{k, (k \log(1/\delta))/d\}$, where $c_0, c_1 > 0$.
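To make the measurement model concrete, the following is a minimal simulation sketch, not the query design proposed in the paper. It assumes that each variable joins a query independently with probability d/k (so the average subset size is d) and that each measurement is received independently with probability `p_receive`. The function name, the parameter names, and both sampling rules are illustrative assumptions that do not come from the abstract.

```python
import random


def simulate_parity_queries(x, n, d, p_receive, seed=0):
    """Simulate n parity (XOR) queries on the binary vector x.

    Assumptions (illustrative only, not taken from the paper):
      - each variable is included in a query independently with
        probability d / k, giving an average subset size of d;
      - each measurement is received independently with
        probability p_receive, otherwise it is dropped.
    Returns the list of (subset, parity) pairs that were received.
    """
    rng = random.Random(seed)
    k = len(x)
    received = []
    for _ in range(n):
        # Choose the query subset.
        subset = [i for i in range(k) if rng.random() < d / k]
        # The parity measurement is the XOR of the selected variables.
        parity = 0
        for i in subset:
            parity ^= x[i]
        # Only a randomly selected subset of measurements is received.
        if rng.random() < p_receive:
            received.append((subset, parity))
    return received


if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical ground-truth labels
    observations = simulate_parity_queries(x, n=20, d=3, p_receive=0.5)
    for subset, parity in observations:
        print(subset, parity)
```

Each received pair is one linear equation over GF(2) in the unknown variables, so recovery amounts to solving the system formed by the measurements that survive.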
