Memory-Sample Lower Bounds for Learning Parity with Noise
Abstract
In this work, we show, for the well-studied problem of learning parity under noise, where a learner tries to learn $x=(x_1,\ldots,x_n) \in \{0,1\}^n$ from a stream of random linear equations over $\mathbb{F}_2$ that are correct with probability $\frac{1}{2}+\varepsilon$ and flipped with probability $\frac{1}{2}-\varepsilon$, that any learning algorithm requires either a memory of size $\Omega(n^2/\varepsilon)$ or an exponential number of samples. In fact, we study memory-sample lower bounds for a large class of learning problems, as characterized by [GRT'18], when the samples are noisy. A matrix $M: A \times X \rightarrow \{-1,1\}$ corresponds to the following learning problem with error parameter $\varepsilon$: an unknown element $x \in X$ is chosen uniformly at random. A learner tries to learn $x$ from a stream of samples, $(a_1, b_1), (a_2, b_2), \ldots$, where for every $i$, $a_i \in A$ is chosen uniformly at random and $b_i = M(a_i,x)$ with probability $\frac{1}{2}+\varepsilon$ and $b_i = -M(a_i,x)$ with probability $\frac{1}{2}-\varepsilon$ ($0<\varepsilon<\frac{1}{2}$). Assume that $k, \ell, r$ are such that any submatrix of $M$ of at least $2^{-k} \cdot |A|$ rows and at least $2^{-\ell} \cdot |X|$ columns has a bias of at most $2^{-r}$. We show that any learning algorithm for the learning problem corresponding to $M$, with error $\varepsilon$, requires either a memory of size at least $\Omega\left(\frac{k \cdot \ell}{\varepsilon}\right)$, or at least $2^{\Omega(r)}$ samples. In particular, this shows that for a large class of learning problems, same as those in [GRT'18], any learning algorithm requires either a memory of size at least $\Omega\left(\frac{(\log |X|) \cdot (\log |A|)}{\varepsilon}\right)$ or an exponential number of noisy samples. Our proof is based on adapting the arguments in [Raz'17,GRT'18] to the noisy case.
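To make the sample model concrete, the following is a minimal Python sketch of the noisy parity stream described in the abstract. It is an illustration only, not code from the paper: the function names `parity` and `noisy_parity_stream`, the seed, and the parameter values $n = 16$ and $\varepsilon = 0.25$ are all assumptions chosen for the example. Labels are written in $\{0,1\}$ rather than the $\{-1,1\}$ convention used for the matrix $M$.

```python
import random


def parity(a, x):
    """Inner product of two bit tuples over F_2 (hypothetical helper)."""
    return sum(ai & xi for ai, xi in zip(a, x)) % 2


def noisy_parity_stream(x, eps, num_samples, rng):
    """Yield samples (a, b): a is uniform in {0,1}^n and b = <a, x> mod 2,
    flipped independently with probability 1/2 - eps (so it is correct
    with probability 1/2 + eps)."""
    n = len(x)
    for _ in range(num_samples):
        a = tuple(rng.randrange(2) for _ in range(n))
        b = parity(a, x)
        if rng.random() < 0.5 - eps:
            b ^= 1  # noise: flip the label
        yield a, b


# Illustrative parameters (assumptions, not taken from the paper).
rng = random.Random(0)
n, eps = 16, 0.25
x = tuple(rng.randrange(2) for _ in range(n))  # hidden vector

samples = list(noisy_parity_stream(x, eps, 20000, rng))

# Empirical fraction of labels that agree with the true parity.
agree = sum(b == parity(a, x) for a, b in samples) / len(samples)
print(f"fraction of correct labels: {agree:.3f} (expected about {0.5 + eps})")
```

The sketch only generates the stream; it does not implement a learner. The lower bound stated above concerns any algorithm that consumes such a stream with bounded memory.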