Coding sets with asymmetric information
Abstract
We study the following one-way asymmetric transmission problem, also a variant of model-based compressed sensing: a resource-limited encoder has to report a small set S from a universe of N items to a more powerful decoder (server). The distinguishing feature is asymmetric information: the subset S is comprised of i.i.d. samples from a prior distribution μ, and μ is only known to the decoder. The goal for the encoder is to encode S obliviously, while achieving the information-theoretic bound of |S| · H(μ), i.e., the Shannon entropy bound. We first show that any such compression scheme must be randomized, if it gains non-trivially from the prior μ. This stands in contrast to the symmetric case (when both the encoder and decoder know μ), where the Huffman code provides a near-optimal deterministic solution. On the other hand, a rather simple argument shows that, when |S| = k, a random linear code achieves near-optimal communication rate of about k · H(μ) bits. Alas, the resulting scheme has prohibitive decoding time: about $\binom{N}{k} \approx (N/k)^k$. Our main result is a computationally efficient and linear coding scheme, which achieves an $O(\lg\lg N)$-competitive communication ratio compared to the optimal benchmark, and runs in poly(N,k) time. Our "multi-level" coding scheme uses a combination of hashing and syndrome-decoding of Reed-Solomon codes, and relies on viewing the (unknown) prior μ as a rather small convex combination of uniform ("flat") distributions.
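The random-linear-code baseline mentioned in the abstract can be illustrated with a small sketch. It is not the paper's construction, and everything in it is an assumption made for illustration: arithmetic over GF(2), an encoder that sends the XOR of the random matrix columns indexed by S, a brute-force decoder that returns the most probable matching k-subset under μ, and the hypothetical function names `random_binary_matrix`, `encode`, and `decode`. The sketch length m is left as a free parameter; the abstract's claim is that about k · H(μ) bits suffice, which this code does not verify.

```python
import itertools
import math
import random


def random_binary_matrix(m, n, rng):
    """Random m x n matrix over GF(2), stored as n column bitmasks of m bits each."""
    return [rng.getrandbits(m) for _ in range(n)]


def encode(columns, subset):
    """Oblivious linear encoding: XOR of the columns indexed by the subset.

    The encoder never looks at the prior mu.
    """
    syndrome = 0
    for i in subset:
        syndrome ^= columns[i]
    return syndrome


def decode(columns, syndrome, k, mu):
    """Brute-force decoding (assumed here, not taken from the paper).

    Enumerates every k-subset of the universe and returns the one with the
    highest probability under mu among those matching the syndrome.
    Assumes mu[i] > 0 for every item i.
    """
    best, best_logp = None, -math.inf
    for cand in itertools.combinations(range(len(columns)), k):
        if encode(columns, cand) != syndrome:
            continue
        logp = sum(math.log(mu[i]) for i in cand)
        if logp > best_logp:
            best, best_logp = cand, logp
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    n, k, m = 16, 2, 64  # illustrative sizes only
    mu = [0.4, 0.3] + [0.3 / 14] * 14  # prior known only to the decoder
    cols = random_binary_matrix(m, n, rng)
    s = (0, 5)
    print(decode(cols, encode(cols, s), k, mu))
```

The loop in `decode` visits all $\binom{N}{k}$ candidate subsets, which is the prohibitive decoding cost the abstract refers to.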