An Entropy Sumset Inequality and Polynomially Fast Convergence to Shannon Capacity Over All Alphabets
Abstract
We prove a lower estimate on the increase in entropy when two copies of a conditional random variable X | Y, with X supported on Z_q = {0, 1, …, q-1} for prime q, are summed modulo q. Specifically, given two i.i.d. copies (X_1, Y_1) and (X_2, Y_2) of a pair of random variables (X, Y), with X taking values in Z_q, we show \[ H(X_1 + X_2 \mid Y_1, Y_2) - H(X \mid Y) \ge \alpha(q) \cdot H(X \mid Y) \, (1 - H(X \mid Y)) \] for some α(q) > 0, where H(·) is the normalized (by factor log_2 q) entropy. Our motivation is an effective analysis of the finite-length behavior of polar codes, and the assumption of q being prime is necessary. For X supported on infinite groups without a finite subgroup and no conditioning, a sumset inequality for the absolute increase in (unnormalized) entropy was shown by Tao (2010). We use our sumset inequality to analyze Arıkan's construction of polar codes and prove that for any q-ary source X, where q is any fixed prime, and any ε > 0, polar codes allow efficient data compression of N i.i.d. copies of X into (H(X)+ε)N q-ary symbols, as soon as N is polynomially large in 1/ε. We can get capacity-achieving source codes with similar guarantees for composite alphabets by factoring q into primes and combining different polar codes for each prime in the factorization. A consequence of our result for noisy channel coding is that for all discrete memoryless channels, there are explicit codes enabling reliable communication within ε > 0 of the symmetric Shannon capacity, with block length and decoding complexity bounded by a polynomial in 1/ε. The result was previously shown for the special case of binary-input channels (Guruswami-Xia '13 and Hassani-Alishahi-Urbanke '13), and this work extends the result to channels over any alphabet.
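The two sides of the inequality can be evaluated numerically for a small case. The sketch below is only an illustration and is not taken from the paper: the function names and the example joint distribution (q = 3, a two-valued Y) are hypothetical choices. It computes the normalized conditional entropy H(X | Y), the gap H(X_1 + X_2 | Y_1, Y_2) - H(X | Y), and the ratio of that gap to H(X | Y)(1 - H(X | Y)). The ratio for one distribution is not α(q) itself; α(q) would have to be a lower bound on this ratio over all joint distributions.

```python
import math

def normalized_entropy(p, q):
    """Entropy of a distribution p on Z_q, divided by log2(q)."""
    return -sum(x * math.log2(x) for x in p if x > 0) / math.log2(q)

def cyclic_convolve(p1, p2, q):
    """Distribution of (A + B) mod q for independent A ~ p1, B ~ p2."""
    out = [0.0] * q
    for a in range(q):
        for b in range(q):
            out[(a + b) % q] += p1[a] * p2[b]
    return out

def entropy_gap(p_y, p_x_given_y, q):
    """Return (H(X|Y), H(X1+X2 | Y1,Y2) - H(X|Y)), both normalized.

    p_y[i] is P(Y = i); p_x_given_y[i] is the distribution of X given Y = i.
    """
    h_cond = sum(w * normalized_entropy(row, q)
                 for w, row in zip(p_y, p_x_given_y))
    h_sum = 0.0
    # (X1, Y1) and (X2, Y2) are independent, so average over pairs (y1, y2).
    for w1, r1 in zip(p_y, p_x_given_y):
        for w2, r2 in zip(p_y, p_x_given_y):
            h_sum += w1 * w2 * normalized_entropy(cyclic_convolve(r1, r2, q), q)
    return h_cond, h_sum - h_cond

# Hypothetical example distribution, chosen only for illustration.
q = 3
p_y = [0.5, 0.5]
p_x_given_y = [[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]

h, gap = entropy_gap(p_y, p_x_given_y, q)
ratio = gap / (h * (1 - h))
print(f"H(X|Y) = {h:.4f}, gap = {gap:.4f}, gap / (H(1-H)) = {ratio:.4f}")
```

At the two extremes, a uniform X or a deterministic X, both the gap and the factor H(X | Y)(1 - H(X | Y)) vanish, which is consistent with the form of the right-hand side.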