Strong Data Processing Inequalities for Input Constrained Additive Noise Channels
Abstract
This paper quantifies the intuitive observation that adding noise reduces available information by means of non-linear strong data processing inequalities. Consider the random variables W X Y forming a Markov chain, where Y=X+Z with X and Z real-valued, independent and X bounded in Lp-norm. It is shown that I(W;Y) FI(I(W;X)) with FI(t)<t whenever t>0, if and only if Z has a density whose support is not disjoint from any translate of itself. A related question is to characterize for what couplings (W,X) the mutual information I(W;Y) is close to maximum possible. To that end we show that in order to saturate the channel, i.e. for I(W;Y) to approach capacity, it is mandatory that I(W;X)∞ (under suitable conditions on the channel). A key ingredient for this result is a deconvolution lemma which shows that post-convolution total variation distance bounds the pre-convolution Kolmogorov-Smirnov distance. Explicit bounds are provided for the special case of the additive Gaussian noise channel with quadratic cost constraint. These bounds are shown to be order-optimal. For this case simplified proofs are provided leveraging Gaussian-specific tools such as the connection between information and estimation (I-MMSE) and Talagrand's information-transportation inequality.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.