When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CCC.2015.42
URN: urn:nbn:de:0030-drops-50755
URL: https://drops.dagstuhl.de/opus/volltexte/2015/5075/
 Go to the corresponding LIPIcs Volume Portal

### An Entropy Sumset Inequality and Polynomially Fast Convergence to Shannon Capacity Over All Alphabets

 pdf-format:

### Abstract

We prove a lower estimate on the increase in entropy when two copies of a conditional random variable X | Y, with X supported on Z_q={0,1,...,q-1} for prime q, are summed modulo q. Specifically, given two i.i.d. copies (X_1,Y_1) and (X_2,Y_2) of a pair of random variables (X,Y), with X taking values in Z_q, we show

H(X_1 + X_2 \mid Y_1, Y_2) - H(X|Y) >=e alpha(q) * H(X|Y) (1-H(X|Y))

for some alpha(q) > 0, where H(.) is the normalized (by factor log_2(q)) entropy. In particular, if X | Y is not close to being fully random or fully deterministic and H(X| Y) \in (gamma,1-gamma), then the entropy of the sum increases by Omega_q(gamma). Our motivation is an effective analysis of the finite-length behavior of polar codes, for which the linear dependence on gamma is quantitatively important. The assumption of q being prime is necessary: for X supported uniformly on a proper subgroup of Z_q we have H(X+X)=H(X). For X supported on infinite groups without a finite subgroup (the torsion-free case) and no conditioning, a sumset inequality for the absolute increase in (unnormalized) entropy was shown by Tao in [Tao, CP&R 2010].

We use our sumset inequality to analyze Ari kan's construction of polar codes and prove that for any q-ary source X, where q is any fixed prime, and anyepsilon > 0, polar codes allow efficient data compression of N i.i.d. copies of X into (H(X)+epsilon)N q-ary symbols, as soon as N is polynomially large in 1/epsilon. We can get capacity-achieving source codes with similar guarantees for composite alphabets, by factoring q into primes and combining different polar codes for each prime in factorization.

A consequence of our result for noisy channel coding is that for all discrete memoryless channels, there are explicit codes enabling reliable communication within epsilon > 0 of the symmetric Shannon capacity for a block length and decoding complexity bounded by a polynomial in 1/epsilon. The result was previously shown for the special case of binary-input channels [Guruswami/Xial, FOCS'13; Hassani/Alishahi/Urbanke, CoRR 2013], and this work extends the result to channels over any alphabet.

### BibTeX - Entry

@InProceedings{guruswami_et_al:LIPIcs:2015:5075,
author =	{Venkatesan Guruswami and Ameya Velingker},
title =	{{An Entropy Sumset Inequality and Polynomially Fast Convergence to Shannon Capacity Over All Alphabets}},
booktitle =	{30th Conference on Computational Complexity (CCC 2015)},
pages =	{42--57},
series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN =	{978-3-939897-81-1},
ISSN =	{1868-8969},
year =	{2015},
volume =	{33},
editor =	{David Zuckerman},
publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},