Abstract
We prove a lower estimate on the increase in entropy when two copies of a conditional random variable X  Y, with X supported on Z_q={0,1,...,q1} for prime q, are summed modulo q. Specifically, given two i.i.d. copies (X_1,Y_1) and (X_2,Y_2) of a pair of random variables (X,Y), with X taking values in Z_q, we show
H(X_1 + X_2 \mid Y_1, Y_2)  H(XY) >=e alpha(q) * H(XY) (1H(XY))
for some alpha(q) > 0, where H(.) is the normalized (by factor log_2(q)) entropy. In particular, if X  Y is not close to being fully random or fully deterministic and H(X Y) \in (gamma,1gamma), then the entropy of the sum increases by Omega_q(gamma). Our motivation is an effective analysis of the finitelength behavior of polar codes, for which the linear dependence on gamma is quantitatively important. The assumption of q being prime is necessary: for X supported uniformly on a proper subgroup of Z_q we have H(X+X)=H(X). For X supported on infinite groups without a finite subgroup (the torsionfree case) and no conditioning, a sumset inequality for the absolute increase in (unnormalized) entropy was shown by Tao in [Tao, CP&R 2010].
We use our sumset inequality to analyze Ari kan's construction of polar codes and prove that for any qary source X, where q is any fixed prime, and anyepsilon > 0, polar codes allow efficient data compression of N i.i.d. copies of X into (H(X)+epsilon)N qary symbols, as soon as N is polynomially large in 1/epsilon. We can get capacityachieving source codes with similar guarantees for composite alphabets, by factoring q into primes and combining different polar codes for each prime in factorization.
A consequence of our result for noisy channel coding is that for all discrete memoryless channels, there are explicit codes enabling reliable communication within epsilon > 0 of the symmetric Shannon capacity for a block length and decoding complexity bounded by a polynomial in 1/epsilon. The result was previously shown for the special case of binaryinput channels [Guruswami/Xial, FOCS'13; Hassani/Alishahi/Urbanke, CoRR 2013], and this work extends the result to channels over any alphabet.
BibTeX  Entry
@InProceedings{guruswami_et_al:LIPIcs:2015:5075,
author = {Venkatesan Guruswami and Ameya Velingker},
title = {{An Entropy Sumset Inequality and Polynomially Fast Convergence to Shannon Capacity Over All Alphabets}},
booktitle = {30th Conference on Computational Complexity (CCC 2015)},
pages = {4257},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783939897811},
ISSN = {18688969},
year = {2015},
volume = {33},
editor = {David Zuckerman},
publisher = {Schloss DagstuhlLeibnizZentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2015/5075},
URN = {urn:nbn:de:0030drops50755},
doi = {10.4230/LIPIcs.CCC.2015.42},
annote = {Keywords: Polar codes, polynomial gap to capacity, entropy sumset inequality, arbitrary alphabets}
}
Keywords: 

Polar codes, polynomial gap to capacity, entropy sumset inequality, arbitrary alphabets 
Collection: 

30th Conference on Computational Complexity (CCC 2015) 
Issue Date: 

2015 
Date of publication: 

06.06.2015 