Efficiency of ANS Entropy Encoders

Abstract

Asymmetric Numeral Systems (ANS) is a class of entropy encoders that has had an immense impact on data compression, replacing arithmetic and Huffman coding in many applications. It has been studied by different authors, but the precise asymptotics of its redundancy (in relation to the entropy) were not completely understood. We obtain optimal bounds for the redundancy of the tabled ANS (tANS), the most popular ANS variant. Given a sequence $a_1,a_2,\ldots,a_n$ of symbols from an alphabet $\{0,1,\ldots,\sigma-1\}$ such that each symbol $a$ occurs in it $f_a$ times and $n=2^r$, the tANS encoder using Duda's ``precise initialization'' to fill tANS tables transforms this sequence into a bit string of the following length (the frequencies are not included in the encoding): $\sum_{a\in[0..\sigma)} f_a\cdot\log\frac{n}{f_a}+O(\sigma+r)$, where $O(\sigma+r)$ can be bounded by $\sigma\log e+r$. The $r$-bit term is an artifact indispensable to ANS; the rest incurs a redundancy of $O(\frac{\sigma}{n})$ bits per symbol. We complement this by examples showing that an $\Omega(\sigma+r)$ redundancy is necessary. We argue that similar examples exist for most adequate initialization methods for tANS. Thus, we refute Duda's conjecture that the redundancy is $O(\frac{\sigma}{n^2})$ bits per symbol. We also propose a variant of the range ANS (rANS), called rANS with fixed accuracy, parameterized by $k\ge 1$, that in certain conditions might be faster than the standard rANS because it avoids slow explicit division operations. We bound the redundancy for our rANS variant by $\frac{n}{2^k-1}\log e+r+k$.
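
To make the tANS length bound concrete, the following minimal Python sketch evaluates the entropy term $\sum_a f_a\cdot\log\frac{n}{f_a}$ and the overhead bound $\sigma\log e+r$ for a given frequency table. It assumes that logarithms are base 2 (the bounds are stated in bits) and that $\sigma$ is the length of the frequency table; the function name `tans_length_bound` is hypothetical and not taken from the paper. The sketch only evaluates the stated formula; it does not implement a tANS encoder.

```python
import math


def tans_length_bound(freqs):
    """Evaluate the stated tANS bound for symbol frequencies `freqs`.

    Returns (entropy_bits, overhead_bits), where
      entropy_bits  = sum over symbols of f_a * log2(n / f_a)
      overhead_bits = sigma * log2(e) + r
    Assumes base-2 logarithms and n = sum(freqs) = 2**r.
    """
    n = sum(freqs)
    r = n.bit_length() - 1
    if n <= 0 or (1 << r) != n:
        raise ValueError("the bound is stated for n = 2^r")
    sigma = len(freqs)  # assumption: alphabet size = table length
    # Symbols with zero frequency contribute nothing to the entropy term.
    entropy_bits = sum(f * math.log2(n / f) for f in freqs if f > 0)
    overhead_bits = sigma * math.log2(math.e) + r
    return entropy_bits, overhead_bits


if __name__ == "__main__":
    entropy_bits, overhead_bits = tans_length_bound([8, 4, 2, 2])
    print(f"entropy term:   {entropy_bits:.2f} bits")
    print(f"overhead bound: {overhead_bits:.2f} bits")
```

For the frequencies 8, 4, 2, 2 (so $n=16$, $r=4$, $\sigma=4$), the entropy term is 28 bits and the overhead bound is about 9.77 bits, of which 4 bits are the $r$-bit term.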
