Quantization via Empirical Divergence Maximization
Abstract
Empirical divergence maximization (EDM) refers to a recently proposed strategy for estimating f-divergences and likelihood ratio functions. This paper extends the idea to empirical vector quantization, where one seeks to derive from training data quantization rules that maximize the Kullback-Leibler divergence between two statistical hypotheses. We analyze the estimator's error convergence rate using Tsybakov's margin condition and show that rates as fast as 1/n are possible, where n is the number of training samples. We also show that the Flynn and Gray algorithm can be used to compute EDM estimates efficiently, and that these estimates can be represented efficiently and accurately by recursive dyadic partitions. The EDM formulation has several advantages. First, it gives access to the tools and results of empirical process theory, which quantify the estimator's error convergence rate. Second, it provides a previously unknown derivation of the Flynn and Gray algorithm. Third, the flexibility it affords allows one to avoid the small-cell assumption common in other approaches. Finally, we illustrate the potential use of the method through an example.
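As a rough illustration of the objective described above, the following sketch scores a candidate quantizer by the Kullback-Leibler divergence between the cell distributions it induces under the two hypotheses, using simple plug-in cell frequencies. It is a minimal sketch under stated assumptions, not the paper's estimator or the Flynn and Gray algorithm: the names `quantized_kl` and `threshold_quantizer`, the scalar threshold quantizers, the `eps` floor for empty cells, and the toy samples are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def quantized_kl(samples_p, samples_q, quantizer, eps=1e-12):
    """Plug-in KL divergence between the cell distributions that
    `quantizer` induces on samples from hypothesis P and hypothesis Q.

    Assumption: empty Q-cells are floored at `eps` to keep the log finite.
    """
    cells_p = Counter(quantizer(x) for x in samples_p)
    cells_q = Counter(quantizer(x) for x in samples_q)
    n_p, n_q = len(samples_p), len(samples_q)
    kl = 0.0
    for cell, count in cells_p.items():
        p_hat = count / n_p                          # empirical P-mass of the cell
        q_hat = max(cells_q.get(cell, 0) / n_q, eps)  # empirical Q-mass of the cell
        kl += p_hat * math.log(p_hat / q_hat)
    return kl


def threshold_quantizer(t):
    """Hypothetical two-cell quantizer: cell 0 if x <= t, cell 1 otherwise."""
    return lambda x: int(x > t)


# Toy training samples (illustrative only, not data from the paper).
samples_p = [0.1, 0.2, 0.3, 0.9]
samples_q = [0.7, 0.8, 0.9, 0.2]

# Pick the candidate threshold whose quantizer maximizes the empirical divergence.
candidates = [0.25, 0.5, 0.75]
best_t = max(
    candidates,
    key=lambda t: quantized_kl(samples_p, samples_q, threshold_quantizer(t)),
)
```

The exhaustive search over a handful of thresholds only makes the "maximize the empirical divergence over quantization rules" idea concrete. It says nothing about how the search is carried out efficiently or over what class of partitions.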