Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu
Abstract
We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans.~Inform.~Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution $P$ on $\Sigma^n$ and a tree $T$ on $n$ nodes, we say $T$ is an $\varepsilon$-approximate tree for $P$ if there is a $T$-structured distribution $Q$ such that $D(P\;||\;Q)$ is at most $\varepsilon$ more than the best possible tree-structured distribution for $P$. We show that if $P$ itself is tree-structured, then the Chow-Liu algorithm with the plug-in estimator for mutual information with $\widetilde{O}(|\Sigma|^3 n\varepsilon^{-1})$ i.i.d.~samples outputs an $\varepsilon$-approximate tree for $P$ with constant probability. In contrast, for a general $P$ (which may not be tree-structured), $\Omega(n^2\varepsilon^{-2})$ samples are necessary to find an $\varepsilon$-approximate tree. Our upper bound is based on a new conditional independence tester that addresses an open problem posed by Canonne, Diakonikolas, Kane, and Stewart~(STOC, 2018): we prove that for three random variables $X,Y,Z$ each over $\Sigma$, testing if $I(X; Y \mid Z)$ is $0$ or $\geq \varepsilon$ is possible with $\widetilde{O}(|\Sigma|^3/\varepsilon)$ samples. Finally, we show that for a specific tree $T$, with $\widetilde{O}(|\Sigma|^2 n\varepsilon^{-1})$ samples from a distribution $P$ over $\Sigma^n$, one can efficiently learn the closest $T$-structured distribution in KL divergence by applying the add-1 estimator at each node.
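
As an illustration of the two estimators named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes samples are given as equal-length tuples of hashable symbols, measures mutual information in nats, and uses Kruskal's algorithm for the maximum-weight spanning tree (one standard choice; the abstract does not prescribe one). The function names `plug_in_mutual_information`, `chow_liu_tree`, and `add_one_conditional` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def plug_in_mutual_information(samples, i, j):
    """Plug-in estimate of I(X_i; X_j) in nats, from empirical frequencies."""
    m = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    left = Counter(s[i] for s in samples)
    right = Counter(s[j] for s in samples)
    # Sum over observed pairs of p(a,b) * log(p(a,b) / (p(a) p(b))).
    return sum(
        (c / m) * math.log(c * m / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


def chow_liu_tree(samples):
    """Edges of a maximum-weight spanning tree under plug-in mutual information."""
    n = len(samples[0])
    weighted = sorted(
        ((plug_in_mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    parent = list(range(n))  # union-find forest for Kruskal's algorithm

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it creates no cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges


def add_one_conditional(samples, child, parent_node, alphabet):
    """Add-1 (Laplace) estimate of P(X_child = a | X_parent_node = b).

    Returns a dict keyed by (b, a). Assumes `alphabet` lists every symbol.
    """
    pair = Counter((s[parent_node], s[child]) for s in samples)
    marginal = Counter(s[parent_node] for s in samples)
    k = len(alphabet)
    return {
        (b, a): (pair[(b, a)] + 1) / (marginal[b] + k)
        for b in alphabet
        for a in alphabet
    }
```

For example, on `samples = [(0, 0, 0)] * 3 + [(0, 0, 1)] + [(1, 1, 1)] * 3 + [(1, 1, 0)]`, the first two coordinates are identical, so `chow_liu_tree(samples)` includes the edge `(0, 1)`, and `add_one_conditional(samples, 2, 1, [0, 1])` smooths the conditional frequencies of the third coordinate given the second so that no entry is zero.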