Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning
Abstract
We design efficient distance approximation algorithms for several classes of structured high-dimensional distributions. Specifically, we show algorithms for the following problems:

- Given sample access to two Bayesian networks P1 and P2 over known directed acyclic graphs G1 and G2 having n nodes and bounded in-degree, approximate dtv(P1, P2) to within additive error ε using poly(n, 1/ε) samples and time.
- Given sample access to two ferromagnetic Ising models P1 and P2 on n variables with bounded width, approximate dtv(P1, P2) to within additive error ε using poly(n, 1/ε) samples and time.
- Given sample access to two n-dimensional Gaussians P1 and P2, approximate dtv(P1, P2) to within additive error ε using poly(n, 1/ε) samples and time.
- Given access to observations from two causal models P and Q on n variables that are defined over known causal graphs, approximate dtv(Pa, Qa) to within additive error ε using poly(n, 1/ε) samples, where Pa and Qa are the interventional distributions obtained by the intervention do(A=a) on P and Q respectively for a particular variable A.

Our results are the first efficient distance approximation algorithms for these well-studied problems. They are derived using a simple and general connection to distribution learning algorithms. The distance approximation algorithms imply new efficient algorithms for tolerant testing of closeness of the above-mentioned structured high-dimensional distributions.
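The connection to distribution learning can be illustrated on a toy case. The sketch below is not the paper's algorithm; it is a minimal illustration under stated assumptions: the two distributions are product distributions over {0,1}^n (a Bayesian network on an empty graph, so in-degree 0), each is learned by Laplace-smoothed empirical frequencies, and the distance between the learned models is then estimated by Monte Carlo using the identity dtv(P, Q) = E_{x~P}[max(0, 1 - Q(x)/P(x))]. All function names (learn_product, prob, sample, estimate_dtv) are hypothetical and chosen for illustration.

```python
import random


def learn_product(samples, n):
    """Learn a product distribution over {0,1}^n from samples.

    Returns Laplace-smoothed estimates of Pr[X_i = 1], so every
    learned probability is strictly between 0 and 1.
    """
    m = len(samples)
    return [(sum(x[i] for x in samples) + 1) / (m + 2) for i in range(n)]


def prob(p, x):
    """Probability mass that the product distribution p assigns to x."""
    out = 1.0
    for pi, xi in zip(p, x):
        out *= pi if xi == 1 else 1.0 - pi
    return out


def sample(p, rng):
    """Draw one point from the product distribution p."""
    return tuple(1 if rng.random() < pi else 0 for pi in p)


def estimate_dtv(p_hat, q_hat, trials, rng):
    """Monte Carlo estimate of dtv(p_hat, q_hat).

    Uses dtv(P, Q) = E_{x ~ P}[max(0, 1 - Q(x) / P(x))], which needs
    only sampling from P and evaluation of both probability functions.
    """
    total = 0.0
    for _ in range(trials):
        x = sample(p_hat, rng)
        total += max(0.0, 1.0 - prob(q_hat, x) / prob(p_hat, x))
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed ground-truth distributions, used only to generate samples.
    p_true = [0.5, 0.5, 0.5]
    q_true = [0.9, 0.5, 0.5]
    p_hat = learn_product([sample(p_true, rng) for _ in range(20000)], 3)
    q_hat = learn_product([sample(q_true, rng) for _ in range(20000)], 3)
    # The true distance here is 0.4, since only the first coordinate differs.
    print(estimate_dtv(p_hat, q_hat, 20000, rng))
```

The design point this sketch tries to convey is that once each distribution is learned in a form that supports both sampling and probability evaluation, the distance between the learned models can be estimated without enumerating all 2^n outcomes. The estimate is only as good as the learned models, so the overall error combines the learning error and the Monte Carlo error.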