On the Parameterized Complexity of Polytree Learning
Abstract
A Bayesian network is a directed acyclic graph that represents statistical dependencies between variables of a joint probability distribution. A fundamental task in data science is to learn a Bayesian network from observed data. Polytree Learning is the problem of learning an optimal Bayesian network that fulfills the additional property that its underlying undirected graph is a forest. In this work, we revisit the complexity of Polytree Learning. We show that Polytree Learning can be solved in 3n · |I|O(1) time where n is the number of variables and |I| is the total instance size. Moreover, we consider the influence of the number of variables d that might receive a nonempty parent set in the final DAG on the complexity of Polytree Learning. We show that Polytree Learning has no f(d)· |I|O(1)-time algorithm, unlike Bayesian network learning which can be solved in 2d · |I|O(1) time. We show that, in contrast, if d and the maximum parent set size are bounded, then we can obtain efficient algorithms.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.