Minimum Description Length Principle for Maximum Entropy Model Selection
Abstract
Model selection is central to statistics, and many learning problems can be formulated as model selection problems. In this paper, we treat the problem of selecting a maximum entropy model, given various feature subsets and their moments, as a model selection problem, and we present a minimum description length (MDL) formulation to solve it. To this end, we derive the normalized maximum likelihood (NML) codelength for these models. Furthermore, we prove that the minimax entropy principle is a special case of maximum entropy model selection in which the complexities of all the models are assumed to be equal. We apply our approach to the gene selection problem and present simulation results.
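As a rough sketch of the quantities the abstract names, the following uses the standard textbook definition of the NML codelength for i.i.d. discrete data; it is not the paper's own derivation. All notation is assumed for illustration: a feature subset S with features f_j, the corresponding maximum entropy model class M_S, maximum likelihood parameters written with a hat, the entropy H, and the parametric complexity COMP_n.

```latex
% Notation is assumed for illustration; the paper's own symbols may differ.
% Assumes i.i.d. discrete data x^n = (x_1, ..., x_n).
\begin{align}
  % Standard NML codelength: fit term plus log of the normalizing sum.
  L_{\mathrm{NML}}(x^n \mid \mathcal{M})
    &= -\log p\bigl(x^n ; \hat{\theta}(x^n)\bigr)
       + \log \sum_{y^n} p\bigl(y^n ; \hat{\theta}(y^n)\bigr), \\
  % For a maximum entropy (exponential family) model, the fitted moments
  % equal the empirical moments, so the fit term is n times the entropy.
  -\log p\bigl(x^n ; \hat{\lambda}(x^n)\bigr)
    &= n \, H\bigl(p_{\hat{\lambda}}\bigr)
    \quad \text{for } p_{\lambda}(x)
      = \frac{1}{Z(\lambda)} \exp\Bigl(\sum_{j \in S} \lambda_j f_j(x)\Bigr), \\
  % Combining the two, with COMP_n denoting the normalizing sum above.
  L_{\mathrm{NML}}(x^n \mid \mathcal{M}_S)
    &= n \, H\bigl(p_{\hat{\lambda}}\bigr)
       + \log \mathrm{COMP}_n(\mathcal{M}_S).
\end{align}
```

Under these assumptions, if the complexity term is the same for every candidate feature subset, minimizing the codelength amounts to choosing the subset whose maximum entropy model has the smallest entropy, which is the minimax entropy criterion the abstract refers to.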