Learning Feature Hierarchies with Centered Deep Boltzmann Machines

Abstract

Deep Boltzmann machines are in principle powerful models for extracting the hierarchical structure of data. Unfortunately, attempts to train layers jointly (without greedy layer-wise pretraining) have been largely unsuccessful. We propose a modification of the learning algorithm that initially recenters the output of the activation functions to zero. This modification leads to a better conditioned Hessian and thus makes learning easier. We test the algorithm on real data and demonstrate that our suggestion, the centered deep Boltzmann machine, learns a hierarchy of increasingly abstract representations and a better generative model of data.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…