Convergence of Contrastive Divergence with Annealed Learning Rate in Exponential Family

Abstract

In our recent paper, we showed that for exponential families, contrastive divergence (CD) with a fixed learning rate gives asymptotically consistent estimates [wu2016convergence]. In this paper, we establish consistency and a convergence rate for CD with an annealed learning rate $\eta_t$. Specifically, suppose CD-$m$ generates the sequence of parameters $\{\theta_t\}_{t \ge 0}$ using an i.i.d. data sample $X_1^n \sim p_{\theta^*}$ of size $n$. Then $\delta_n(X_1^n) = \limsup_{t \to \infty} \left\| \sum_{s=t_0}^{t} \eta_s \theta_s \big/ \sum_{s=t_0}^{t} \eta_s - \theta^* \right\|$ converges in probability to 0 at a rate of $1/\sqrt[3]{n}$. The number $m$ of MCMC transitions in CD affects only the coefficient factor of the convergence rate. Our proof is not a simple extension of the one in [wu2016convergence], which depends critically on the fact that $\{\theta_t\}_{t \ge 0}$ is a homogeneous Markov chain conditional on the observed sample $X_1^n$. Under an annealed learning rate, the homogeneous Markov property is not available, and we have to develop an alternative approach based on super-martingales. Experimental results of CD on a fully visible $2 \times 2$ Boltzmann machine are provided to demonstrate our theoretical results.
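
To make the quantities in the abstract concrete, the following is a minimal Python sketch of CD-$m$ with an annealed learning rate and the $\eta$-weighted average of the iterates. It is an illustration under stated assumptions, not the authors' implementation: the model is a fully visible Boltzmann machine with $\pm 1$ units and pairwise couplings only, the MCMC kernel is a Gibbs sweep, the schedule $\eta_t = \eta_0 / t^{a}$ is a hypothetical choice, and every function and parameter name (`suff_stats`, `cd_annealed`, `eta0`, `decay`, and so on) is invented for this example.

```python
import itertools
import numpy as np


def suff_stats(x):
    """Pairwise products x_i * x_j (i < j) for each row of a (n, d) array of +/-1 spins."""
    i, j = np.triu_indices(x.shape[1], k=1)
    return x[:, i] * x[:, j]


def to_matrix(theta, d):
    """Arrange the coupling vector theta into a symmetric d x d matrix with zero diagonal."""
    W = np.zeros((d, d))
    i, j = np.triu_indices(d, k=1)
    W[i, j] = theta
    return W + W.T


def sample_exact(theta, d, n, rng):
    """Draw n i.i.d. samples from p_theta by enumerating all 2^d states (small d only)."""
    states = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))
    logp = suff_stats(states) @ theta
    p = np.exp(logp - logp.max())
    p /= p.sum()
    return states[rng.choice(len(states), size=n, p=p)]


def gibbs_sweep(x, W, rng):
    """One Gibbs sweep over all units, applied to every row of x in parallel."""
    x = x.copy()
    for k in range(x.shape[1]):
        field = x @ W[:, k]  # W[k, k] == 0, so the unit's own value is excluded
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
        x[:, k] = np.where(rng.random(x.shape[0]) < p_plus, 1.0, -1.0)
    return x


def cd_annealed(data, m=1, steps=500, eta0=0.5, decay=0.75, t0=1, seed=0):
    """CD-m with the assumed schedule eta_t = eta0 / t**decay.

    Returns the eta-weighted average of the iterates from step t0 on,
    together with the last iterate.
    """
    rng = np.random.default_rng(seed)
    d = data.shape[1]
    data_mean = suff_stats(data).mean(axis=0)
    theta = np.zeros(d * (d - 1) // 2)
    num, den = np.zeros_like(theta), 0.0
    for t in range(1, steps + 1):
        eta = eta0 / t ** decay
        if t >= t0:
            num += eta * theta  # accumulate sum_s eta_s * theta_s
            den += eta          # accumulate sum_s eta_s
        W = to_matrix(theta, d)
        x = data
        for _ in range(m):      # m MCMC transitions started at the data
            x = gibbs_sweep(x, W, rng)
        theta = theta + eta * (data_mean - suff_stats(x).mean(axis=0))
    return num / den, theta
```

A possible use, with four units and an arbitrary true parameter chosen only for illustration: draw `data = sample_exact(theta_star, 4, n, rng)`, call `avg, last = cd_annealed(data, m=1)`, and compare `np.linalg.norm(avg - theta_star)` across several sample sizes `n` to observe how the error of the averaged estimate shrinks.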
