Improved Stein Variational Gradient Descent with Importance Weights
Abstract
Stein Variational Gradient Descent (SVGD) is a popular sampling algorithm used in various machine learning tasks. It is well known that SVGD arises from a discretization of the kernelized gradient flow of the Kullback-Leibler divergence $D_{\mathrm{KL}}(\cdot \mid \pi)$, where $\pi$ is the target distribution. In this work, we propose to enhance SVGD by introducing importance weights, which leads to a new method that we call $\beta$-SVGD. In the continuous-time, infinite-particle regime, the time for this flow to converge to the equilibrium distribution $\pi$, quantified by the Stein Fisher information, depends only very weakly on the initial distribution $\rho_0$ and on $\pi$. This is very different from the kernelized gradient flow of the Kullback-Leibler divergence, whose time complexity depends on $D_{\mathrm{KL}}(\rho_0 \mid \pi)$. Under certain assumptions, we provide a descent lemma for the population limit of $\beta$-SVGD, which recovers the descent lemma for the population limit of SVGD as $\beta \to 0$. We also illustrate the advantages of $\beta$-SVGD over SVGD through experiments.
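For orientation, the baseline update that $\beta$-SVGD builds on can be sketched as follows. This is a minimal NumPy sketch of one step of standard SVGD with an RBF kernel, not of $\beta$-SVGD itself: the abstract does not give the form of the importance weights, so they are deliberately left out. The function names, the fixed bandwidth `h`, the step size, and the Gaussian target in the usage example are all illustrative assumptions.

```python
import numpy as np


def rbf_kernel_and_grad(x, h):
    """RBF kernel matrix and its gradient with respect to the first argument.

    x has shape (n, d). Returns k with k[j, i] = k(x_j, x_i) and
    grad_k with grad_k[j, i] = d k(x_j, x_i) / d x_j, shape (n, n, d).
    """
    diff = x[:, None, :] - x[None, :, :]          # diff[j, i] = x_j - x_i
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / (2.0 * h ** 2))
    grad_k = -diff / h ** 2 * k[:, :, None]
    return k, grad_k


def svgd_step(x, grad_log_pi, step, h):
    """One step of standard (unweighted) SVGD on particles x of shape (n, d).

    grad_log_pi maps an (n, d) array to the (n, d) array of scores
    grad log pi(x_j). The bandwidth h is fixed here for simplicity.
    """
    n = x.shape[0]
    k, grad_k = rbf_kernel_and_grad(x, h)
    score = grad_log_pi(x)
    # phi[i] = (1/n) * sum_j [ k(x_j, x_i) * score_j + grad_{x_j} k(x_j, x_i) ]
    phi = (k.T @ score + grad_k.sum(axis=0)) / n
    return x + step * phi


def standard_normal_score(x):
    """Score of a standard Gaussian target (an assumed toy target)."""
    return -x


if __name__ == "__main__":
    particles = np.linspace(2.0, 4.0, 20).reshape(-1, 1)
    for _ in range(1000):
        particles = svgd_step(particles, standard_normal_score, step=0.1, h=1.0)
    print("particle mean:", particles.mean())
```

The first term in `phi` drives particles toward high-density regions of $\pi$, while the kernel-gradient term pushes them apart. Any importance-weighted variant would change how the contributions over `j` are combined, a detail that has to be taken from the paper body rather than from this abstract.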