Bayesian Hypernetworks

Aaron Courville

Abstract

We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork h is a neural network which learns to transform a simple noise distribution, p(ε) = N(0, I), to a distribution q(θ) := q(h(ε)) over the parameters θ of another neural network (the "primary network"). We train q with variational inference, using an invertible h to enable efficient estimation of the variational lower bound on the posterior p(θ | D) via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap iid sampling of q(θ). In practice, Bayesian hypernets can provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks which evaluate model uncertainty, including regularization, active learning, and anomaly detection.
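The sampling-based estimate of the variational lower bound described in the abstract can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the hypernetwork is a hypothetical elementwise affine map (the simplest invertible h, which yields only a factorized Gaussian q rather than the multimodal, correlated posterior the abstract describes), the primary network is a linear model, the prior is N(0, I), and the names `hypernet`, `log_likelihood`, and `elbo_estimate` are invented. The point is only that invertibility gives log q(θ) through the change-of-variables formula, so the bound can be estimated from samples of ε.

```python
import numpy as np

def log_std_normal(x):
    """Log density of N(0, I), evaluated at each row of x."""
    d = x.shape[-1]
    return -0.5 * np.sum(x ** 2, axis=-1) - 0.5 * d * np.log(2.0 * np.pi)

def hypernet(eps, mu, log_sigma):
    """Hypothetical invertible h: an elementwise affine map of the noise.

    Returns theta = h(eps) and log q(theta) via change of variables:
    log q(theta) = log p(eps) - log|det dh/d(eps)|.
    """
    theta = mu + np.exp(log_sigma) * eps
    log_det = np.sum(log_sigma)  # the Jacobian is diagonal
    return theta, log_std_normal(eps) - log_det

def log_likelihood(theta, X, y, noise_std=0.1):
    """Gaussian log likelihood of an assumed linear primary network."""
    resid = y[None, :] - theta @ X.T  # shape (n_samples, n_data)
    n = X.shape[0]
    return (-0.5 * np.sum(resid ** 2, axis=-1) / noise_std ** 2
            - n * np.log(noise_std) - 0.5 * n * np.log(2.0 * np.pi))

def elbo_estimate(mu, log_sigma, X, y, n_samples=64, seed=0):
    """Monte Carlo estimate of E_q[log p(D|theta) + log p(theta) - log q(theta)]."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n_samples, mu.shape[0]))
    theta, log_q = hypernet(eps, mu, log_sigma)
    log_prior = log_std_normal(theta)  # assumed N(0, I) prior on theta
    return np.mean(log_likelihood(theta, X, y, noise_std=0.1) + log_prior - log_q)

# Toy data generated from known weights (illustrative only).
rng = np.random.default_rng(1)
true_w = np.array([1.0, -2.0, 0.5])
X = rng.standard_normal((20, 3))
y = X @ true_w
elbo_at_zero = elbo_estimate(np.zeros(3), np.zeros(3), X, y)
elbo_near_truth = elbo_estimate(true_w, np.full(3, -3.0), X, y)
```

In this toy setting the estimate is higher when q is concentrated near the weights that generated the data, which is the signal a gradient-based optimizer of the hypernetwork parameters would follow. A richer invertible h, such as a normalizing flow, would replace `hypernet` while keeping the same change-of-variables bookkeeping.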
