Precise Bayesian Neural Networks
Abstract
Despite its long history, Bayesian neural networks (BNNs) and variational training remain underused in practice: standard Gaussian posteriors misalign with network geometry, KL terms can be brittle in high dimensions, and implementations often add complexity without reliably improving uncertainty. We revisit the problem through the lens of normalization. Because normalization layers neutralize the influence of weight magnitude, we model uncertainty only in weight directions using a von Mises-Fisher posterior on the unit sphere. High-dimensional geometry then yields a single, interpretable scalar per layer--the effective post-normalization noise σeff--that (i) corresponds to simple additive Gaussian noise in the forward pass and (ii) admits a compact, dimension-aware KL in closed form. We derive accurate, closed-form approximations linking concentration to activation variance and to σeff across regimes, producing a lightweight, implementation-ready variational unit that fits modern normalized architectures and improves calibration without sacrificing accuracy. This dimension awareness is critical for stable optimization in high dimensions. In short, by aligning the variational posterior with the network's intrinsic geometry, BNNs can be simultaneously principled, practical, and precise.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.