Three Costs of Amortizing Gaussian Process Inference with Neural Processes

Robin Young

Abstract

Neural processes amortize Gaussian process (GP) inference, replacing the exact $O(n^3)$ posterior computation with a learned $O(n)$ map from context sets to predictive distributions. For a class of latent neural processes (LNPs), we bound the Kullback--Leibler (KL) divergence between the GP and LNP predictives and decompose it into three interpretable sources: label contamination, because the neural process uses label values to estimate a quantity that is label-independent in the exact GP; an information bottleneck, because the finite-dimensional representation cannot resolve the full context geometry; and amortization error, from a single encoder network shared across all contexts. The bottleneck truncation term decays in the representation dimension $d$ as $O(e^{-c d^{2/d_x}})$ for squared-exponential kernels on $\mathbb{R}^{d_x}$, where $c > 0$ is a kernel-dependent constant, and as $O(d^{-2\nu/d_x})$ for Matérn-$\nu$ kernels, directly linking architecture sizing to kernel smoothness and input dimension. The label contamination term is $O(1)$ in general, with only the observation-noise component decaying as $O(1/n)$, which identifies a persistent cost of routing uncertainty estimation through a label-dependent representation. These results characterize the costs of amortization within the analyzed class and yield two architectural recommendations: predict variance from context locations alone in the GP-amortization regime, and replace mean aggregation with second-order pooling to close the dominant amortization gap.
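The label-independence behind the label contamination term can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a zero-mean GP with a unit-variance squared-exponential kernel, and the names `se_kernel`, `gp_posterior`, `lengthscale`, and `noise_var` are hypothetical. It computes the exact posterior through a Cholesky factorization (the cubic-cost step) for two different label vectors on the same context locations, and compares the resulting variances.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel matrix between row-stacked inputs a and b."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def gp_posterior(x_ctx, y_ctx, x_tgt, noise_var=0.1):
    """Exact zero-mean GP posterior mean and latent variance at x_tgt."""
    k_cc = se_kernel(x_ctx, x_ctx) + noise_var * np.eye(len(x_ctx))
    k_tc = se_kernel(x_tgt, x_ctx)
    chol = np.linalg.cholesky(k_cc)  # cubic-cost step in the context size n
    # The mean is the only place the labels y_ctx enter.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_ctx))
    mean = k_tc @ alpha
    # The variance uses context and target locations only.
    v = np.linalg.solve(chol, k_tc.T)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance k(x, x) = 1 for this kernel
    return mean, var

rng = np.random.default_rng(0)
x_ctx = rng.uniform(-3.0, 3.0, size=(20, 1))
x_tgt = np.linspace(-3.0, 3.0, 5)[:, None]

# Same context locations, two unrelated label vectors (assumed toy data).
y_a = np.sin(x_ctx[:, 0])
y_b = rng.normal(size=20)

mean_a, var_a = gp_posterior(x_ctx, y_a, x_tgt)
mean_b, var_b = gp_posterior(x_ctx, y_b, x_tgt)

print("variances identical:", np.allclose(var_a, var_b))
print("means identical:", np.allclose(mean_a, mean_b))
```

In this toy setting the variances agree across the two label vectors while the means differ, which is the property a variance head conditioned only on context locations would preserve by construction.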
