Universal Regular Conditional Distributions

Abstract

We introduce a deep learning model that can universally approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space $\mathcal{X}$ to $\mathbb{R}^d$ via a feature map; next, a deep feedforward neural network processes these linearized features; finally, the network's outputs are transformed to the 1-Wasserstein space $\mathcal{P}_1(\mathbb{R}^D)$ via a probabilistic extension of the attention mechanism of Bahdanau et al. (2014). Our model, called the probabilistic transformer (PT), can approximate any continuous function from $\mathbb{R}^d$ to $\mathcal{P}_1(\mathbb{R}^D)$ uniformly on compact sets, with quantitative estimates. We identify two ways in which the PT avoids the curse of dimensionality when approximating $\mathcal{P}_1(\mathbb{R}^D)$-valued functions. In the first approach, we build functions in $C(\mathbb{R}^d, \mathcal{P}_1(\mathbb{R}^D))$ that can be efficiently approximated by a PT, uniformly on any given compact subset of $\mathbb{R}^d$. In the second approach, given any function $f$ in $C(\mathbb{R}^d, \mathcal{P}_1(\mathbb{R}^D))$, we build compact subsets of $\mathbb{R}^d$ on which $f$ can be efficiently approximated by a PT.
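The three phases described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact construction, so everything concrete here is an assumption made for illustration: the feature map is taken to be the identity on vector inputs, the network is a small ReLU perceptron with random weights, and the "probabilistic attention" step is modelled as a softmax over the network's outputs used as mixture weights on a fixed set of point masses (atoms) in the output space. All function and variable names (`feature_map`, `mlp`, `probabilistic_transformer`, `atoms`) are hypothetical.

```python
import numpy as np

def feature_map(x):
    # Phase 1 (assumed): inputs are already vectors, so the feature map is the identity.
    return np.asarray(x, dtype=float)

def mlp(z, params):
    # Phase 2: feedforward network with ReLU activations on the hidden layers.
    h = z
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = params[-1]
    return h @ W + b

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def probabilistic_transformer(x, params, atoms):
    # Phase 3 (assumed form): the network outputs N logits; their softmax gives the
    # weights of a finitely supported measure sum_n w_n * delta_{atoms[n]}.
    weights = softmax(mlp(feature_map(x), params))
    return weights, atoms

def init_params(sizes, rng):
    # Random weights and zero biases for consecutive layer sizes.
    return [(rng.normal(scale=0.5, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

rng = np.random.default_rng(0)
d, D, N = 3, 2, 5                      # input dim, output dim, number of atoms
params = init_params([d, 16, N], rng)  # untrained network, for shape checking only
atoms = rng.normal(size=(N, D))        # support points in the output space

weights, support = probabilistic_transformer(rng.normal(size=d), params, atoms)
mean = weights @ support               # mean of the predicted measure
```

The output is a probability measure rather than a point: `weights` sums to one, and `support` lists the locations carrying that mass. A finitely supported measure always has a finite first moment, so it lies in the 1-Wasserstein space on the output domain.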
