Unifilar Machines and the Adjoint Structure of Bayesian Filtering
Abstract
We elucidate the mathematical structure of Bayesian filtering, and of Bayesian inference more broadly, by applying recent work on category-theoretic probability, specifically the concept of a strongly representable Markov category. We show that filtering, along with related concepts such as conjugate priors, arises from an adjunction: the process of taking a hidden Markov process is right adjoint to a forgetful functor. This has an interesting consequence. In practice, filtering is usually implemented using parametrised families of distributions. The Kalman filter, which uses Gaussians, is a particularly important example. Rather than calculating a new posterior each time, the implementation only needs to update the parameters. This structure arises naturally from our adjunction; the correctness of such a model is witnessed by a map from the model into the system being modelled. Conjugate priors arise from this construction as a special case. In showing this we define a notion of unifilar machine, which has its origins in the literature on epsilon-machines. Unifilar machines are useful as models of the "observable behaviour" of stochastic systems; we show additionally that in the Kleisli category of the distribution monad there is a terminal unifilar machine, and its elements are controlled stochastic processes, mapping sequences of the input alphabet probabilistically to sequences of the output alphabet.
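As an informal illustration of "only updating the parameters", the sketch below uses a Beta-Bernoulli conjugate pair. It is not taken from the paper, and it does not reproduce the categorical construction. The state is the parameter pair (a, b), each state gives a predictive distribution over the next symbol, and the next state is a deterministic function of the current state and the observed symbol, which is the sense of "unifilar" assumed here. The names emit_prob, update and run_filter are hypothetical.

```python
from fractions import Fraction

def emit_prob(state, symbol):
    """Predictive probability of the next symbol (0 or 1) under a Beta(a, b) belief."""
    a, b = state
    p_one = Fraction(a, a + b)
    return p_one if symbol == 1 else 1 - p_one

def update(state, symbol):
    """Next state is fixed by the current state and the observed symbol (assumed unifilar property)."""
    a, b = state
    return (a + 1, b) if symbol == 1 else (a, b + 1)

def run_filter(state, observations):
    """Fold the observations through the machine, tracking the probability of the sequence."""
    likelihood = Fraction(1)
    for y in observations:
        likelihood *= emit_prob(state, y)  # probability of y given the current parameters
        state = update(state, y)           # parameter update replaces a full posterior recomputation
    return state, likelihood

# Start from the uniform prior Beta(1, 1) and observe four coin flips.
final_state, likelihood = run_filter((1, 1), [1, 1, 0, 1])
print(final_state, likelihood)
```

Here the parameters stand in for the full posterior over the coin's bias: the product of the predictive probabilities equals the marginal probability of the observed sequence under the prior, so the parameter-level machine agrees with exact Bayesian filtering on this example.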