On the Posterior Distribution of a Random Process Conditioned on Empirical Frequencies of a Finite Path: the i.i.d. and finite Markov chain cases

Abstract

We obtain the posterior distribution of a random process conditioned on observing the empirical frequencies of a finite sample path. We find that, under a rather broad assumption on the "dependence structure" of the process (e.g., independence or the Markov property), the posterior marginal distribution of the process at a given time index can be identified as a certain empirical distribution computed from the observed empirical frequencies of the sample path. We show that, for both a discrete-valued i.i.d. sequence and a finite Markov chain, a certain "conditional symmetry" induced by the observation of the empirical frequencies leads to the desired result on the posterior distribution. The results for finite-time observations and for their asymptotic infinite-time limit are connected via the idea of Gibbs conditioning. Finally, since our results demonstrate the central role of the empirical frequency in understanding the information content of data, we use the Large Deviations Principle (LDP) to construct a general notion of "data-driven entropy", to which one can apply a formalism from the recent study of statistical thermodynamics.
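
As an illustration of the i.i.d. claim only, the following is a minimal brute-force sketch, not the paper's method. It enumerates every path of a short i.i.d. sequence, keeps those matching the observed symbol counts, and computes the posterior marginal at one time index. The function name `posterior_marginal`, the three-symbol alphabet, and the probabilities are hypothetical choices made for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def posterior_marginal(p, n, counts, k):
    """P(X_k = a | observed symbol counts) for an i.i.d. sequence of length n.

    p      : dict mapping each symbol to its probability (assumed example input)
    counts : dict mapping each symbol to its observed count (must sum to n)
    k      : 0-based time index
    """
    symbols = sorted(p)
    # Drop zero counts so the comparison with Counter(path) is exact.
    target = Counter({s: c for s, c in counts.items() if c > 0})
    weight = {a: Fraction(0) for a in symbols}
    total = Fraction(0)
    for path in product(symbols, repeat=n):
        if Counter(path) != target:
            continue  # path is inconsistent with the observed frequencies
        prob = Fraction(1)
        for x in path:
            prob *= p[x]
        weight[path[k]] += prob
        total += prob
    return {a: weight[a] / total for a in symbols}


# Hypothetical example: 5 draws, observed counts a=3, b=1, c=1.
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
counts = {"a": 3, "b": 1, "c": 1}
post = posterior_marginal(p, 5, counts, k=2)
print(post)  # posterior marginal equals the empirical frequencies 3/5, 1/5, 1/5
```

In this example every path consistent with the counts has the same probability, so the posterior marginal does not depend on `p` or on `k`; it reduces to the observed empirical frequencies, which is the i.i.d. instance of the "conditional symmetry" described above.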
