The Geometry of Statistical Data and Information: A Large Deviation Perspective

Abstract

The manifold of empirical mean values of statistical data ad infinitum has a geometric shape that depends on the probability measure that governs the generating model. Large deviation theory produces entropy functions that depend on both the probability measure and the statistical data; we use entropy to study the geometry of the data space rather than that of the space of probability distributions. It is well known, since Rao's work, that the Fisher-Rao metric makes the probability simplex into a sphere. From our perspective, that result translates to the space of empirical singleton counting frequencies under an i.i.d. assumption. Following our ideas and going beyond i.i.d., the choice of measure curves the space. When we study the pairwise statistics, the spherical geometry breaks down entirely. We show that the information projection, defined in information geometry as divergence minimization, coincides with the information projection in Kolmogorov's probability theory. This identification holds under both i.i.d. and Markovian assumptions and connects information geometry to the foundations of probability theory.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…