A Data-Transparent Probabilistic Model of Temporal Propositional Abstraction
Abstract
Standard probabilistic models face fundamental challenges such as data scarcity, a large hypothesis space, and poor data transparency. To address these challenges, we propose a novel probabilistic model of data-driven temporal propositional reasoning. Unlike conventional probabilistic models where data is a product of domain knowledge encoded in the probabilistic model, we explore the reverse direction where domain knowledge is a product of data encoded in the probabilistic model. This more data-driven perspective suggests no distinction between maximum likelihood parameter learning and temporal propositional reasoning. We show that our probabilistic model is equivalent to a highest-order, i.e., full-memory, Markov chain, and our model requires no distinction between hidden and observable variables. We discuss that limits provide a natural and mathematically rigorous way to handle data scarcity, including the zero-frequency problem. We also discuss that a probability distribution over data generated by our probabilistic model helps data transparency by revealing influential data used in predictions. The reproducibility of this theoretical work is fully demonstrated by the included proofs.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.