On information gain, Kullback-Leibler divergence, entropy production and the involution kernel
Abstract
It is well known that the Kullback-Leibler divergence, which extends the concept of Shannon entropy, plays a fundamental role in Information Theory and Machine Learning. Given an a priori probability kernel and a probability π on the measurable space X × Y, we consider an appropriate definition of the entropy of π relative to this kernel, which is based on previous works. Using this concept of entropy we obtain a natural definition of information gain for general measurable spaces, which coincides with the mutual information derived from the K-L divergence in the case where the kernel is identified with a probability on X. This will be used to extend the notions of specific information gain and dynamical entropy production to the model of thermodynamic formalism for symbolic dynamics over a compact alphabet (TFCA model). In this setting, we show that the involution kernel is a natural tool for better understanding some important properties of entropy production.
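For orientation, the classical finite-alphabet quantities that the abstract generalizes can be sketched in a few lines. The block below is only an illustration of the standard K-L divergence and of mutual information written as the K-L divergence between a joint law on X × Y and the product of its marginals. It does not implement the kernel-based entropy or the TFCA model, and the function names `kl_divergence` and `mutual_information` are hypothetical, chosen here for the example.

```python
import math


def kl_divergence(p, q):
    """Classical D_KL(p || q) for two finite distributions (equal-length lists).

    Uses the natural logarithm and the convention 0 * log(0 / q) = 0.
    Returns infinity when p puts mass where q does not.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            if q_i == 0:
                return math.inf
            total += p_i * math.log(p_i / q_i)
    return total


def mutual_information(joint):
    """Mutual information of a joint law on a finite X x Y.

    `joint` is a nested list with joint[x][y] = probability of (x, y).
    Computed as D_KL(joint || marginal_X (x) marginal_Y).
    """
    marginal_x = [sum(row) for row in joint]
    marginal_y = [sum(col) for col in zip(*joint)]
    flat_joint = [value for row in joint for value in row]
    flat_product = [a * b for a in marginal_x for b in marginal_y]
    return kl_divergence(flat_joint, flat_product)


# Illustrative joint laws on a two-point X and a two-point Y (made-up data).
independent = [[0.25, 0.25], [0.25, 0.25]]  # X and Y independent
correlated = [[0.5, 0.0], [0.0, 0.5]]       # Y determined by X
```

In this finite setting the independent joint law gives zero mutual information, while the fully correlated one gives log 2 (in nats), the Shannon entropy of a fair bit.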