Strengthened Information-theoretic Bounds on the Generalization Error
Abstract
The following problem is considered: given a joint distribution P_XY and an event E, bound P_XY(E) in terms of P_X P_Y(E) (where P_X P_Y is the product of the marginals of P_XY) and a measure of dependence of X and Y. Such bounds have direct applications in the analysis of the generalization error of learning algorithms, where E represents a large error event and the measure of dependence controls the degree of overfitting. Herein, bounds are demonstrated using several information-theoretic metrics, in particular: mutual information, lautum information, maximal leakage, and J_∞. The mutual information bound can outperform comparable bounds in the literature by an arbitrarily large factor.
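To make the problem setup concrete, the sketch below computes the three quantities involved for a small discrete example: the probability of E under the joint distribution, the probability of E under the product of the marginals, and the mutual information I(X;Y). The joint table and the event are hypothetical values chosen for illustration. The final check uses the standard data-processing inequality for KL divergence (the binary divergence between the two event probabilities cannot exceed I(X;Y)); it is a textbook relation of the kind studied here, not one of the paper's strengthened bounds.

```python
import math

# Hypothetical joint distribution of (X, Y) on {0,1} x {0,1} (illustration only).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Hypothetical event E: a set of (x, y) outcomes.
event = {(0, 0)}


def marginals(joint):
    """Return the marginal distributions of X and Y as dictionaries."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py


def mutual_information(joint):
    """I(X;Y) in nats, i.e. the KL divergence between joint and product of marginals."""
    px, py = marginals(joint)
    return sum(
        p * math.log(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


def binary_kl(p, q):
    """KL divergence in nats between Bernoulli(p) and Bernoulli(q), with 0 < q < 1."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0:
            total += a * math.log(a / b)
    return total


px, py = marginals(joint)
p_joint = sum(joint[xy] for xy in event)         # probability of E under the joint: 0.4
p_prod = sum(px[x] * py[y] for (x, y) in event)  # probability of E under the product: 0.25
mi = mutual_information(joint)

# Data processing: mapping (X, Y) to the indicator of E cannot increase KL divergence.
assert binary_kl(p_joint, p_prod) <= mi + 1e-12

print(f"P_XY(E) = {p_joint:.3f}, P_X P_Y(E) = {p_prod:.3f}, I(X;Y) = {mi:.4f} nats")
```

In a learning setting, X would play the role of the training sample and Y the algorithm's output, so the product-measure probability corresponds to evaluating a hypothesis on data it did not see.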