Non-asymptotic estimates of the minimal risk in statistical learning

Abstract

In this paper we prove some concentration inequalities for two types of error probabilities in the Empirical Risk Principle (ERP) in statistical learning, which provide a lower bound and an upper bound for the minimal risk (in terms of the minimal empirical risk) with non-asymptotic high confidence. The usual boundedness condition of the empirical risk function is relaxed to the Gaussian or exponential integrability condition. The confidence of the lower bound of the minimal risk is shown to be independent of the number of training parameters and the dimension of the input vectors, allowing one to detect the deficiency of a learning machine efficiently; and the confidence of the upper bound of the minimal risk is proved to be high provided that the sample size n is much greater than the box dimension of the parameter set Θ in the Orlicz metric dψ1 associated with the risk functions. Our work is based on Talagrand's concentration inequalities (the sharp versions by Bousquet and Klein-Rio), transport-entropy inequalities and the recent progress in the theory of empirical processes and statistical learning.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…