On concentration for (regularized) empirical risk minimization
Abstract
Rates of convergence for empirical risk minimizers have been well studied in the literature. In this paper, we aim to provide a complementary set of results, in particular by showing that after normalization, the risk of the empirical minimizer concentrates on a single point. Such results have been established by~chatterjee2014new for constrained estimators in the normal sequence model. We first generalize and sharpen this result to regularized least squares with convex penalties, making use of a "direct" argument based on Borell's theorem. We then study generalizations to other loss functions, including the negative log-likelihood for exponential families combined with a strictly convex regularization penalty. The results in this general setting are based on more "indirect" arguments as well as on concentration inequalities for maxima of empirical processes.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.