Performance of empirical risk minimization in linear aggregation
Abstract
We study conditions under which, given a dictionary \(F=\{f_1,\dots,f_M\}\) and an i.i.d. sample \((X_i,Y_i)_{i=1}^N\), the empirical minimizer in \(\mathrm{span}(F)\) relative to the squared loss satisfies, with high probability, \[R(f^{\mathrm{ERM}})\leq \inf_{f\in \mathrm{span}(F)}R(f)+r_N(M),\] where \(R(\cdot)\) is the squared risk and \(r_N(M)\) is of the order of \(M/N\). Among other results, we prove that a uniform small-ball estimate for functions in \(\mathrm{span}(F)\) is enough to achieve that goal when the noise is independent of the design.
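For orientation, the objects in the bound can be written out under the usual definitions. These are assumed here, since the abstract does not spell them out: the squared risk is the expected squared error on an independent copy \((X,Y)\) of the sample points, and the empirical minimizer is any function in the span that minimizes the average squared error over the sample. \[R(f)=\mathbb{E}\,\bigl(f(X)-Y\bigr)^2,\qquad f^{\mathrm{ERM}}\in\operatorname*{argmin}_{f\in \mathrm{span}(F)}\ \frac{1}{N}\sum_{i=1}^{N}\bigl(f(X_i)-Y_i\bigr)^2.\] Writing \(f=\sum_{j=1}^{M}\beta_j f_j\), the minimization is an ordinary least-squares problem in the coefficient vector \(\beta\in\mathbb{R}^M\), which is why the residual term \(r_N(M)\) is naturally compared with \(M/N\).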