Learning from MOM's principles: Le Cam's approach
Abstract
We obtain estimation error rates for estimators obtained by aggregation of regularized median-of-means tests, following a construction of Le Cam. The results hold with exponentially large probability (as in the Gaussian framework with independent noise) under only weak moment assumptions on the data and without assuming independence between noise and design. Any norm may be used for regularization. When it has some sparsity-inducing power, we recover sparse rates of convergence. The procedure is robust since a large part of the data may be corrupted, and these outliers may have nothing to do with the oracle we want to reconstruct. Our general risk bound is of order
\[
\max\left(\text{minimax rate in the i.i.d. setup},\ \frac{\text{number of outliers}}{\text{number of observations}}\right).
\]
In particular, the number of outliers may be as large as (number of data) $\times$ (minimax rate) without affecting this rate. The other data do not have to be identically distributed but should only have equivalent $L^1$ and $L^2$ moments. For example, the minimax rate $s \log(ed/s)/N$ of recovery of an $s$-sparse vector in $\mathbb{R}^d$ is achieved with exponentially large probability by a median-of-means version of the LASSO when the noise has $q_0$ moments for some $q_0 > 2$, the entries of the design matrix have $C_0 \log(ed)$ moments, and the dataset is corrupted by up to $C_1 s \log(ed/s)$ outliers.
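The robustness to outliers described above rests on the median-of-means principle. As a minimal illustration of that principle only (this is not the regularized test-aggregation estimator of the paper), the Python sketch below estimates a mean by splitting the sample into blocks, averaging within each block, and taking the median of the block means. The function name `median_of_means`, the parameter `n_blocks`, and the synthetic data are assumptions made for this example. The number of blocks is chosen larger than twice the number of outliers, so that more than half of the blocks are outlier-free.

```python
import random
import statistics


def median_of_means(values, n_blocks):
    """Median of block means over n_blocks equal-size blocks.

    Any trailing remainder that does not fill a block is dropped.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    block_size = len(values) // n_blocks
    block_means = [
        statistics.fmean(values[k * block_size:(k + 1) * block_size])
        for k in range(n_blocks)
    ]
    return statistics.median(block_means)


# Hypothetical data: 990 inliers with mean 1, plus 10 arbitrary outliers.
rng = random.Random(0)
inliers = [rng.gauss(1.0, 1.0) for _ in range(990)]
corrupted = inliers + [1e6] * 10
rng.shuffle(corrupted)

# 50 blocks > 2 * 10 outliers, so at least 40 blocks contain no outlier.
mom_estimate = median_of_means(corrupted, n_blocks=50)
empirical_mean = statistics.fmean(corrupted)

print("median of means:", mom_estimate)
print("empirical mean: ", empirical_mean)
```

The empirical mean is dragged far from 1 by the ten corrupted points, while the median of the block means stays close to 1 because the contaminated blocks are in the minority.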