Sharp Variable Selection of a Sparse Submatrix in a High-Dimensional Noisy Matrix
Abstract
We observe an N × M matrix of independent Gaussian random variables with common variance, which are centered except for the elements of some submatrix of size n × m, where the mean is larger than some a > 0. The submatrix is sparse in the sense that n/N and m/M tend to 0, whereas n, m, N and M tend to infinity. We consider the problem of selecting the random variables with significantly large mean values. We give sufficient conditions on a as a function of n, m, N and M, and construct a uniformly consistent procedure that performs sharp variable selection. We also prove minimax lower bounds under necessary conditions that are complementary to the previous conditions. The critical values a* separating the necessary and sufficient conditions are sharp (we show exact constants). We note a gap between the critical values a* for selecting the variables and those for detecting that such a submatrix exists, given by Butucea and Ingster (2012). When a* is in this gap, consistent detection is possible but no consistent selector of the corresponding variables can be found.
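To make the observation model concrete, the following is a minimal simulation sketch, not the procedure constructed in the paper. It assumes unit noise variance and sets every submatrix mean exactly equal to a (the abstract only requires means larger than a). The function names `simulate` and `select_by_sums`, and the row/column-sum selector itself, are hypothetical illustrations of a naive baseline.

```python
import random


def simulate(N, M, n, m, a, seed=0):
    """Draw an N x M Gaussian matrix with an n x m elevated-mean submatrix.

    Assumptions (not from the abstract): unit variance, and all
    submatrix means equal to a rather than merely larger than a.
    """
    rng = random.Random(seed)
    rows = set(rng.sample(range(N), n))  # hidden row indices
    cols = set(rng.sample(range(M), m))  # hidden column indices
    Y = [
        [rng.gauss(0.0, 1.0) + (a if i in rows and j in cols else 0.0)
         for j in range(M)]
        for i in range(N)
    ]
    return Y, rows, cols


def select_by_sums(Y, n, m):
    """Naive baseline: keep the n rows and m columns with the largest sums."""
    N, M = len(Y), len(Y[0])
    row_sums = [sum(Y[i]) for i in range(N)]
    col_sums = [sum(Y[i][j] for i in range(N)) for j in range(M)]
    top_rows = set(sorted(range(N), key=lambda i: row_sums[i], reverse=True)[:n])
    top_cols = set(sorted(range(M), key=lambda j: col_sums[j], reverse=True)[:m])
    return top_rows, top_cols


if __name__ == "__main__":
    Y, rows, cols = simulate(N=60, M=60, n=6, m=6, a=10.0, seed=1)
    est_rows, est_cols = select_by_sums(Y, n=6, m=6)
    print("rows recovered:", est_rows == rows)
    print("cols recovered:", est_cols == cols)
```

With a this large relative to the noise, the baseline separates the submatrix easily; lowering a makes the row and column sums overlap, which is the regime where the sharp thresholds discussed above become relevant.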