Nonasymptotic one- and two-sample tests in high dimension with unknown covariance structure
Abstract
Let $X = (X_i)_{1 \le i \le n}$ be an i.i.d. sample of square-integrable variables in $\mathbb{R}^d$, with common expectation $\mu$ and covariance matrix $\Sigma$, both unknown. We consider the problem of testing if $\mu$ is $\eta$-close to zero, i.e. $\|\mu\| \le \eta$ against $\|\mu\| \ge \eta + \delta$; we also tackle the more general two-sample mean closeness (also known as relevant difference) testing problem. The aim of this paper is to obtain nonasymptotic upper and lower bounds on the minimal separation distance $\delta$ such that we can control both the Type I and Type II errors at a given level. The main technical tools are concentration inequalities, first for a suitable estimator of $\|\mu\|^2$ used as a test statistic, and secondly for estimating the operator and Frobenius norms of $\Sigma$ coming into the quantiles of said test statistic. These properties are obtained for Gaussian and bounded distributions. Particular attention is given to the dependence on the pseudo-dimension $d_*$ of the distribution, defined as $d_* := \|\Sigma\|_2^2/\|\Sigma\|_\infty^2$. In particular, for $\eta = 0$, the minimum separation distance is $\Theta(d_*^{1/4}\sqrt{\|\Sigma\|_\infty/n})$, in contrast with the minimax estimation distance for $\mu$, which is $\Theta(d_e^{1/2}\sqrt{\|\Sigma\|_\infty/n})$ (where $d_e := \|\Sigma\|_1/\|\Sigma\|_\infty$). This generalizes a phenomenon spelled out in particular by Baraud (2002).
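To make the two pseudo-dimensions concrete, the following minimal sketch computes them from the eigenvalues of a covariance matrix and compares the orders of the two rates for $\eta = 0$. It assumes that the norms of the covariance matrix are Schatten norms (index 2 for Frobenius, $\infty$ for operator, 1 for trace), which matches the abstract's mention of operator and Frobenius norms but is not stated there explicitly. The function names `pseudo_dimensions` and `rate_orders` are hypothetical, and all constants hidden in the rates are ignored.

```python
import math


def pseudo_dimensions(eigenvalues):
    """Return (d_star, d_e) from the eigenvalues of a covariance matrix.

    Assumption: the norms are Schatten norms, so for a positive
    semi-definite matrix with eigenvalues lam_i:
      Frobenius norm squared = sum(lam_i ** 2)
      operator norm          = max(lam_i)
      trace norm             = sum(lam_i)
    """
    lam = [float(v) for v in eigenvalues]
    if not lam or min(lam) < 0 or max(lam) == 0:
        raise ValueError("need nonnegative eigenvalues, not all zero")
    op_norm = max(lam)
    frobenius_sq = sum(v * v for v in lam)
    trace_norm = sum(lam)
    d_star = frobenius_sq / op_norm ** 2
    d_e = trace_norm / op_norm
    return d_star, d_e


def rate_orders(eigenvalues, n):
    """Orders (constants dropped) of the testing and estimation distances."""
    d_star, d_e = pseudo_dimensions(eigenvalues)
    scale = math.sqrt(max(eigenvalues) / n)
    testing = d_star ** 0.25 * scale     # d_*^{1/4} * sqrt(op_norm / n)
    estimation = math.sqrt(d_e) * scale  # d_e^{1/2} * sqrt(op_norm / n)
    return testing, estimation


# Isotropic case in dimension 16: both pseudo-dimensions equal d.
print(pseudo_dimensions([1.0] * 16))
# One dominant direction: both pseudo-dimensions drop well below d = 5.
print(pseudo_dimensions([4.0, 1.0, 1.0, 1.0, 1.0]))
print(rate_orders([1.0] * 16, n=100))
```

In the isotropic case both pseudo-dimensions reduce to the ambient dimension $d$, so the testing order scales as $d^{1/4}/\sqrt{n}$ while the estimation order scales as $d^{1/2}/\sqrt{n}$.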