Minimax Estimation of Kernel Mean Embeddings
Abstract
In this paper, we study the minimax estimation of the Bochner integral $\mu_k(P) := \int_X k(\cdot,x)\,dP(x)$, also called the kernel mean embedding, based on random samples drawn i.i.d.~from $P$, where $k: X \times X \to \mathbb{R}$ is a positive definite kernel. Various estimators $\theta_n$ of $\mu_k(P)$, including the empirical estimator, have been studied in the literature, all of which satisfy $\|\theta_n - \mu_k(P)\|_{H_k} = O_P(n^{-1/2})$, with $H_k$ being the reproducing kernel Hilbert space induced by $k$. The main contribution of the paper is in showing that this rate of $n^{-1/2}$ is minimax in the $\|\cdot\|_{H_k}$ and $\|\cdot\|_{L^2(\mathbb{R}^d)}$ norms over the class of discrete measures and over the class of measures that have an infinitely differentiable density, with $k$ being a continuous translation-invariant kernel on $\mathbb{R}^d$. The interesting aspect of this result is that the minimax rate is independent of the smoothness of the kernel and of the density of $P$ (if it exists). This result has practical consequences in statistical applications, as the mean embedding has been widely employed in non-parametric hypothesis testing, density estimation, causal inference and feature selection, through its relation to energy distance (and distance covariance).
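As a rough illustration of the $n^{-1/2}$ rate for the empirical estimator $\theta_n = \frac{1}{n}\sum_{i=1}^n k(\cdot, X_i)$, the following minimal sketch computes $\|\theta_n - \mu_k(P)\|_{H_k}$ exactly in one dimension. The Gaussian kernel, the bandwidth `SIGMA`, the choice $P = N(0,1)$, and all function names are assumptions made for this example and are not taken from the paper. Under these assumptions the embedding has the closed form $\mu_k(P)(y) = \sqrt{\sigma^2/(\sigma^2+1)}\,\exp(-y^2/(2(\sigma^2+1)))$ and $\|\mu_k(P)\|_{H_k}^2 = \sigma/\sqrt{\sigma^2+2}$.

```python
import math
import random

SIGMA = 1.0  # kernel bandwidth; an illustrative choice, not from the paper


def gauss_kernel(x, y, sigma=SIGMA):
    """Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 sigma^2)) on the real line."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def true_embedding(y, sigma=SIGMA):
    """Closed form of mu_k(P)(y) when P = N(0, 1) and k is the kernel above."""
    s2 = sigma ** 2
    return math.sqrt(s2 / (s2 + 1.0)) * math.exp(-(y ** 2) / (2.0 * (s2 + 1.0)))


def rkhs_error(sample, sigma=SIGMA):
    """RKHS norm of (empirical embedding - true embedding) for P = N(0, 1).

    Expands the squared norm with the reproducing property:
    mean of the Gram matrix - 2 * mean of mu_k(P)(X_i) + ||mu_k(P)||^2.
    """
    n = len(sample)
    gram_mean = sum(gauss_kernel(a, b, sigma) for a in sample for b in sample) / n ** 2
    cross_mean = sum(true_embedding(x, sigma) for x in sample) / n
    norm_sq = sigma / math.sqrt(sigma ** 2 + 2.0)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(gram_mean - 2.0 * cross_mean + norm_sq, 0.0))


def mean_error(n, trials=10, seed=0):
    """Average RKHS error over several independent samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += rkhs_error(sample)
    return total / trials


if __name__ == "__main__":
    for n in (25, 100, 400):
        err = mean_error(n)
        # If the error decays like n^{-1/2}, the rescaled column stays roughly flat.
        print(f"n={n:4d}  error={err:.4f}  sqrt(n)*error={math.sqrt(n) * err:.3f}")
```

In this setting the expected squared error equals $(1 - \sigma/\sqrt{\sigma^2+2})/n$, so the rescaled column should hover around a constant. This only illustrates the upper bound for one estimator and one distribution; the minimax statement in the abstract concerns the matching lower bound over whole classes of measures, which a simulation cannot demonstrate.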