Kernel Two-Sample Testing via Directional Components Analysis
Abstract
Standard kernel two-sample tests, such as those based on the Maximum Mean Discrepancy (MMD), aggregate squared differences across all directions in a Reproducing Kernel Hilbert Space (RKHS). However, in finite samples, trailing directional components are noisy, which degrades test power. We propose a novel kernel-based test that resolves this by truncating the spectral decomposition of the MMD, retaining only the well-estimated leading eigen-directions. By aggregating these robust components, our method achieves superior power and robustness, particularly in high-dimensional and unbalanced settings. Furthermore, we introduce a computationally efficient parametric bootstrap procedure for approximating critical values, which is theoretically justified and significantly faster than permutation-based alternatives. Extensive simulations and empirical studies demonstrate that our method maintains strict Type I error control while delivering higher power than existing MMD-based tests.
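To make the truncation idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes an RBF kernel, takes the "directions" to be the kernel PCA eigen-directions of the pooled, centered Gram matrix, and sums the squared projections of the empirical mean-embedding difference onto the leading `n_components` of them. The abstract does not specify the basis, the weighting of components, or the details of the parametric bootstrap, so the function names `rbf_gram` and `truncated_mmd`, the `bandwidth` parameter, and the unweighted sum are all illustrative assumptions. No calibration of critical values is attempted here.

```python
import numpy as np


def rbf_gram(Z, bandwidth):
    """Gram matrix of the Gaussian (RBF) kernel on the rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def truncated_mmd(X, Y, n_components, bandwidth=1.0):
    """Squared mean-embedding difference restricted to leading directions.

    Assumption (not from the paper): directions are the kernel PCA
    eigen-directions of the pooled sample, and components are summed
    without reweighting.
    """
    n, m = len(X), len(Y)
    N = n + m
    Z = np.vstack([X, Y])

    # Center the pooled Gram matrix in feature space.
    K = rbf_gram(Z, bandwidth)
    H = np.eye(N) - np.ones((N, N)) / N
    Kc = H @ K @ H

    # Eigendecomposition; keep the n_components largest eigenvalues.
    evals, evecs = np.linalg.eigh(Kc)
    order = np.argsort(evals)[::-1][:n_components]
    evals = np.maximum(evals[order], 0.0)  # clip round-off negatives
    evecs = evecs[:, order]

    # Weights that express mean(X) - mean(Y) in feature space.
    w = np.concatenate([np.full(n, 1.0 / n), np.full(m, -1.0 / m)])

    # Projection onto the j-th unit-norm direction is sqrt(lambda_j) * w^T v_j.
    proj = np.sqrt(evals) * (w @ evecs)
    return float(np.sum(proj ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 5))
    Y = rng.normal(loc=0.5, size=(60, 5))  # unbalanced samples, shifted mean
    for k in (1, 5, 20, 100):
        print(k, truncated_mmd(X, Y, n_components=k))
```

Under these assumptions, keeping every component recovers the usual biased (V-statistic) estimate of the squared MMD, so the truncated statistic is a partial sum of the same nonnegative terms. How many components to keep, and how to obtain critical values, are the parts the paper itself addresses.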