An Efficient and Distribution-Free Two-Sample Test Based on Energy Statistics and Random Projections
Abstract
A common drawback of existing distribution-free two-sample tests is their high computational cost. Specifically, for a sample of size N, the computational complexity of those tests is at least O(N^2). In this paper, we develop an efficient algorithm with complexity O(N log N) for computing energy statistics in the univariate case. For the multivariate case, we introduce a two-sample test based on energy statistics and random projections, which has computational complexity O(K N log N), where K is the number of random projections. We name our multivariate method Randomly Projected Energy Statistics (RPES). We show, both theoretically and empirically, that RPES achieves nearly the same test power as energy statistics. Numerical experiments also demonstrate the efficiency of the proposed method relative to its competitors.
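To make the complexity claims concrete, here is a minimal Python sketch of the two ideas in the abstract. It is not the authors' implementation. It assumes the V-statistic form of the energy distance, 2 E|X - Y| - E|X - X'| - E|Y - Y'|, and uses a standard sorting identity to evaluate the univariate pairwise sums in O(N log N) time. The function names (`energy_1d`, `rpes`), the uniform sampling of directions on the unit sphere, and the plain average over projections are all illustrative assumptions. Any dimension-dependent rescaling constant and the procedure for turning the statistic into a test decision are omitted.

```python
import numpy as np


def _pairwise_abs_sum(v):
    """Sum of |v_i - v_j| over unordered pairs i < j.

    After sorting, the k-th smallest value (0-indexed) appears with a plus
    sign k times and a minus sign (n - 1 - k) times, so the whole sum is a
    single dot product. The cost is dominated by the sort: O(n log n).
    """
    s = np.sort(v)
    n = s.size
    weights = 2 * np.arange(n) - n + 1
    return float(np.dot(weights, s))


def energy_1d(x, y):
    """Univariate energy statistic (V-statistic form), computed by sorting."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    n, m = x.size, y.size
    s_x = _pairwise_abs_sum(x)  # within-sample sum for x
    s_y = _pairwise_abs_sum(y)  # within-sample sum for y
    # Pairs in the pooled sample are either within x, within y, or across,
    # so the cross-sample sum follows by subtraction.
    s_xy = _pairwise_abs_sum(np.concatenate([x, y])) - s_x - s_y
    # Within-sample means run over ordered pairs, hence the factor 2.
    return 2.0 * s_xy / (n * m) - 2.0 * s_x / n**2 - 2.0 * s_y / m**2


def rpes(X, Y, num_projections=50, seed=0):
    """Average the univariate statistic over random 1-D projections.

    Hypothetical sketch: directions are drawn uniformly on the unit sphere
    by normalizing Gaussian vectors; rescaling constants are omitted.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    d = X.shape[1]
    U = rng.standard_normal((num_projections, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    # K projections, each costing O(N log N): O(K N log N) overall.
    return float(np.mean([energy_1d(X @ u, Y @ u) for u in U]))
```

For example, `energy_1d([0.0, 1.0], [2.0])` evaluates 2 * (3 / 2) - 2 / 4 - 0 = 2.5, which matches the brute-force O(N^2) computation over all pairs.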