Detecting Correlated Gaussian Databases

Abstract

This paper considers the problem of detecting whether two databases, each consisting of n users with d Gaussian features, are correlated. Under the null hypothesis, the databases are independent. Under the alternative hypothesis, the features are correlated across databases, up to an unknown row permutation. A simple test is developed to show that detection is achievable above $\rho^2 \approx 1/d$, where $\rho$ is the correlation coefficient. For the converse, the truncated second moment method is used to establish that detection is impossible below roughly $\rho^2 \approx 1/(d\sqrt{n})$. These results are compared to the corresponding recovery problem, where the goal is to decode the row permutation, and a converse bound of roughly $\rho^2 \approx 1 - n^{-4/d}$ has been previously shown. For certain choices of parameters, the detection achievability bound outperforms this recovery converse bound, demonstrating that detection can be easier than recovery in this scenario.
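
To make the two hypotheses concrete, the following is a minimal simulation sketch, not the paper's own test. It assumes i.i.d. standard Gaussian entries and an entrywise correlation parameter `rho` between matched rows; the function names `sample_databases` and `column_sum_statistic` are hypothetical. The statistic multiplies the column sums of the two databases, so it does not depend on the hidden row permutation. Under these assumptions its mean is 0 under the null and `rho * d` under the alternative, with standard deviation on the order of sqrt(d), which is consistent with detection becoming feasible once the squared correlation is well above 1/d.

```python
import math
import random


def sample_databases(n, d, rho, correlated, rng):
    """Draw two n-by-d databases with i.i.d. standard Gaussian entries.

    Assumed model (illustrative): if `correlated` is True, every entry of Y
    is rho-correlated with the matching entry of X, and the rows of Y are
    then shuffled by a hidden permutation. Otherwise X and Y are independent.
    """
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    if not correlated:
        Y = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
        return X, Y
    s = math.sqrt(1.0 - rho * rho)
    Y = [[rho * x + s * rng.gauss(0.0, 1.0) for x in row] for row in X]
    rng.shuffle(Y)  # hidden row permutation, unknown to the detector
    return X, Y


def column_sum_statistic(X, Y):
    """Sum over features of (column sum of X) * (column sum of Y), divided by n.

    Column sums are unchanged by reordering rows, so the statistic is
    permutation-invariant.
    """
    n = len(X)
    d = len(X[0])
    total = 0.0
    for j in range(d):
        sx = sum(row[j] for row in X)
        sy = sum(row[j] for row in Y)
        total += sx * sy
    return total / n


def average_statistic(n, d, rho, correlated, trials, seed):
    """Average the statistic over independent draws (hypothetical helper)."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        X, Y = sample_databases(n, d, rho, correlated, rng)
        values.append(column_sum_statistic(X, Y))
    return sum(values) / trials
```

A detector built on this sketch would compare the statistic to a threshold between 0 and `rho * d`; the choice of threshold, and whether this matches the test analyzed in the paper, is left open here.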