A filtering technique for Markov chains with applications to spectral embedding
Abstract
Spectral methods have proven to be a highly effective tool in understanding the intrinsic geometry of a high-dimensional data set $\{x_i\}_{i=1}^{n} \subset \mathbb{R}^d$. The key ingredient is the construction of a Markov chain on the set, where transition probabilities depend on the distance between elements, for example where for every $1 \leq j \leq n$ the probability of going from $x_j$ to $x_i$ is proportional to $p_{ij} \sim \exp\left(-\frac{1}{\varepsilon}\|x_i - x_j\|^2_{\ell^2(\mathbb{R}^d)}\right)$, where $\varepsilon > 0$ is a free parameter. We propose a method which increases the self-consistency of such Markov chains before spectral methods are applied. Instead of directly using a Markov transition matrix $P$, we set $p_{ii} = 0$ and rescale, thereby obtaining a transition matrix $P^*$ modeling a non-lazy random walk. We then create a new transition matrix $Q = (q_{ij})_{i,j=1}^{n}$ by demanding that for fixed $j$ the quantity $q_{ij}$ be proportional to $q_{ij} \sim \min\left((P^*)_{ij}, ((P^*)^2)_{ij}, \dots, ((P^*)^k)_{ij}\right)$, where usually $k = 2$. We consider several classical data sets, show that this simple method can increase the efficiency of spectral methods and prove that it can correct randomly introduced errors in the kernel.
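The construction described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation, and it rests on several assumptions: a Gaussian kernel with bandwidth `eps`, column-stochastic matrices (entry `[i, j]` is the probability of going from x_j to x_i, so each column sums to one), and the entrywise minimum as the rule for combining the powers of P*. The function name `filtered_transition_matrix` and its parameters are hypothetical.

```python
import numpy as np


def filtered_transition_matrix(X, eps=1.0, k=2):
    """Return (P_star, Q) for data points X of shape (n, d).

    Assumptions (not taken from a reference implementation):
    Gaussian kernel, column-stochastic convention, entrywise minimum.
    """
    X = np.asarray(X, dtype=float)

    # Pairwise squared Euclidean distances ||x_i - x_j||^2.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Gaussian kernel weights exp(-||x_i - x_j||^2 / eps).
    K = np.exp(-sq_dist / eps)

    # Non-lazy walk: remove self-transitions, then rescale each column.
    # Zeroing the kernel diagonal before normalizing gives the same result
    # as setting p_ii = 0 in P and renormalizing.
    np.fill_diagonal(K, 0.0)
    P_star = K / K.sum(axis=0, keepdims=True)

    # Entrywise minimum over P*, (P*)^2, ..., (P*)^k.
    M = P_star.copy()
    power = P_star.copy()
    for _ in range(k - 1):
        power = power @ P_star
        M = np.minimum(M, power)

    # Rescale so that, for fixed j, the entries q_ij sum to one.
    col_sums = M.sum(axis=0, keepdims=True)
    if np.any(col_sums == 0.0):
        raise ValueError("a column of the filtered matrix is identically zero")
    Q = M / col_sums
    return P_star, Q


# Usage on a small random point cloud (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
P_star, Q = filtered_transition_matrix(X, eps=1.0, k=2)
```

With k = 2, an entry of Q stays large only when the one-step transition and the two-step transition between the same pair of points are both likely, which is one way to read the "self-consistency" the abstract refers to. The dense pairwise-distance array is fine for small n; a larger data set would need a sparse or nearest-neighbor variant.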