Diffusion maps - A probabilistic interpretation for spectral embedding and clustering algorithms


Abstract

Spectral embedding and spectral clustering are common methods for non-linear dimensionality reduction and clustering of complex high-dimensional datasets. In this paper we provide a diffusion-based probabilistic analysis of algorithms that use the normalized graph Laplacian. Given the pairwise adjacency matrix of all points in a dataset, we define a random walk on the graph of points and a diffusion distance between any two points. We show that the diffusion distance is equal to the Euclidean distance in the embedded space constructed from all eigenvectors of the normalized graph Laplacian. This identity shows that the characteristic relaxation times and processes of the random walk on the graph are the key concepts that govern the properties of these spectral clustering and spectral embedding algorithms. Specifically, for spectral clustering to succeed, a necessary condition is that the mean exit time from each cluster be significantly larger than the largest (slowest) of the relaxation times inside the individual clusters. For complex, multiscale data, this condition may not hold, and multiscale methods need to be developed to handle such situations.
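
The identity between the diffusion distance and the Euclidean distance in the embedded space can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes a Gaussian kernel with bandwidth `eps` as the adjacency matrix, a diffusion distance weighted by the inverse of the stationary distribution, and right eigenvectors normalized with respect to that distribution. The function names (`gaussian_kernel`, `diffusion_map`, `diffusion_distance`) and the random test data are hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_kernel(X, eps):
    # Assumed adjacency: W_ij = exp(-||x_i - x_j||^2 / eps)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / eps)

def diffusion_map(W, t):
    d = W.sum(axis=1)                      # node degrees
    M = W / d[:, None]                     # random-walk (Markov) matrix D^{-1} W
    pi = d / d.sum()                       # stationary distribution of the walk
    S = W / np.sqrt(np.outer(d, d))        # symmetric conjugate D^{-1/2} W D^{-1/2}
    lam, V = np.linalg.eigh(S)             # same eigenvalues as M
    order = np.argsort(-lam)
    lam, V = lam[order], V[:, order]
    # Right eigenvectors of M, normalized so that sum_x psi_j(x)^2 pi(x) = 1
    psi = np.sqrt(d.sum()) * V / np.sqrt(d)[:, None]
    # Embedding with all eigenvectors, each scaled by lambda_j^t
    return M, pi, (lam ** t) * psi

def diffusion_distance(M, pi, t, i, j):
    # Weighted L2 distance between the t-step transition probabilities
    # of walks started at nodes i and j (weight 1 / pi is an assumption)
    P = np.linalg.matrix_power(M, t)
    return np.sqrt(np.sum((P[i] - P[j]) ** 2 / pi))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))               # illustrative random data
W = gaussian_kernel(X, eps=1.0)
M, pi, Psi = diffusion_map(W, t=3)

d_walk = diffusion_distance(M, pi, 3, 0, 1)
d_embed = np.linalg.norm(Psi[0] - Psi[1])
print(d_walk, d_embed)
```

The two printed values should agree up to floating-point error. The first eigenvector is constant across points, so it contributes nothing to the distance and can be dropped from the embedding without changing the result.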

Cite

Nadler, B., Lafon, S., Coifman, R., & Kevrekidis, I. G. (2008). Diffusion maps - A probabilistic interpretation for spectral embedding and clustering algorithms. In Lecture Notes in Computational Science and Engineering (Vol. 58, pp. 238–260). https://doi.org/10.1007/978-3-540-73750-6_10
