Heavy-Tailed Kernels Reveal a Finer Cluster Structure in t-SNE Visualisations

Dmitry Kobak; George Linderman; Stefan Steinerberger; Yuval Kluger; Philipp Berens

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 11906 LNAI 124-139

DOI: 10.1007/978-3-030-46150-8_8

Abstract

T-distributed stochastic neighbour embedding (t-SNE) is a widely used data visualisation technique. It differs from its predecessor SNE by the low-dimensional similarity kernel: the Gaussian kernel was replaced by the heavy-tailed Cauchy kernel, solving the ‘crowding problem’ of SNE. Here, we develop an efficient implementation of t-SNE for a t-distribution kernel with an arbitrary degree of freedom ν, with ν→∞ corresponding to SNE and ν=1 corresponding to the standard t-SNE. Using theoretical analysis and toy examples, we show that ν<1 can further reduce the crowding problem and reveal finer cluster structure that is invisible in standard t-SNE. We further demonstrate the striking effect of heavier-tailed kernels on large real-life data sets such as MNIST, single-cell RNA-sequencing data, and the HathiTrust library. We use domain knowledge to confirm that the revealed clusters are meaningful. Overall, we argue that modifying the tail heaviness of the t-SNE kernel can yield additional insight into the cluster structure of the data.
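The role of ν in the kernel can be illustrated with a short numerical sketch. It assumes the textbook unnormalised Student-t form (1 + d²/ν)^(−(ν+1)/2); the exact parametrisation and scaling used in the paper may differ, and the function name `t_kernel` is hypothetical. With this form, ν=1 gives the Cauchy kernel 1/(1 + d²) of standard t-SNE, ν→∞ approaches a Gaussian, and ν<1 puts more weight on large distances.

```python
import numpy as np

def t_kernel(d, nu):
    """Unnormalised Student-t similarity at distance d with nu degrees of freedom.

    Assumption: textbook t-distribution shape, not necessarily the
    parametrisation used in the paper.
    """
    d = np.asarray(d, dtype=float)
    if np.isinf(nu):
        # Limit nu -> infinity: Gaussian kernel, as in the original SNE
        return np.exp(-d**2 / 2.0)
    return (1.0 + d**2 / nu) ** (-(nu + 1.0) / 2.0)

# Compare tail weights at a few low-dimensional distances
d = np.array([0.0, 1.0, 5.0, 20.0])
gauss = t_kernel(d, np.inf)   # SNE-like kernel
cauchy = t_kernel(d, 1.0)     # standard t-SNE kernel
heavy = t_kernel(d, 0.5)      # heavier tail than Cauchy

for name, k in [("gauss", gauss), ("cauchy", cauchy), ("heavy", heavy)]:
    print(name, k)
```

At large distances the ν=0.5 kernel decays more slowly than the Cauchy kernel, which in turn decays far more slowly than the Gaussian. This is the sense in which smaller ν means a heavier tail.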

Citation (APA)

Kobak, D., Linderman, G., Steinerberger, S., Kluger, Y., & Berens, P. (2020). Heavy-Tailed Kernels Reveal a Finer Cluster Structure in t-SNE Visualisations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11906 LNAI, pp. 124–139). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-46150-8_8