Intrinsic t-Stochastic Neighbor Embedding for visualization and outlier detection: A remedy against the curse of dimensionality?

Abstract

Analyzing high-dimensional data poses many challenges due to the “curse of dimensionality”. Not all high-dimensional data exhibit these characteristics, because many data sets have correlations, which led to the notion of intrinsic dimensionality. Intrinsic dimensionality describes the local behavior of data on a low-dimensional manifold within the higher-dimensional space. We discuss this effect and describe a surprisingly simple modification that allows us to reduce the local intrinsic dimensionality of individual points. While this is unlikely to “cure” all problems associated with high dimensionality, we show the theoretical impact on idealized distributions and how to practically incorporate it into new, more robust algorithms. To demonstrate the effect of this adjustment, we introduce the novel Intrinsic Stochastic Outlier Score (ISOS), and we propose a modification of the popular t-Stochastic Neighbor Embedding (t-SNE) visualization technique for intrinsic dimensionality: intrinsic t-Stochastic Neighbor Embedding (it-SNE).
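To make the notion of local intrinsic dimensionality more concrete, the following minimal sketch estimates it at a single query point from the distances to its nearest neighbors. It uses the widely known maximum-likelihood (Hill-type) estimator, which is an assumption made here for illustration; the abstract does not say which estimator the paper uses, and this is not ISOS or it-SNE. The function name `lid_mle`, the synthetic data, and the parameters `n` and `k` are all hypothetical.

```python
import numpy as np


def lid_mle(knn_dists):
    """Maximum-likelihood (Hill-type) estimate of local intrinsic dimensionality.

    knn_dists: distances from one query point to its k nearest neighbors.
    Assumption: this estimator is chosen for illustration only.
    """
    d = np.sort(np.asarray(knn_dists, dtype=float))
    d = d[d > 0]                       # drop zero distances (duplicates of the query)
    w = d[-1]                          # distance to the farthest of the k neighbors
    s = np.mean(np.log(d / w))         # mean log-ratio, always <= 0
    return np.inf if s == 0 else -1.0 / s


# Illustrative data: points uniform on a 2-D plane, embedded isometrically in 10-D.
rng = np.random.default_rng(0)
n, k = 5000, 100                       # hypothetical sample size and neighborhood size
z = rng.uniform(-1.0, 1.0, size=(n, 2))
q, _ = np.linalg.qr(rng.normal(size=(10, 2)))   # 10x2 matrix with orthonormal columns
x = z @ q.T                            # shape (n, 10), ambient dimension 10

# Query at the origin, which lies on the embedded plane, away from its boundary.
dists = np.linalg.norm(x, axis=1)
estimate = lid_mle(np.sort(dists)[:k])
print(f"ambient dimension: {x.shape[1]}, estimated local intrinsic dimension: {estimate:.2f}")
```

Under these assumptions the estimate should land near 2 rather than 10, which illustrates why correlated high-dimensional data need not suffer the full curse of dimensionality.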

Cite

Schubert, E., & Gertz, M. (2017). Intrinsic t-Stochastic Neighbor Embedding for visualization and outlier detection: A remedy against the curse of dimensionality? In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10609 LNCS, pp. 188–203). Springer Verlag. https://doi.org/10.1007/978-3-319-68474-1_13