Examining intermediate data reduction algorithms for use with t-SNE

Aaron Campbell; Kyle Caudle; Randy C. Hoover

Conference Proceedings

ACM International Conference Proceeding Series (2019) 36-42

DOI: 10.1145/3314545.3314549

Abstract

T-distributed Stochastic Neighbor Embedding (t-SNE) is a flexible, non-parametric method for mapping high-dimensional data onto a two- or three-dimensional subspace for visualization. This paper examines the effects of using different intermediate data reduction algorithms (e.g., Principal Component Analysis, Independent Component Analysis, Linear Discriminant Analysis, Sammon Mapping, and Locally Linear Embedding) to first reduce the data to an intermediate subspace before applying t-SNE. Our research shows that no intermediate step in the visualization process is trivial, and that application-dependent knowledge should be used to ensure the best possible visualization in lower-dimensional spaces. Experimental results are presented for several common data sets; they illustrate that, for clustering applications and for visualizing class separation in multi-class data, each algorithm tested produces a significantly different mapping.
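
The two-stage pipeline described in the abstract (intermediate reduction, then t-SNE) might be sketched as follows. This is only an illustration, not the authors' code: it assumes scikit-learn, uses PCA as the intermediate step, and the digits data set, the 300-sample subset, the 30-dimensional intermediate subspace, and the perplexity value are all assumptions chosen for the example. The helper name `reduce_then_embed` is hypothetical.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def reduce_then_embed(X, intermediate_dim=30, random_state=0):
    """Reduce X to an intermediate subspace with PCA, then embed in 2-D with t-SNE.

    intermediate_dim and the perplexity below are illustrative assumptions,
    not settings taken from the paper.
    """
    # Stage 1: intermediate reduction (PCA here; the paper compares several methods).
    X_mid = PCA(n_components=intermediate_dim, random_state=random_state).fit_transform(X)
    # Stage 2: t-SNE on the intermediate representation, for visualization.
    tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=random_state)
    embedding = tsne.fit_transform(X_mid)
    return X_mid, embedding


# A small subset of the 64-dimensional digits data keeps the run short.
X, y = load_digits(return_X_y=True)
X, y = X[:300], y[:300]

X_mid, embedding = reduce_then_embed(X)
print(X_mid.shape, embedding.shape)  # (300, 30) (300, 2)
```

Swapping the PCA stage for another reducer (for example `FastICA` or `LocallyLinearEmbedding` from scikit-learn) and plotting `embedding` colored by `y` is one way to compare how the intermediate step changes the final map.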

Cite

Campbell, A., Caudle, K., & Hoover, R. C. (2019). Examining intermediate data reduction algorithms for use with t-SNE. In ACM International Conference Proceeding Series (pp. 36–42). Association for Computing Machinery. https://doi.org/10.1145/3314545.3314549
