Visualization of SNPs with t-SNE

Abstract

Background: Single Nucleotide Polymorphisms (SNPs) are one of the largest sources of new data in biology. In most papers, SNPs between individuals are visualized with Principal Component Analysis (PCA), an older method for this purpose.

Principal Findings: We compare PCA with a newer method, t-Distributed Stochastic Neighbor Embedding (t-SNE), for the visualization of large SNP datasets. We also propose a set of key figures for evaluating these visualizations; t-SNE performs better on all of them.

Significance: PCA remains a reasonably good method for transforming data, but for visualization it should be replaced by a method from the subfield of dimension reduction. To evaluate the quality of a visualization, we propose key figures based on cross-validation with machine learning methods, as well as indices of cluster validity. © 2013 Alexander Platzer.
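
The evaluation idea in the abstract, scoring a 2-D visualization by how well a classifier cross-validates on it and by a cluster validity index, can be sketched roughly as follows. This is a minimal illustration and not the paper's procedure: the abstract does not name specific methods, so the leave-one-out 1-nearest-neighbour classifier and the silhouette width are assumptions chosen for brevity. The function names, the population labels, and the coordinates are hypothetical toy values standing in for the output of PCA or t-SNE.

```python
import math
from collections import defaultdict


def loo_1nn_accuracy(points, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, p in enumerate(points):
        # Index of the closest *other* point in the embedding.
        j = min((k for k in range(len(points)) if k != i),
                key=lambda k: math.dist(p, points[k]))
        correct += labels[j] == labels[i]
    return correct / len(points)


def mean_silhouette(points, labels):
    """Mean silhouette width (range -1 to 1); needs at least two labels."""
    groups = defaultdict(list)
    for idx, lab in enumerate(labels):
        groups[lab].append(idx)
    total = 0.0
    for i, p in enumerate(points):
        own = [k for k in groups[labels[i]] if k != i]
        if not own:
            continue  # singleton cluster contributes a silhouette of 0
        # a: mean distance to the point's own cluster.
        a = sum(math.dist(p, points[k]) for k in own) / len(own)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(p, points[k]) for k in members) / len(members)
            for lab, members in groups.items() if lab != labels[i]
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two populations, three individuals each.
labels = ["pop_A"] * 3 + ["pop_B"] * 3
# Embedding in which the populations form separate groups.
separated = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
# Embedding in which the populations are interleaved.
mixed = [(0.0, 0.0), (5.0, 5.0), (1.0, 0.0),
         (0.5, 0.5), (5.0, 6.0), (6.0, 5.0)]

for name, emb in [("separated", separated), ("mixed", mixed)]:
    print(name, loo_1nn_accuracy(emb, labels), mean_silhouette(emb, labels))
```

On both scores, higher is better: an embedding that keeps known populations apart should give a higher cross-validated accuracy and a higher silhouette width than one that mixes them. In practice the two embeddings being compared would come from running PCA and t-SNE on the same SNP matrix.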

Cite

Platzer, A. (2013). Visualization of SNPs with t-SNE. PLoS ONE, 8(2). https://doi.org/10.1371/journal.pone.0056883
