Phasing of 2-SNP genotypes based on non-random mating model

Abstract

Emerging microarray technologies allow genotyping of long genome sequences, resulting in huge amounts of data. A key challenge is to provide accurate phasing of very long single nucleotide polymorphism (SNP) sequences. In this paper we explore phasing of genotypes with 2 SNPs adjusted to the non-random mating model, and then apply it to the haplotype inference of complete genotypes using maximum spanning trees. The runtime of the algorithm is O(nm(n + m)), where n and m are the numbers of genotypes and SNPs, respectively. The proposed phasing algorithm (2SNP) can be used for comparatively accurate phasing of a large number of very long genome sequences. On datasets across 79 regions from HapMap [7], 2SNP is several orders of magnitude faster than GERBIL and PHASE while matching them in quality, measured by the number of correctly phased genotypes and by single-site and switching errors. For example, 2SNP requires 41 s on a 2 GHz Pentium 4 processor to phase 30 genotypes with 1381 SNPs (ENm010.7p15:2 data from HapMap), whereas GERBIL and PHASE require more than a week of runtime and make no fewer errors than 2SNP. 2SNP software is publicly available at http://alla.cs.gsu.edu/~software/2SNP. © Springer-Verlag Berlin Heidelberg 2006.
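To illustrate why a genotype with 2 SNPs needs phasing at all, the following minimal sketch enumerates the haplotype pairs consistent with a given genotype. The 0/1/2 site encoding and the function name `phasings` are assumptions made for this illustration and are not taken from the paper or the 2SNP software. The sketch shows only the ambiguity that phasing must resolve; it does not implement the non-random mating model or the maximum spanning tree step.

```python
from itertools import product

# Assumed encoding (hypothetical, for illustration only):
#   0 or 1 = homozygous site carrying that allele on both haplotypes
#   2      = heterozygous site (one haplotype carries 0, the other 1)


def phasings(genotype):
    """Return every unordered haplotype pair consistent with a genotype."""
    options = []
    for site in genotype:
        if site == 2:
            # A heterozygous site can be split either way round.
            options.append([(0, 1), (1, 0)])
        else:
            # A homozygous site leaves no choice.
            options.append([(site, site)])

    pairs = set()
    for combo in product(*options):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        # Sort so that (h1, h2) and (h2, h1) count as the same phasing.
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


# Both sites heterozygous: two candidate phasings.
print(phasings((2, 2)))
# Only one heterozygous site: the phasing is unambiguous.
print(phasings((2, 0)))
```

Under this encoding, a genotype that is heterozygous at both SNPs is the only 2-SNP case with more than one candidate phasing, and it is this case that a phasing model has to decide.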

Brinza, D., & Zelikovsky, A. (2006). Phasing of 2-SNP genotypes based on non-random mating model. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3992 LNCS-II, pp. 767–774). Springer Verlag. https://doi.org/10.1007/11758525_103
