Motivation: The number of single nucleotide polymorphisms (SNPs) detectable in an alignment depends on the length and the number of the aligned sequences. The number of aligned sequences is called the sample size. However, a typical alignment, for instance one obtained as the BLAST-search result of a query sequence against an EST database, does not cover the query sequence evenly. It is therefore usually unclear what the actual sample size is.

Results: We present a method to calculate the effective sample size, called n_eff, for a given BLAST alignment. The method takes into account that multiple coverage contributes only logarithmically to the SNP yield of a given sequence stretch. We show that the effective sample size n_eff is usually much smaller than would be expected for a given amount of coverage, and we illustrate this with two typical examples.
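The logarithmic contribution of coverage can be made concrete with a small sketch. It assumes the standard Watterson result that, under a neutral infinite-sites model, the expected number of segregating sites among n sequences is proportional to a_n = 1 + 1/2 + ... + 1/(n-1), which grows roughly like ln(n). The function names and the inversion rule below are hypothetical and chosen for illustration only; they are not necessarily the definition of the effective sample size used in the paper.

```python
from typing import Sequence


def harmonic(n: int) -> float:
    """Return a_n = sum_{i=1}^{n-1} 1/i; this is 0 for n <= 1."""
    return sum(1.0 / i for i in range(1, n))


def mean_yield(coverage: Sequence[int]) -> float:
    """Average a_c over all alignment columns, where c is the column's coverage."""
    return sum(harmonic(c) for c in coverage) / len(coverage)


def effective_sample_size(coverage: Sequence[int]) -> int:
    """Smallest n whose a_n reaches the average per-column yield.

    Assumption: this inversion is one plausible way to map uneven
    coverage back to a single sample size, not the published method.
    """
    target = mean_yield(coverage)
    n = 1
    # Small tolerance guards against floating-point rounding in the mean.
    while harmonic(n) < target - 1e-12:
        n += 1
    return n


# Hypothetical per-column coverage: half the query covered by 2 sequences,
# the other half by 20 sequences (mean coverage 11).
coverage = [2] * 50 + [20] * 50
n_eff = effective_sample_size(coverage)
```

In this toy input the mean coverage is 11, yet the sketch returns a noticeably smaller value, because the deeply covered half adds only logarithmically to the expected SNP yield.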
Haubold, B., & Wiehe, T. (2002). Calculating the SNP-effective sample size from an alignment. Bioinformatics, 18(1), 36–38. https://doi.org/10.1093/bioinformatics/18.1.36