A screening method for Z-value assessment based on the normalized edit distance

Abstract

Pairwise global alignment scores are used to detect related sequences in genomes and proteins. These scores are biased by the length and composition of the compared sequences, so the Z-value is used to estimate their statistical significance. The Z-value is computed with a Monte Carlo algorithm that requires a large number of pairwise alignments between random permutations of the compared sequences. A different alignment score, the normalized edit distance, is independent of the sequence lengths, and computing it usually takes 2 or 3 standard alignment calculations. In this paper we study the relationship between the normalized edit distance and the Z-value, and propose a method to screen out pairs of unrelated sequences, so that the Z-value needs to be computed for only a small percentage of sequence pairs. We apply this method to the comparison of proteins from Saccharomyces cerevisiae, Escherichia coli, Methanococcus jannaschii and Haemophilus influenzae, showing that the Z-value has to be computed for less than 1% of all protein pairs. © 2009 Springer Berlin Heidelberg.
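
The two quantities contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The assumptions are: a toy match/mismatch/gap scoring scheme stands in for a real substitution matrix, only the second sequence is shuffled, the normalized edit distance uses unit edit costs, and it is computed with a fractional-programming iteration (each pass is one standard alignment-style dynamic program). All function names and parameters are hypothetical.

```python
import random
import statistics
from fractions import Fraction


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global score with a toy linear-gap scheme (assumption)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def z_value(a, b, n_shuffles=100, seed=0):
    """Monte Carlo Z-value: (observed - mean) / std over shuffled alignments."""
    rng = random.Random(seed)
    observed = global_alignment_score(a, b)
    scores = []
    for _ in range(n_shuffles):
        perm = list(b)
        rng.shuffle(perm)  # random permutation keeps length and composition
        scores.append(global_alignment_score(a, "".join(perm)))
    std = statistics.stdev(scores)
    if std == 0:
        return 0.0  # degenerate case: every shuffle gave the same score
    return (observed - statistics.mean(scores)) / std


def _best_path(a, b, lam):
    """Minimise W(P) - lam * L(P) over edit paths; return (W, L) of one optimum."""
    def step(cell, weight):
        cost, w, length = cell
        return (cost + weight - lam, w + weight, length + 1)

    prev = [(Fraction(0), 0, 0)]
    for _ in b:
        prev.append(step(prev[-1], 1))  # insertions along the first row
    for ch_a in a:
        curr = [step(prev[0], 1)]  # deletion in the first column
        for j, ch_b in enumerate(b, start=1):
            curr.append(min(
                step(prev[j - 1], 0 if ch_a == ch_b else 1),  # match or substitution
                step(prev[j], 1),                             # deletion
                step(curr[j - 1], 1),                         # insertion
            ))
        prev = curr
    _, w, length = prev[-1]
    return w, length


def normalized_edit_distance(a, b):
    """Minimum over edit paths of (path weight / path length), unit costs."""
    if not a and not b:
        return Fraction(0)
    lam = Fraction(0)
    while True:
        w, length = _best_path(a, b, lam)
        new_lam = Fraction(w, length)
        if new_lam == lam:  # ratio stopped improving: optimum reached
            return lam
        lam = new_lam
```

The cost asymmetry is visible here: `z_value` runs `n_shuffles + 1` alignments per pair, while `normalized_edit_distance` stops after a handful of passes. A screening step would compute the normalized edit distance first and run `z_value` only for pairs it does not rule out; the abstract does not state the decision rule, so none is sketched.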

Cite

Peris, G., & Marzal, A. (2009). A screening method for Z-value assessment based on the normalized edit distance. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5518 LNCS, pp. 1154–1161). https://doi.org/10.1007/978-3-642-02481-8_175
