BLAST and FASTA similarity searching for multiple sequence alignment

42Citations
Citations of this article
87Readers
Mendeley users who have this article in their library.
Get full text

Abstract

BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry-homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies target for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago. © 2014 Springer Science+Business Media, LLC.

Cite

CITATION STYLE

APA

Pearson, W. R. (2014). BLAST and FASTA similarity searching for multiple sequence alignment. Methods in Molecular Biology, 1079, 75–101. https://doi.org/10.1007/978-1-62703-646-7_5

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free