Sequence-based heuristics for faster annotation of non-coding RNA families

Zasha Weinberg; Walter L. Ruzzo

Journal Article, Open Access

Bioinformatics (2006) 22(1) 35-39

DOI: 10.1093/bioinformatics/bti743


Abstract

Motivation: Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool for finding new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy while making searches significantly faster for virtually all ncRNA families. However, searches with these rigorous filters are still slower than searches with heuristic filters could be.

Results: In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to that of heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, whereas our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that, unlike family-specific solutions, can scale to hundreds of ncRNA families.

© The Author 2005. Published by Oxford University Press. All rights reserved.
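The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes an ungapped, match-state-only simplification of a profile HMM, a uniform background, a made-up toy alignment, and a hypothetical `expensive_score` callback standing in for the slow CM scan. The cheap profile score selects candidate windows, and only those are passed to the slow scorer.

```python
import math

# Hypothetical toy family alignment (ungapped, equal length); not from the paper.
ALIGNMENT = ["GGCAC", "GGCAU", "GACAC", "GGCGC"]
ALPHABET = "ACGU"
BACKGROUND = 0.25  # assumed uniform background frequency


def build_profile(alignment, pseudocount=1.0):
    """Per-column log-odds scores: a match-state-only simplification of a profile HMM."""
    profile = []
    for col in zip(*alignment):
        total = len(col) + pseudocount * len(ALPHABET)
        profile.append({
            base: math.log2(((col.count(base) + pseudocount) / total) / BACKGROUND)
            for base in ALPHABET
        })
    return profile


def filter_windows(sequence, profile, threshold):
    """Return start positions whose window score meets the heuristic threshold."""
    width = len(profile)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[base] for col, base in zip(profile, window))
        if score >= threshold:
            hits.append(start)
    return hits


def search(sequence, profile, threshold, expensive_score):
    """Run the slow scorer (a stand-in for a CM) only on windows that pass the filter."""
    width = len(profile)
    return {
        start: expensive_score(sequence[start:start + width])
        for start in filter_windows(sequence, profile, threshold)
    }
```

Unlike a rigorous filter, a threshold of this kind can discard true family members, so its value would have to be tuned against the accuracy that is acceptable for a given family.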

Citation (APA)

Weinberg, Z., & Ruzzo, W. L. (2006). Sequence-based heuristics for faster annotation of non-coding RNA families. Bioinformatics, 22(1), 35–39. https://doi.org/10.1093/bioinformatics/bti743
