Abstract
Motivation: Much information about new protein sequences is derived from identifying homologous proteins. Such tasks are difficult when the evolutionary relationships are distant. Some modern methods achieve better results by building a model of a set of related sequences, and then identifying new proteins that fit the model. A further advance was the development of iterative methods that refine the model as more homologs are discovered. These methods are generally limited by ad hoc methods of sequence weighting, neglect of underlying evolutionary relationships and the representation of the set with a single one-size-fits-all model. These limitations are avoided through the use of a Tree hidden Markov model (T-HMM) approach. Our previous work described how a non-iterative version of the T-HMM method could identify distant homologs with superior performance compared with other non-iterated approaches, and described how this method was particularly appropriate for being implemented as an iterative algorithm. Results: We describe an iterative version of the T-HMM algorithm, and evaluate its performance for the detection of distant homologs. Significant improvement over other commonly used methods is found. © Oxford University Press 2004; all rights reserved.
Qian, B., & Goldstein, R. A. (2004). Performance of an iterated T-HMM for homology detection. Bioinformatics, 20(14), 2175–2180. https://doi.org/10.1093/bioinformatics/bth181