Abstract
Many current algorithms for finding protein-coding sequences require large learning sets of true genes to estimate sensible parameter values, without which their predictions are unreliable. They also fail to recognize short genes, which usually contain a weak coding signal. To avoid these problems, we developed a new algorithm for detecting protein-coding potential in prokaryotic genomes. The algorithm uses homogeneous Markov chains to model nucleotide transitions between fixed positions in codons, which reduces the order of the Markov chain while retaining information on dependence between nucleotides over relatively long distances in the sequence. We tested the performance of this algorithm as a function of learning-set size, measuring true and false positive rates for different model orders. We also compared our algorithm with the commonly used GeneMark. The presented algorithm performs better, especially for smaller learning sets.
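The idea of modeling each codon position with its own homogeneous Markov chain can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one reading of the abstract, in which the chain for codon position p runs over the nucleotides at position p of consecutive codons. The first-order model, the pseudocount smoothing, the log-likelihood score, and the function names `train_position_chains` and `log_likelihood` are all assumptions made for illustration.

```python
import math

NUCS = "ACGT"

def train_position_chains(genes, pseudocount=1.0):
    """Estimate one first-order transition matrix per codon position.

    Assumption: for position p (0, 1, 2) the chain runs over the
    nucleotides at position p of consecutive codons, i.e. over
    seq[p], seq[p+3], seq[p+6], ...
    """
    # counts[p][a][b]: how often b follows a on the position-p track,
    # initialised with a pseudocount (an assumed smoothing choice).
    counts = [{a: {b: pseudocount for b in NUCS} for a in NUCS}
              for _ in range(3)]
    for seq in genes:
        for p in range(3):
            track = seq[p::3]
            for a, b in zip(track, track[1:]):
                if a in NUCS and b in NUCS:  # skip ambiguous symbols
                    counts[p][a][b] += 1
    # Normalise each row of counts into transition probabilities.
    chains = []
    for p in range(3):
        chain = {}
        for a in NUCS:
            total = sum(counts[p][a].values())
            chain[a] = {b: counts[p][a][b] / total for b in NUCS}
        chains.append(chain)
    return chains

def log_likelihood(seq, chains):
    """Sum of log transition probabilities over the three position tracks."""
    ll = 0.0
    for p in range(3):
        track = seq[p::3]
        for a, b in zip(track, track[1:]):
            if a in NUCS and b in NUCS:
                ll += math.log(chains[p][a][b])
    return ll
```

In this sketch a first-order transition on one track spans three nucleotides of the original sequence, which is one way a low-order chain can still capture dependence over a longer distance. A candidate open reading frame could then be scored by comparing `log_likelihood` under chains trained on known genes with the score under chains trained on non-coding sequence; that comparison is likewise an assumed design, not something stated in the abstract.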
Cite
Blazej, P., Mackiewicz, P., & Cebrat, S. (2011). Algorithm for finding coding signal using homogeneous Markov chains independently for three codon positions. In Proceedings of the 2011 International Conference on Bioinformatics and Computational Biology (ICBCB 2011), Haikou, China, 22-24 February, 2011 (pp. 20–24).