Abstract
Falsely annotated protein-coding genes are regarded as one of the major causes of annotation errors in public databases. Although many filtering approaches have been designed to remove over-annotated protein-coding genes, some are questionable because they increase the number of false negatives. Furthermore, no webserver or software has been specifically devised for the over-annotation problem. In this study, we propose an integrative algorithm for detecting over-annotated protein-coding genes in microorganisms. An average accuracy of 99.94% is achieved over 61 microbial genomes. This extremely high accuracy indicates that the presented algorithm efficiently differentiates protein-coding genes from non-coding open reading frames. Extensive analyses show that the predictions are reliable and that the integrative algorithm is robust and convenient. Our analysis also indicates that over-annotated protein-coding genes can cause false positives in the detection of horizontal gene transfer. The webserver for the proposed algorithm is freely accessible at www.cbi.seu.edu.cn/RPGM. © 2011 The Author.
Cite
Yu, J. F., Xiao, K., Jiang, D. K., Guo, J., Wang, J. H., & Sun, X. (2011). An integrative method for identifying the over-annotated protein-coding genes in microbial genomes. DNA Research, 18(6), 435–449. https://doi.org/10.1093/dnares/dsr030