Improving the specificity of exon prediction using comparative genomics

Jing Wu

Conference Proceedings (Open Access)

BMC Genomics (2008) 9(SUPPL. 2)

DOI: 10.1186/1471-2164-9-S2-S13

Abstract

Background: Computational gene prediction tools routinely generate large volumes of predicted coding exons (putative exons). A common limitation of these tools is their relatively low specificity, owing to the large amount of non-coding regions.

Methods: A statistical approach is developed that substantially improves gene prediction specificity. The key idea is to exploit the evolutionary conservation of coding exons. First, using the homology between the genomes of two related species, a probability model is developed for the evolutionary conservation pattern of codons across genomes. Second, a probability model for the dependency between adjacent codons/triplets is added to differentiate coding exons from random sequences. Finally, a log odds ratio is used to classify putative exons as either coding exons or non-coding regions.

Results: The method was tested on pre-aligned human-mouse sequences in which the putative exons were predicted by GENSCAN and TWINSCAN. The proposed method improves exon specificity by 73% and 32%, respectively, while the loss of sensitivity is ≤ 1%. The method also retains 98% of the RefSeq gene structures that are correctly predicted by TWINSCAN while removing 26% of the predicted genes that lie in non-coding regions. The estimated number of true exons in TWINSCAN's predictions is 157,070. The results and the executable code can be downloaded from http://www.stat.purdue.edu/~jingwu/codon/.

Conclusion: The proposed method demonstrates an application of the evolutionary conservation principle to coding exons. It is a complementary method that can be used as an additional criterion to refine many existing gene predictions. © 2008 Wu; licensee BioMed Central Ltd.
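The classification step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the exact form of the probability models, so the sketch assumes a first-order Markov chain over aligned (human, mouse) triplet pairs, ungapped in-frame alignments, add-one smoothing, and a threshold of 0. The names `codons`, `PairChain`, `log_odds`, and `is_coding`, and the toy training sequences, are all hypothetical.

```python
import math
from collections import defaultdict

# Assumed state space: every ordered pair of nucleotide triplets (64 * 64).
N_STATES = 64 ** 2


def codons(seq):
    """Split a sequence into in-frame triplets, dropping a trailing partial one."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


class PairChain:
    """First-order Markov chain over aligned triplet pairs (an assumed model)."""

    def __init__(self, pseudocount=1.0):
        self.pseudocount = pseudocount
        self.counts = defaultdict(lambda: defaultdict(float))

    def fit(self, alignments):
        """Count transitions between adjacent aligned triplet pairs."""
        for seq_a, seq_b in alignments:
            pairs = list(zip(codons(seq_a), codons(seq_b)))
            for prev, cur in zip(pairs, pairs[1:]):
                self.counts[prev][cur] += 1.0
        return self

    def log_prob(self, prev, cur):
        """Smoothed log P(cur | prev); unseen transitions get the pseudocount."""
        row = self.counts.get(prev, {})
        total = sum(row.values()) + self.pseudocount * N_STATES
        return math.log((row.get(cur, 0.0) + self.pseudocount) / total)


def log_odds(seq_a, seq_b, coding, noncoding):
    """Sum of per-transition log odds, coding model versus non-coding model.

    The initial-state term is omitted for brevity.
    """
    pairs = list(zip(codons(seq_a), codons(seq_b)))
    return sum(
        coding.log_prob(prev, cur) - noncoding.log_prob(prev, cur)
        for prev, cur in zip(pairs, pairs[1:])
    )


def is_coding(seq_a, seq_b, coding, noncoding, threshold=0.0):
    """Keep a putative exon when its log odds exceed the (assumed) threshold."""
    return log_odds(seq_a, seq_b, coding, noncoding) > threshold


# Toy training data, invented for illustration only.
# "Coding" pairs differ mostly at the third codon position;
# "non-coding" pairs differ at arbitrary positions.
coding_model = PairChain().fit([("ATGGCTGAAAAG", "ATGGCCGAGAAA")])
noncoding_model = PairChain().fit([("ATGGCTGAAAAG", "CTGTCTGTAACG")])

kept = is_coding("ATGGCTGAA", "ATGGCCGAG", coding_model, noncoding_model)
dropped = is_coding("ATGGCTGAA", "CTGTCTGTA", coding_model, noncoding_model)
```

In practice the two models would be trained on large sets of known coding exons and non-coding alignments, and the threshold would be tuned to trade specificity against sensitivity.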

Citation (APA): Wu, J. (2008). Improving the specificity of exon prediction using comparative genomics. In BMC Genomics (Vol. 9). https://doi.org/10.1186/1471-2164-9-S2-S13
