Invited talk: Ab initio gene finding engines: What is under the hood

Abstract

I will revisit the statistical and computational foundations of ab initio gene finding algorithms that best fit current challenges in the analysis of genomic data. With the number of newly sequenced genomes growing rapidly, there is a need to generate high-quality gene annotations in less time. In recent gene prediction competitions, the organizers described in great detail the sets of experimentally confirmed eukaryotic genes that contest participants were supposed to use for training statistical models, the key parts of ab initio gene finding algorithms. However, the gene prediction algorithm developed in our lab is the only one of its kind that does not require a training set at all. It uses an unsupervised training approach and exhibits the same or a better level of gene identification accuracy as an algorithm trained on a sufficiently large training set. With more than 600 eukaryotic genome sequencing projects registered as of February 2007, self-learning gene finders are becoming important tools that can accelerate the extraction of biological information from newly sequenced eukaryotic genomes. Another type of challenge in gene finding is presented by metagenomic sequences, which are highly fragmented, diverse in nature, and carry higher rates of sequence irregularities than are observed in sequenced genomes of cultivated microorganisms. Finding gene starts and identifying short genes become much more difficult in metagenomes than in completely sequenced prokaryotic genomes. Devising automatic gene annotation algorithms that identify specific features of gene organization in a novel genome and use adaptive strategies of self-training remains one of the open problems in machine learning. I will describe approaches to solving this problem for several classes of prokaryotic and eukaryotic genomes. © Springer-Verlag Berlin Heidelberg 2007.

Cite

Borodovsky, M. (2007). Invited talk: Ab initio gene finding engines: What is under the hood. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4463 LNBI, p. 577). Springer Verlag. https://doi.org/10.1007/978-3-540-72031-7_52
