Promoters are short regulatory DNA sequences located upstream of a gene. Structural analysis of promoter sequences is important for successful gene prediction. Promoters can be recognized by certain patterns that are conserved within a species, but there are many exceptions, which makes the structural analysis of promoters a complex problem. Grammar rules can describe the structure of promoter sequences; however, deriving such rules is not trivial. In this paper, stochastic L-grammar rules are derived automatically from known Drosophila and vertebrate promoter and non-promoter sequences using genetic programming. The fitness of the grammar rules is evaluated using a machine learning technique, the Support Vector Machine (SVM). The SVM is trained on the known promoter sequences to obtain a discriminating function, which is then used to evaluate a candidate grammar (a set of rules) by determining the percentage of generated sequences that are classified correctly. Combining the SVM with grammar rule inference can mitigate the lack of structural insight in machine learning approaches such as SVM. © 2008 Springer-Verlag Berlin Heidelberg.
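The fitness evaluation described in the abstract can be illustrated with a minimal sketch. It assumes a simple representation of a stochastic L-grammar (each symbol maps to weighted replacement strings) and scores a grammar by the percentage of generated sequences that a classifier accepts. The rule set, the function names, and `toy_classifier` are hypothetical. In particular, `toy_classifier` is a motif check standing in for the trained SVM decision function, not the paper's classifier, and the genetic programming search over rule sets is omitted.

```python
import random

# Hypothetical stochastic L-grammar: each symbol maps to a list of
# (probability, replacement) pairs. Illustrative only; these are not
# rules reported in the paper.
RULES = {
    "A": [(0.7, "TATA"), (0.3, "AC")],
    "C": [(0.5, "CG"), (0.5, "C")],
}

def rewrite(sequence, rules, rng):
    """Apply one parallel rewriting step: every symbol is rewritten at once."""
    out = []
    for symbol in sequence:
        options = rules.get(symbol)
        if options is None:
            out.append(symbol)  # symbols without a rule are copied unchanged
            continue
        weights = [p for p, _ in options]
        replacements = [r for _, r in options]
        out.append(rng.choices(replacements, weights=weights, k=1)[0])
    return "".join(out)

def generate(axiom, rules, steps, rng):
    """Derive one sequence by rewriting the axiom a fixed number of times."""
    seq = axiom
    for _ in range(steps):
        seq = rewrite(seq, rules, rng)
    return seq

def toy_classifier(sequence):
    """Stand-in for a trained SVM (assumption): accept if a TATA motif occurs."""
    return "TATA" in sequence

def fitness(rules, classifier, axiom="AC", steps=2, samples=200, seed=0):
    """Percentage of generated sequences that the classifier accepts."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(samples)
        if classifier(generate(axiom, rules, steps, rng))
    )
    return 100.0 * hits / samples

if __name__ == "__main__":
    print(f"fitness: {fitness(RULES, toy_classifier):.1f}%")
```

In a fuller setup, `classifier` would wrap the decision function of an SVM trained on labeled promoter and non-promoter sequences, and a genetic programming loop would mutate and recombine rule sets, keeping those with higher fitness.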
CITATION STYLE
Damaševičius, R. (2008). Structural analysis of promoter sequences using grammar inference and support vector machine. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5177 LNAI, pp. 98–105). Springer Verlag. https://doi.org/10.1007/978-3-540-85563-7_18