An important challenge in modern functional proteomics is the prediction of the functional behavior of proteins. Motifs in protein chains can make such a prediction possible. The correlation between protein properties and their motifs is not always obvious, since more than one motif can exist within a protein chain. Thus, the behavior of a protein is a function of many motifs, some of which overpower others. In this paper, a data-mining approach for motif-based classification of proteins is presented. A new algorithm for inducing classification rules, which exploits finite state automata, is introduced. First, data are modeled in terms of prefix tree acceptors, which are later merged into finite state automata. Finally, we propose a new algorithm for the induction of protein classification rules from finite state automata. The data-mining model is trained and tested using various protein and protein class subsets, as well as the whole dataset of known proteins and protein classes. Results indicate the efficiency of our technique compared to other known data-mining algorithms.
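As an illustration of the first modeling step only, the following is a minimal sketch of a prefix tree acceptor built from motif sequences. It is not the authors' implementation: the class name `PrefixTreeAcceptor`, the helper `build_pta`, the motif labels `M1` to `M4`, and the assumption that each protein is represented as an ordered sequence of motif identifiers are all hypothetical. The merging of acceptors into finite state automata and the rule induction step are not shown, since the abstract does not specify them.

```python
from typing import Dict, Hashable, Iterable, Sequence, Set, Tuple


class PrefixTreeAcceptor:
    """Tree-shaped automaton that accepts exactly the sequences added to it."""

    def __init__(self) -> None:
        # (state, symbol) -> next state; state 0 is the root.
        self.transitions: Dict[Tuple[int, Hashable], int] = {}
        self.accepting: Set[int] = set()
        self.num_states: int = 1

    def add(self, sequence: Sequence[Hashable]) -> None:
        """Insert one sequence, sharing states with existing common prefixes."""
        state = 0
        for symbol in sequence:
            key = (state, symbol)
            if key not in self.transitions:
                self.transitions[key] = self.num_states
                self.num_states += 1
            state = self.transitions[key]
        self.accepting.add(state)

    def accepts(self, sequence: Sequence[Hashable]) -> bool:
        """Return True only if the sequence was added verbatim."""
        state = 0
        for symbol in sequence:
            nxt = self.transitions.get((state, symbol))
            if nxt is None:
                return False
            state = nxt
        return state in self.accepting


def build_pta(sequences: Iterable[Sequence[Hashable]]) -> PrefixTreeAcceptor:
    """Build one acceptor from all motif sequences of a protein class."""
    pta = PrefixTreeAcceptor()
    for seq in sequences:
        pta.add(seq)
    return pta


# Hypothetical motif sequences for the proteins of a single class.
class_sequences = [("M1", "M2"), ("M1", "M3"), ("M1", "M2", "M4")]
pta = build_pta(class_sequences)
```

In this sketch the three sequences share the prefix `M1`, so they share the root and one further state; a sequence that is only a prefix of a stored one, such as `("M1",)`, is rejected.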
CITATION STYLE
Psomopoulos, F. E., Diplaris, S., & Mitkas, P. A. (2004). A finite state automata based technique for protein classification rules induction. In Proceedings of the Second European Workshop on Data Mining and Text Mining in Bioinformatics (in conjunction with ECML/PKDD) (pp. 54–60). Pisa, Italy.