Protein sequence classification is a major research area in biological data mining. To classify unknown protein sequences into the proper class, sub-class, and family, features are extracted from the sequences and supplied to a soft computing method. The features most commonly used for protein sequence classification are the average molecular weight and the iso-electric point, which are applied to a Fuzzy ARTMAP model. Unfortunately, these two features have weaknesses that may decrease the accuracy of the Fuzzy ARTMAP model. This work points out those weaknesses and proposes a new computational approach that uses the position of each amino acid in a protein sequence to calculate a positional-average molecular weight and iso-electric point. The proposed technique is applied to 497 protein sequences in six different classes. Finally, a comparative study with statistical analysis is performed between the positional-average feature values and the plain average feature values. The work shows that positional-average feature values play a more significant role in classification than average feature values and increase classification accuracy.
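To make the contrast between a plain average and a position-dependent average concrete, the following is a minimal sketch and not the authors' method. The abstract does not give the positional-average formula, so the weighting used here (each residue's weight multiplied by its 1-based position, normalised by the sum of positions) is an assumption chosen only for illustration. The function names and the small table of approximate free amino acid molecular weights are likewise hypothetical. The sketch illustrates one general property of a plain average, that it ignores residue order; the abstract does not say whether this is the weakness the authors identify.

```python
# Approximate molecular weights (Da) of a few free amino acids.
# Illustrative subset only; a real feature extractor would cover all 20.
MOLECULAR_WEIGHT = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
    "V": 117.15,  # valine
    "L": 131.17,  # leucine
}


def average_weight(sequence):
    """Plain average molecular weight; independent of residue order."""
    weights = [MOLECULAR_WEIGHT[residue] for residue in sequence]
    return sum(weights) / len(weights)


def positional_average_weight(sequence):
    """Position-weighted average molecular weight.

    ASSUMPTION: the residue at 1-based position i contributes i * weight,
    and the total is divided by the sum of all positions. This is a
    hypothetical weighting, not the definition used in the paper.
    """
    weighted_sum = sum(
        position * MOLECULAR_WEIGHT[residue]
        for position, residue in enumerate(sequence, start=1)
    )
    position_sum = sum(range(1, len(sequence) + 1))
    return weighted_sum / position_sum


if __name__ == "__main__":
    # Same composition, different order.
    for seq in ("GAL", "LAG"):
        print(seq, average_weight(seq), positional_average_weight(seq))
```

Under this assumed weighting, the two sequences `GAL` and `LAG` have the same plain average but different positional averages, so the second feature can separate sequences that the first cannot. Both functions expect a non-empty sequence made up of residues present in the table.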
CITATION STYLE
Saha, S., & Bhattacharya, T. (2020). A New Protein Sequence Classification Approach Using Positional-Average Values of Features. In Advances in Intelligent Systems and Computing (Vol. 1053, pp. 703–712). Springer. https://doi.org/10.1007/978-981-15-0751-9_65