Protein–protein interaction (PPI) is a biological process that plays a vital role in the metabolic processes of an organism. More than 80% of proteins do not function alone but act in combination with other proteins. Some unidentified proteins can be characterized through their interaction with a protein whose function is already known. PPIs and protein–protein non-interactions (PPNIs) grow at different rates, and the number of PPIs is significantly greater than that of PPNIs. This imbalance increases the cost of constructing a balanced data set. In this paper, the effect of several discretization techniques, including Ameva, Class-Attribute Interdependence Maximization (CAIM), ChiMerge, and Fusinter, is investigated in combination with different classification techniques. Under 10-fold cross-validation, CAIM discretization combined with a support vector machine (SVM) has a significant impact on the results compared with a standard SVM. Experiments are performed on E. coli and H. sapiens protein datasets; CAIM discretization with the SVM classifier achieved average accuracies of 92.8% and 93.8%, with AUC values of 80.7% and 82.1%, respectively.
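To make the CAIM criterion concrete, the following minimal sketch scores a candidate set of cut points for one continuous feature using the standard CAIM definition: the per-interval ratio of the squared majority-class count to the interval size, averaged over the intervals. It is an illustration only and not the implementation used in the paper; the function name `caim_score` and the toy feature values and labels are assumptions introduced here.

```python
from bisect import bisect_left
from collections import Counter


def caim_score(values, labels, cut_points):
    """Return the CAIM criterion for the intervals induced by cut_points.

    Hypothetical helper for illustration. A value equal to a cut point is
    assigned to the interval on its left, i.e. intervals are (a, b].
    """
    cuts = sorted(cut_points)
    n_intervals = len(cuts) + 1
    # One class-count table per interval.
    counts = [Counter() for _ in range(n_intervals)]
    for v, y in zip(values, labels):
        counts[bisect_left(cuts, v)][y] += 1

    total = 0.0
    for c in counts:
        size = sum(c.values())
        if size:  # empty intervals contribute nothing
            total += max(c.values()) ** 2 / size
    return total / n_intervals


# Toy data (assumed): one feature, label 1 = interacting, 0 = non-interacting.
values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

for cuts in ([], [0.25], [0.5]):
    print(cuts, round(caim_score(values, labels, cuts), 3))
```

On this toy data the cut at 0.5 separates the two classes perfectly and therefore receives the highest score of the three candidates. A full CAIM discretizer would add cut points greedily while the score keeps improving; that search loop is omitted here.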
Sisodia, D. S., & Singh, M. (2019). An empirical investigation of discretization techniques on the classification of protein–protein interaction. In Advances in Intelligent Systems and Computing (Vol. 748, pp. 509–521). Springer Verlag. https://doi.org/10.1007/978-981-13-0923-6_44