Escherichia Coli DNA N-4-Methycytosine Site Prediction Accuracy Improved by Light Gradient Boosting Machine Feature Selection Technology

This article is free to access.

Abstract

Recently, several machine-learning-based DNA N-4-methylcytosine (4mC) predictors have been developed to provide deeper insight into the biological functions and mechanisms of 4mC. However, the performance of existing classifiers for identifying Escherichia coli DNA 4mC sites is inadequate. Here, we present a new support vector machine 4mC predictor, named iEC4mC-SVM, for Escherichia coli (E. coli) DNA 4mC site identification, optimized using light gradient boosting machine feature selection technology. The iEC4mC-SVM predictor had a 10-fold cross-validation accuracy of 85.4% and a Jackknife cross-validation accuracy of 84.9%. The 83.2% independent testing accuracy of iEC4mC-SVM was 1.0-6.5% higher than that of state-of-the-art E. coli DNA 4mC site predictors. A t-distributed stochastic neighbor embedding analysis confirmed that the prediction performance enhancement of iEC4mC-SVM was due to the light gradient boosting machine feature selection.
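The two evaluation protocols named in the abstract differ only in how the data are partitioned. The following is a minimal sketch of that partitioning, not the authors' code. It assumes that "Jackknife cross-validation" is meant in the sense common in bioinformatics, namely leave-one-out, and it swaps in a trivial majority-class predictor where the paper uses a support vector machine. All function names, the sample count, and the toy labels are hypothetical and chosen only for illustration.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first n % k folds get one extra.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def jackknife_splits(n_samples):
    """Leave-one-out (assumed meaning of 'Jackknife'): one test sample per fold."""
    return k_fold_splits(n_samples, n_samples)


def cross_validated_accuracy(labels, predict_fold, splits):
    """Pool correct predictions over all folds and return the overall accuracy."""
    correct = 0
    total = 0
    for train_idx, test_idx in splits:
        predictions = predict_fold(train_idx, test_idx)
        correct += sum(p == labels[i] for p, i in zip(predictions, test_idx))
        total += len(test_idx)
    return correct / total


def majority_predictor(labels):
    """Hypothetical stand-in classifier: predict the majority training label."""
    def predict_fold(train_idx, test_idx):
        positives = sum(labels[i] for i in train_idx)
        majority = 1 if 2 * positives >= len(train_idx) else 0
        return [majority] * len(test_idx)
    return predict_fold


# Toy labels (1 = 4mC site, 0 = non-site); purely illustrative data.
toy_labels = [1] * 15 + [0] * 5
model = majority_predictor(toy_labels)
ten_fold_acc = cross_validated_accuracy(toy_labels, model, k_fold_splits(20, 10))
jackknife_acc = cross_validated_accuracy(toy_labels, model, jackknife_splits(20))
print(f"10-fold accuracy: {ten_fold_acc:.3f}, jackknife accuracy: {jackknife_acc:.3f}")
```

In both protocols every sample is used for testing exactly once. The jackknife variant simply sets the number of folds equal to the number of samples, which is why it is more expensive but has no dependence on how the folds are drawn.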

Cite

Lv, Z., Wang, D., Ding, H., Zhong, B., & Xu, L. (2020). Escherichia Coli DNA N-4-Methycytosine Site Prediction Accuracy Improved by Light Gradient Boosting Machine Feature Selection Technology. IEEE Access, 8, 14851–14859. https://doi.org/10.1109/ACCESS.2020.2966576
