Abstract
This article reports a successful application of support vector machines (SVMs) to mining high-throughput screening (HTS) data from an inhibition study of type I methionine aminopeptidases (MetAPs). A library of 43,736 small organic molecules was used in the study, and the 1,355 compounds in the library with 40% or higher inhibition activity were considered active. The data set was randomly split into a training set and a test set (3:1 ratio). The authors ranked the compounds in the test set by the decision values predicted by SVM models built on the training set. To measure the performance of the models, they defined a novel score, PT50: the percentage of the test set that must be screened to recover 50% of the actives. With carefully selected parameters, SVM models increased the hit rates significantly, and 50% of the active compounds could be recovered by screening just 7% of the test set. The authors found that the size of the training set played a significant role in the performance of the models. A training set of 10,000 compounds is likely the minimum size required to build a model with reasonable predictive power. © 2006 Society for Biomolecular Sciences.
Cite
Fang, J., Dong, Y., Lushington, G. H., Ye, Q. Z., & Georg, G. I. (2006). Support vector machines in HTS data mining: Type I MetAPs inhibition study. Journal of Biomolecular Screening, 11(2), 138–144. https://doi.org/10.1177/1087057105284334