A confidence predictor for logD using conformal regression and a support-vector machine

  • Lapins M
  • Arvidsson S
  • Lampa S
 et al. 

Abstract

Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict the water–octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of $$Q^{2} = 0.973$$, and the best-performing nonconformity measure gives a median prediction interval of $$\pm 0.39$$ log units at 80% confidence and $$\pm 0.60$$ log units at 90% confidence. The model is available as an online service via an OpenAPI interface and via a web page with a molecular editor. We also publish predictive values at 90% confidence level for 91 million PubChem structures in RDF format, both for download and through a URI resolver service.
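To illustrate how a conformal regressor turns a point prediction into an interval at a chosen confidence level, here is a minimal sketch of split (inductive) conformal regression. It assumes the simplest nonconformity measure, the absolute residual on a held-out calibration set, which is not necessarily the measure the authors found best. The function names and the toy residuals are hypothetical, and the underlying model (a linear-kernel support-vector machine in the article) is represented only by its point prediction.

```python
import math


def conformal_half_width(calibration_residuals, confidence):
    """Return the interval half-width for the requested confidence level.

    Assumption: nonconformity score = absolute residual |y - y_hat|
    measured on a calibration set that was not used to fit the model.
    """
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    # Standard finite-sample rank for split conformal prediction.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        # Too few calibration examples to guarantee this confidence level.
        return math.inf
    return scores[k - 1]


def predict_interval(point_prediction, calibration_residuals, confidence):
    """Symmetric prediction interval around a point prediction."""
    h = conformal_half_width(calibration_residuals, confidence)
    return point_prediction - h, point_prediction + h


# Hypothetical calibration residuals in log units (illustrative only).
toy_residuals = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8, 0.9]
low, high = predict_interval(2.0, toy_residuals, 0.5)
print(low, high)
```

With this measure every compound receives the same interval width. Normalized nonconformity measures, which scale the residual by an estimate of prediction difficulty, give compound-specific widths instead.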

Author-supplied keywords

  • Conformal prediction
  • LogD
  • Machine learning
  • QSAR
  • RDF
  • Support-vector machine
