QSAR with experimental and predictive distributions: An information theoretic approach for assessing model quality

David J. Wood; Lars Carlsson; Martin Eklund; Ulf Norinder; Jonna Stålring

Journal Article, Open Access

Journal of Computer-Aided Molecular Design (2013) 27(3) 203-219

DOI: 10.1007/s10822-013-9639-5

32 Citations

51 Readers

Abstract

We propose that quantitative structure-activity relationship (QSAR) predictions should be explicitly represented as predictive (probability) distributions. If both predictions and experimental measurements are treated as probability distributions, the quality of a set of predictive distributions output by a model can be assessed with Kullback-Leibler (KL) divergence: a widely used information theoretic measure of the distance between two probability distributions. We have assessed a range of different machine learning algorithms and error estimation methods for producing predictive distributions with an analysis against three of AstraZeneca's global DMPK datasets. Using the KL-divergence framework, we have identified a few combinations of algorithms that produce accurate and valid compound-specific predictive distributions. These methods use reliability indices to assign predictive distributions to the predictions output by QSAR models so that reliable predictions have tight distributions and vice versa. Finally, we show how valid predictive distributions can be used to estimate the probability that a test compound has properties that hit single- or multi-objective target profiles. © 2013 The Author(s).
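The two ideas in the abstract, scoring a predictive distribution against an experimental one with KL divergence and reading off the probability of hitting a target profile, can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: both distributions are taken to be univariate Gaussians, the experimental distribution is used as the reference (first argument) of the divergence, and the properties in a multi-objective profile are treated as independent. All function names are hypothetical and are not taken from the paper.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) in nats for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2).

    Assumption: P is the experimental distribution and Q the predictive one.
    """
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def prob_in_range(mu, sigma, lower=-math.inf, upper=math.inf):
    """Probability that a Gaussian predictive distribution falls in [lower, upper]."""
    return normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)


def prob_multi_objective(predictions, targets):
    """Probability of meeting every target range at once.

    predictions: list of (mu, sigma) pairs, one per property.
    targets: list of (lower, upper) pairs, one per property.
    Assumption: the properties are independent, so probabilities multiply.
    """
    p = 1.0
    for (mu, sigma), (lower, upper) in zip(predictions, targets):
        p *= prob_in_range(mu, sigma, lower, upper)
    return p


# Identical distributions have zero divergence; a one-sigma mean shift costs 0.5 nats.
same = kl_gaussian(0.0, 1.0, 0.0, 1.0)
shifted = kl_gaussian(0.0, 1.0, 1.0, 1.0)

# A tight (reliable) prediction is more likely to satisfy a target range
# than a wide one with the same mean.
tight = prob_in_range(mu=6.0, sigma=0.2, lower=5.5, upper=6.5)
wide = prob_in_range(mu=6.0, sigma=1.0, lower=5.5, upper=6.5)
```

Under these assumptions a lower average divergence over a test set indicates predictive distributions that sit closer to the experimental ones, and the range probability shows why tight distributions for reliable predictions matter when ranking compounds against a profile.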

Cite

Wood, D. J., Carlsson, L., Eklund, M., Norinder, U., & Stålring, J. (2013). QSAR with experimental and predictive distributions: An information theoretic approach for assessing model quality. Journal of Computer-Aided Molecular Design, 27(3), 203–219. https://doi.org/10.1007/s10822-013-9639-5
