Reliable methods for representative sub-sampling are crucial for experimental design and risk assessment within the European Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system. We developed experimental design approaches that use predicted properties and the 'distance to model' parameter to estimate how much individual compounds would contribute to the quality of the resulting model. A statistical evaluation of four regression data sets and one classification data set showed that the adaptive concept of iteratively refining the representation of the chemical space leads to a more efficient and more reliable selection than traditional approaches. Evaluating compounds with regard to the uncertainty and the correlation of predictions is beneficial, in particular for regression data sets of sufficient size, whereas using predicted properties to define the chemical space is beneficial for classification models.
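The adaptive idea summarised above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, a one-dimensional descriptor, a 1-nearest-neighbour regressor, and a 'distance to model' taken as the standard deviation of predictions across bootstrap-resampled models. All function names and parameters (`adaptive_selection`, `distance_to_model`, `measure`, `n_initial`, `n_select`) are hypothetical.

```python
import random
import statistics


def nearest_neighbour_predict(train, x):
    """Predict y for x from the closest training point (1-NN, 1-D descriptor)."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def distance_to_model(train, x, rng, n_models=20):
    """Assumed uncertainty measure: std. dev. of predictions over bootstrap models."""
    predictions = []
    for _ in range(n_models):
        # Resample the training set with replacement to build one ensemble member.
        sample = [rng.choice(train) for _ in train]
        predictions.append(nearest_neighbour_predict(sample, x))
    return statistics.pstdev(predictions)


def adaptive_selection(pool, measure, n_initial=3, n_select=10, seed=0):
    """Iteratively pick the compound the current model is least certain about."""
    rng = random.Random(seed)
    remaining = list(pool)
    rng.shuffle(remaining)

    # Start from a small random set of "measured" compounds.
    selected = [(x, measure(x)) for x in remaining[:n_initial]]
    remaining = remaining[n_initial:]

    while len(selected) < n_select and remaining:
        # Re-evaluate every candidate against the refined model in each round.
        best = max(remaining, key=lambda x: distance_to_model(selected, x, rng))
        remaining.remove(best)
        selected.append((best, measure(best)))
    return selected


if __name__ == "__main__":
    # Hypothetical pool of descriptor values and a stand-in "experiment".
    pool = [i / 10 for i in range(50)]
    chosen = adaptive_selection(pool, measure=lambda x: x * x)
    print(sorted(x for x, _ in chosen))
```

The point of the sketch is the re-scoring inside the loop: each newly measured compound changes the model, so the uncertainty of the remaining candidates is recomputed before the next pick. This is what distinguishes an adaptive design from a one-shot selection on fixed descriptors.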
Brandmaier, S., Novotarskyi, S., Sushko, I., & Tetko, I. V. (2013). From descriptors to predicted properties: Experimental design by using applicability domain estimation. Alternatives to Laboratory Animals, 41(1), 33–47. https://doi.org/10.1177/026119291304100106