Quantifying overfitting potential in drug binding datasets

Brian Davis; Kevin Mcloughlin; Jonathan Allen; Sally R. Ellingson

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 12139 LNCS 585-598

DOI: 10.1007/978-3-030-50420-5_44

Abstract

In this paper, we investigate potential biases in datasets used to train machine learning models for drug binding prediction. We examine a recently published metric, the Asymmetric Validation Embedding (AVE) bias, which is used to quantify such bias and detect overfitting. We compare it to a slightly revised version and introduce a new weighted metric. We find that the new metrics make it possible to quantify overfitting without overly limiting the training data, and that they produce models with greater predictive value.
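The abstract does not define the AVE bias, so the following is only a minimal sketch of the metric as it is usually formulated: for each validation set, take the fraction of points whose nearest training neighbour lies within a distance threshold, average that fraction over a range of thresholds, and combine the four active/inactive terms as (AA - AI) + (II - IA). The function names, the Euclidean distance, and the threshold grid are assumptions made for illustration (published uses typically rely on a fingerprint distance such as Tanimoto). The revised and weighted variants introduced in this paper are not reproduced here, since the abstract gives no details about them.

```python
from typing import Callable, Optional, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    # Stand-in distance (assumption); fingerprint distances are more typical.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def nn_fraction(valid: Sequence[Point], train: Sequence[Point],
                dist: Callable[[Point, Point], float],
                thresholds: Sequence[float]) -> float:
    """Mean, over thresholds, of the fraction of validation points whose
    nearest training neighbour is within the threshold."""
    nearest = [min(dist(v, t) for t in train) for v in valid]
    fractions = [sum(d <= thr for d in nearest) / len(nearest)
                 for thr in thresholds]
    return sum(fractions) / len(fractions)


def ave_bias(valid_act: Sequence[Point], train_act: Sequence[Point],
             valid_inact: Sequence[Point], train_inact: Sequence[Point],
             dist: Callable[[Point, Point], float] = euclidean,
             thresholds: Optional[Sequence[float]] = None) -> float:
    """Hypothetical helper: (AA - AI) + (II - IA)."""
    if thresholds is None:
        # Assumed grid: 0.00, 0.01, ..., 1.00
        thresholds = [i / 100 for i in range(101)]
    aa = nn_fraction(valid_act, train_act, dist, thresholds)
    ai = nn_fraction(valid_act, train_inact, dist, thresholds)
    ii = nn_fraction(valid_inact, train_inact, dist, thresholds)
    ia = nn_fraction(valid_inact, train_act, dist, thresholds)
    return (aa - ai) + (ii - ia)


# Toy split: validation actives sit on training actives and validation
# inactives sit on training inactives, so memorization alone would succeed.
biased = ave_bias([(0.0,)], [(0.0,)], [(10.0,)], [(10.0,)])

# Toy split where every point coincides, so neither class is favoured.
neutral = ave_bias([(0.0,)], [(0.0,)], [(0.0,)], [(0.0,)])
```

Under these assumptions, a value near zero suggests that validation molecules are no closer to training molecules of their own class than to those of the other class, while a large positive value suggests that a nearest-neighbour lookup alone could score well on the split.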

Cite

Davis, B., Mcloughlin, K., Allen, J., & Ellingson, S. R. (2020). Quantifying overfitting potential in drug binding datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12139 LNCS, pp. 585–598). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-50420-5_44
