We present a small-molecule pKa prediction tool written entirely in Python. It predicts the macroscopic pKa value and is trained on a literature compilation of monoprotic compounds. Several machine learning models were tested, and random forest performed best in five-fold cross-validation (mean absolute error = 0.682, root mean squared error = 1.032, correlation coefficient r² = 0.82). We test our model on two external validation sets, where it performs comparably to Marvin and better than a recently published open-source model. Our Python tool and all data are freely available at https://github.com/czodrowskilab/Machine-learning-meets-pKa.
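As an illustration of the three statistics quoted above, the following minimal sketch computes the mean absolute error, root mean squared error, and r² for paired experimental and predicted pKa values, and shows one simple way to split a data set into five folds. It is not the authors' code: the function names, the unshuffled contiguous fold split, and the example values are all hypothetical, and r² is assumed here to be the squared Pearson correlation, which the abstract does not specify.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between paired values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between paired values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Squared Pearson correlation (an assumed definition of r2)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous folds, without shuffling."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


# Hypothetical pKa values, for illustration only (not taken from the paper).
experimental = [4.2, 9.8, 7.1, 3.5, 10.4]
predicted = [4.6, 9.1, 7.3, 4.4, 10.0]

print("MAE :", round(mae(experimental, predicted), 3))
print("RMSE:", round(rmse(experimental, predicted), 3))
print("r2  :", round(r_squared(experimental, predicted), 3))
```

In a cross-validation setting, a model would be fitted on each `train` index set and these statistics computed on the corresponding `test` set, then averaged over the five folds.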
CITATION STYLE
Baltruschat, M., & Czodrowski, P. (2020). Machine learning meets pKa. F1000Research, 9. https://doi.org/10.12688/f1000research.22090.1