A random forest classifier for protein-protein docking models


Abstract

Herein, we present the results of a machine learning approach we developed to single out correct 3D docking models of protein-protein complexes obtained by popular docking software. To this aim, we generated 3×10⁴ docking models for each of the 230 complexes in the protein-protein benchmark, version 5, using three different docking programs (HADDOCK, FTDock and ZDOCK), for a cumulative set of ≈7×10⁶ docking models. Three different machine learning approaches (Random Forest, Support Vector Machine and Perceptron) were used to train classifiers with 158 different scoring functions (features). The Random Forest algorithm outperformed the other two algorithms and was selected for further optimization. Using a feature selection algorithm and optimizing the random forest hyperparameters allowed us to train and validate a random forest classifier, named COnservation Driven Expert System (CoDES). Testing of CoDES on independent datasets, as well as results of its comparative performance with machine learning methods recently developed in the field for the scoring of docking decoys, confirm its state-of-the-art ability to discriminate correct from incorrect decoys both in terms of global parameters and in terms of decoys ranked at the top positions.
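As a rough illustration of two quantities mentioned in the abstract, the sketch below checks the quoted dataset size (3×10⁴ models for each of 230 complexes) and shows one simple way to count correct decoys among the top-ranked positions. The helper `top_n_hits`, the convention that a higher score means "more likely correct", and the toy scores are all assumptions made for this example; they are not taken from the paper or from CoDES.

```python
# Dataset size implied by the abstract: models per complex x number of complexes.
MODELS_PER_COMPLEX = 3 * 10**4   # figure quoted in the abstract
N_COMPLEXES = 230                # figure quoted in the abstract

total_models = MODELS_PER_COMPLEX * N_COMPLEXES  # 6,900,000, i.e. roughly 7e6


def top_n_hits(scored_decoys, n):
    """Count correct decoys among the n highest-scoring ones.

    scored_decoys: list of (score, is_correct) pairs. Higher score is taken
    to mean "more likely correct" (a hypothetical convention for this sketch).
    """
    ranked = sorted(scored_decoys, key=lambda pair: pair[0], reverse=True)
    return sum(1 for _, is_correct in ranked[:n] if is_correct)


# Toy classifier scores for a single complex (made-up numbers, illustration only).
toy = [(0.91, True), (0.85, False), (0.40, True), (0.77, False), (0.10, False)]
hits_top1 = top_n_hits(toy, 1)
hits_top3 = top_n_hits(toy, 3)
```

Under these assumptions, a scoring method could be summarized per complex by whether `top_n_hits` is non-zero for small `n`, alongside global classification metrics computed over all decoys.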


Cite

Barradas-Bautista, D., Cao, Z., Vangone, A., Oliva, R., & Cavallo, L. (2022). A random forest classifier for protein-protein docking models. Bioinformatics Advances, 2(1). https://doi.org/10.1093/bioadv/vbab042
