Abstract
Motivation: Despite arduous and time-consuming experimental efforts, protein-protein interactions (PPIs) between many pathogenic microbes and their human host are still unknown, limiting both our understanding of the intricate interactions that occur during infection and the identification of therapeutic targets. Since computational tools offer a promising alternative, we developed HPiP (Host-Pathogen Interaction Prediction), an R/Bioconductor package that combines a series of amino acid sequence property descriptors with an ensemble of machine learning classifiers to predict the as yet unmapped interactions between pathogen and host proteins.
Results: Using severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) or the novel SARS-CoV-2 coronavirus-human PPI training sets as a case study, we show that HPiP performs well in predicting PPIs between SARS-CoV-2 and human proteins, as confirmed experimentally in human monocyte THP-1 cells and by several quality control metrics. HPiP also performed strongly in accurately predicting previously reported PPIs when tested on the sequences of a pathogenic bacterium, Mycobacterium tuberculosis, and human proteins. Collectively, our fully documented HPiP software will hasten the exploration of PPIs for a systems-level understanding of many understudied pathogens and help uncover molecular targets for repurposing existing drugs.
Cite
Rahmatbakhsh, M., Moutaoufik, M. T., Gagarinova, A., & Babu, M. (2022). HPiP: an R/Bioconductor package for predicting host-pathogen protein-protein interactions from protein sequences using ensemble machine learning approach. Bioinformatics Advances, 2(1). https://doi.org/10.1093/bioadv/vbac038