Identifying Protein Complexes in Protein-Protein Interaction Data Using Graph Convolutional Network

This article is free to access.

Abstract

Protein complexes are groups of two or more polypeptide chains that bind noncovalently, forming networks of protein interactions. Over the past decade, researchers have developed a number of computational methods for identifying protein complexes and their members from these interaction networks. Although most existing methods identify functional protein complexes from protein-protein interaction (PPI) networks reasonably well, the applicability of advanced graph network methods has not yet been adequately investigated. This paper proposes several graph convolutional network (GCN) methods to improve the detection of protein complexes. We first formulate protein complex detection as a node classification problem. We then develop a Neural Overlapping Community Detection (NOCD) model to cluster the nodes (proteins) using a complex affiliation matrix. We also use a representation learning approach that combines a multi-class GCN feature extractor (to obtain the node features) with a mean shift clustering algorithm (to perform the clustering). To improve the efficiency of the multi-class GCN, we convert dense-dense matrix operations into dense-sparse or sparse-sparse matrix operations, which reduces space and time complexity and significantly improves the scalability of the existing GCN. Finally, we apply clustering aggregation to find the best protein complexes: a grid search is performed over the complexes detected by three well-known protein complex detection methods, namely ClusterONE, CMC, and PEWCC, with the help of the Meta-Clustering Algorithm (MCLA) and the Hybrid Bipartite Graph Formulation (HBGF). We test the proposed GCN-based methods on various publicly available datasets and find that they perform significantly better than previous state-of-the-art methods. The code and data are freely available at https://github.com/Analystharsh/GCN_complex_detection.
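
The dense-to-sparse conversion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `normalize_adjacency` and `gcn_layer` are hypothetical, and the symmetric normalization with self-loops is assumed here as the commonly used GCN propagation rule rather than taken from the paper. The point is only that a PPI adjacency matrix is mostly zeros, so storing it in a sparse format turns the neighborhood aggregation into a sparse-dense product.

```python
import numpy as np
import scipy.sparse as sp


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} as a sparse CSR matrix.

    Assumption: the standard symmetric GCN normalization with self-loops.
    """
    adj = sp.csr_matrix(adj, dtype=np.float64)
    adj_hat = adj + sp.identity(adj.shape[0], format="csr")
    # Self-loops guarantee every degree is at least 1, so no division by zero.
    deg = np.asarray(adj_hat.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
    return (d_inv_sqrt @ adj_hat @ d_inv_sqrt).tocsr()


def gcn_layer(adj_norm, features, weights):
    """One GCN layer: ReLU(adj_norm @ features @ weights)."""
    support = features @ weights       # dense-dense, shape (n, hidden)
    propagated = adj_norm @ support    # sparse-dense, cost scales with edge count
    return np.maximum(propagated, 0.0)


# Toy PPI graph (made-up data): 4 proteins connected in a chain 0-1-2-3.
adj = sp.csr_matrix(np.array([[0, 1, 0, 0],
                              [1, 0, 1, 0],
                              [0, 1, 0, 1],
                              [0, 0, 1, 0]]))
adj_norm = normalize_adjacency(adj)

rng = np.random.default_rng(0)
features = np.eye(4)                   # one-hot node features
weights = rng.normal(size=(4, 2))      # untrained, random layer weights
embeddings = gcn_layer(adj_norm, features, weights)
```

In a pipeline like the one the abstract describes, node embeddings of this kind would then be passed to a clustering step such as mean shift; that step is omitted here.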

Cite

Zaki, N., Singh, H., & Mohamed, E. A. (2021). Identifying Protein Complexes in Protein-Protein Interaction Data Using Graph Convolutional Network. IEEE Access, 9, 123717–123726. https://doi.org/10.1109/ACCESS.2021.3110845
