A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations


Abstract

Background: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches have been developed to identify potential driver mutations using methods such as machine learning and mutational clustering. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach.

Results: We have designed and implemented GraphPAC (Graph Protein Amino acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC discovers clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html.

Conclusion: GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure.

© 2014 Ryslik et al.; licensee BioMed Central Ltd.
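To make the idea of accounting for tertiary structure with a graph more concrete, here is a minimal Python sketch. It is an illustration only and is not GraphPAC's actual algorithm or API (the package itself is written in R). It assumes residues are nodes, that two residues are linked when their 3D coordinates lie within a distance cutoff, and that a "cluster" is a connected group of mutated residues. The function names, the toy coordinates, and the 6.0 cutoff are all hypothetical.

```python
from itertools import combinations
from math import dist


def contact_graph(coords, cutoff):
    """Adjacency sets linking residues whose coordinates lie within `cutoff`."""
    graph = {res: set() for res in coords}
    for a, b in combinations(coords, 2):
        if dist(coords[a], coords[b]) <= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def mutated_clusters(graph, mutated):
    """Connected components of the subgraph induced by the mutated residues."""
    mutated = set(mutated) & set(graph)
    seen, clusters = set(), []
    for start in sorted(mutated):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            # Only walk edges that lead to other mutated residues.
            stack.extend((graph[node] & mutated) - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical toy coordinates (angstroms), not taken from any PDB entry:
# residues 10 and 95 are far apart in sequence but close in space.
coords = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    50: (30.0, 0.0, 0.0),
    95: (0.0, 4.0, 0.0),
}
graph = contact_graph(coords, cutoff=6.0)
clusters = mutated_clusters(graph, mutated=[10, 50, 95])
print(clusters)  # [[10, 95], [50]]
```

In this toy case, residues 10 and 95 fall into one cluster even though they are 85 positions apart in the linear sequence, which is the kind of grouping a purely sequence-based method could miss.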




Citation (APA)

Ryslik, G. A., Cheng, Y., Cheung, K. H., Modis, Y., & Zhao, H. (2014). A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations. BMC Bioinformatics, 15(1). https://doi.org/10.1186/1471-2105-15-86

