Prediction of cancer proteins by integrating protein interaction, domain frequency, and domain interaction data using machine learning algorithms

Abstract

Many proteins are known to be associated with cancer, yet their precise functional roles in disease pathogenesis often remain unclear. One strategy for better understanding the function of these proteins is to combine different types of proteomics data. In this study, we extended Aragues's method by using protein-protein interaction (PPI) data, domain-domain interaction (DDI) data, the weighted domain frequency score (DFS), and the cancer linker degree (CLD) to predict cancer proteins. Performance was benchmarked in three kinds of experiments: (I) using individual algorithms, (II) combining algorithms, and (III) combining algorithms of the same classification type. Compared with Aragues's method, our proposed methods, namely the machine learning algorithms and majority voting, are significantly superior in all seven performance measures. We demonstrated the accuracy of the proposed method on two independent datasets. The best algorithm achieves hit ratios of 89.4% and 72.8% for the lung cancer dataset and the lung cancer microarray study, respectively. We anticipate that this research could help in understanding disease mechanisms and in diagnosis.
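The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the 0/1 label encoding (1 = predicted cancer protein), and the three example classifiers are assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label list by majority.

    `predictions` is a list with one entry per classifier; each entry is a
    list of labels, one per protein, in the same protein order.
    (Hypothetical helper, not taken from the paper.)
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all classifiers must label the same proteins")

    combined = []
    # zip(*predictions) yields, for each protein, the votes of all classifiers.
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        # On a tie it returns the label encountered first, so an odd number
        # of classifiers is assumed here to avoid ties for binary labels.
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Assumed example: three classifiers, four proteins, 1 = cancer protein.
example = [
    [1, 0, 1, 1],  # classifier A
    [1, 1, 0, 1],  # classifier B
    [0, 0, 1, 1],  # classifier C
]
print(majority_vote(example))
```

With an odd number of binary classifiers, each protein receives the label that at least two of the three classifiers agree on.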

Citation

Huang, C. H., Peng, H. S., & Ng, K. L. (2015). Prediction of cancer proteins by integrating protein interaction, domain frequency, and domain interaction data using machine learning algorithms. BioMed Research International, 2015. https://doi.org/10.1155/2015/312047
