Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino pairs

345Citations
Citations of this article
69Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Motivation: The subcellular location of a protein is closely correlated to its function. Thus, computational prediction of subcellular locations from the amino acid sequence information would help annotation and functional prediction of protein coding genes in complete genomes. We have developed a method based on support vector machines (SVMs). Results: We considered 12 subcellular locations in eukaryotic cells: chloroplast, cytoplasm, cytoskeleton, endoplasmic reticulum, extracellular medium, Golgi apparatus, lysosome, mitochondrion, nucleus, peroxisome, plasma membrane, and vacuole. We constructed a data set of proteins with known locations from the SWISS-PROT database. A set of SVMs was trained to predict the subcellular location of a given protein based on its amino acid, amino acid pair, and gapped amino acid pair compositions. The predictors based on these different compositions were then combined using a voting scheme. Results obtained through 5-fold cross-validation tests showed an improvement in prediction accuracy over the algorithm based on the amino acid composition only. This prediction method is available via the Internet.

Cite

CITATION STYLE

APA

Park, K. J., & Kanehisa, M. (2003). Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino pairs. Bioinformatics, 19(13), 1656–1663. https://doi.org/10.1093/bioinformatics/btg222

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free