Protein subcellular localization prediction of eukaryotes using a knowledge-based approach


Abstract

Background: The study of protein subcellular localization (PSL) is important for elucidating protein functions involved in various cellular processes. However, determining the localization sites of a protein through wet-lab experiments can be time-consuming and labor-intensive, so computational approaches are highly desirable. Most PSL prediction systems are designed for single-localized proteins, yet a significant number of eukaryotic proteins are known to localize to multiple subcellular organelles. Many studies have shown that proteins may reside in several cellular compartments simultaneously or move between them, and may take part in different biological processes with different roles.
Results: In this study, we propose a knowledge-based method, called KnowPredsite, to predict the localization site(s) of both single-localized and multi-localized proteins. Based on local similarity, we can identify the "related sequences" for prediction. We construct a knowledge base to record the possible sequence variations for protein sequences. When predicting the localization annotation of a query protein, we search against the knowledge base and use a scoring mechanism to determine the predicted sites. We downloaded the dataset from ngLOC, which covers ten distinct subcellular organelles from 1923 species, and performed ten-fold cross-validation experiments to evaluate KnowPredsite's performance. The experimental results show that KnowPredsite achieves higher prediction accuracy than ngLOC and the Blast-hit method. For single-localized proteins, the overall accuracy of KnowPredsite is 91.7%. For multi-localized proteins, the overall accuracy of KnowPredsite is 72.1%, which is 12.4% higher than that of ngLOC. Notably, half of the proteins in the dataset for which no Blast hit sequence can be found above a specified threshold can still be correctly predicted by KnowPredsite.
Conclusion: KnowPredsite demonstrates the power of identifying related sequences in the knowledge base. The experimental results show that local similarity is effective for prediction even when overall sequence similarity is low, and that KnowPredsite is a highly accurate prediction method for both single- and multi-localized proteins. It is worth mentioning that the prediction process of KnowPredsite is transparent and biologically interpretable: it shows the set of template sequences used to generate the prediction result. The KnowPredsite prediction server is available at http://bio-cluster.iis.sinica.edu.tw/kbloc/. © 2009 Lin et al; licensee BioMed Central Ltd.
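
The knowledge-base lookup and scoring step described in the Results can be illustrated with a toy sketch. The abstract does not specify how the knowledge base is keyed or how scores are computed, so everything here is an assumption made for illustration: fixed-length overlapping fragments stand in for "local similarity", simple vote counting stands in for the scoring mechanism, and the names `fragments`, `build_knowledge_base`, `predict_sites`, the fragment length `K`, and the `ratio` cutoff are all hypothetical. This is not the KnowPredsite implementation.

```python
from collections import Counter, defaultdict

K = 3  # hypothetical fragment length; the abstract does not state one


def fragments(seq, k=K):
    """Yield every overlapping length-k fragment of a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_knowledge_base(annotated):
    """Map each fragment to vote counts for the sites of sequences containing it.

    `annotated` is a list of (sequence, set_of_sites) pairs.
    """
    kb = defaultdict(Counter)
    for seq, sites in annotated:
        for frag in set(fragments(seq)):  # count each sequence once per fragment
            for site in sites:
                kb[frag][site] += 1
    return kb


def predict_sites(kb, query, ratio=0.8):
    """Sum fragment votes per site; keep sites scoring at least ratio * top score.

    Returning every site near the top score is one simple (assumed) way to
    allow multi-localized predictions.
    """
    scores = Counter()
    for frag in fragments(query):
        scores.update(kb.get(frag, {}))
    if not scores:
        return []  # no fragment of the query is in the knowledge base
    top = max(scores.values())
    return sorted(site for site, v in scores.items() if v >= ratio * top)
```

With a knowledge base built from a few annotated sequences, `predict_sites` returns one site when a single location dominates the votes and several when they tie, which mirrors the single- versus multi-localized distinction in the abstract. The matched fragments also show which training sequences contributed to a prediction, loosely analogous to the template sequences mentioned in the Conclusion.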


Lin, H. N., Chen, C. T., Sung, T. Y., Ho, S. Y., & Hsu, W. L. (2009). Protein subcellular localization prediction of eukaryotes using a knowledge-based approach. BMC Bioinformatics, 10(SUPPL. 15). https://doi.org/10.1186/1471-2105-10-S15-S8
