We present ngLOC, an n-gram-based Bayesian classifier that predicts the localization of a protein sequence over ten distinct subcellular organelles. Tenfold cross-validation shows an accuracy of 89% for sequences localized to a single organelle and 82% for those localized to multiple organelles. An enhanced version of ngLOC was developed to estimate the subcellular proteomes of eight eukaryotic organisms: yeast, nematode, fruit fly, mosquito, zebrafish, chicken, mouse, and human. © 2007 King and Guda; licensee BioMed Central Ltd.
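To make the general idea of an n-gram-based Bayesian classifier concrete, the following is a minimal sketch of a naive Bayes model over overlapping sequence n-grams with additive smoothing. It is not the ngLOC implementation: the class name `NGramNaiveBayes`, the parameters `n` and `alpha`, the smoothing scheme, and the toy sequences and labels are all assumptions made for illustration and carry no biological meaning.

```python
import math
from collections import Counter, defaultdict


def ngrams(seq, n):
    """Return all overlapping length-n substrings of seq."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramNaiveBayes:
    """Illustrative naive Bayes classifier over sequence n-grams (hypothetical)."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n                          # n-gram length (assumed value)
        self.alpha = alpha                  # additive smoothing constant (assumed)
        self.counts = defaultdict(Counter)  # class -> n-gram counts
        self.class_totals = Counter()       # class -> number of training sequences
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[label].update(grams)
            self.class_totals[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, seq):
        """Return log prior + summed log likelihood of seq's n-grams per class."""
        total_seqs = sum(self.class_totals.values())
        v = len(self.vocab) + 1  # one extra bucket for unseen n-grams
        scores = {}
        for label, n_seqs in self.class_totals.items():
            gram_total = sum(self.counts[label].values())
            score = math.log(n_seqs / total_seqs)
            for g in ngrams(seq, self.n):
                smoothed = self.counts[label][g] + self.alpha
                score += math.log(smoothed / (gram_total + self.alpha * v))
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Toy, made-up training data: two sequences per hypothetical class.
train_seqs = ["MKKLLLAAAKKL", "MKRLLKAALKKL", "DEEDSDEEDDSE", "EEDDSSDEEDEE"]
train_labels = ["mitochondrion", "mitochondrion", "nucleus", "nucleus"]
model = NGramNaiveBayes(n=2).fit(train_seqs, train_labels)
print(model.predict("KKLLAAKK"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing constant keeps an n-gram unseen in one class from driving that class's score to negative infinity.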
King, B. R., & Guda, C. (2007). ngLOC: An n-gram-based Bayesian method for estimating the subcellular proteomes of eukaryotes. Genome Biology, 8(5). https://doi.org/10.1186/gb-2007-8-5-r68