Abstract
Motivation: Knowledge of the correct protein subcellular localization is necessary for understanding the function of a protein. Unfortunately, large-scale experimental studies are limited in their accuracy. Therefore, the development of prediction methods has been limited by the amount of accurate experimental data. Recently, however, large-scale experimental studies have provided new data that can be used to evaluate the accuracy of subcellular localization predictions in human cells. Using these data, we examined the performance of state-of-the-art methods and developed SubCons, an ensemble method that combines four predictors using a Random Forest classifier.
Results: SubCons outperforms earlier methods on a dataset of proteins where two independent methods confirm the subcellular localization. Given nine subcellular localizations, SubCons achieves an F1-score of 0.79, compared to 0.70 for the second-best method. Furthermore, at a false positive rate (FPR) of 1%, the true positive rate (TPR) is over 58% for SubCons, compared to less than 50% for the best individual predictor.
Cite
Salvatore, M., Warholm, P., Shu, N., Basile, W., & Elofsson, A. (2017). SubCons: A new ensemble method for improved human subcellular localization predictions. Bioinformatics, 33(16), 2464–2470. https://doi.org/10.1093/bioinformatics/btx219