Comparing amino acid sequences to find similarities between proteins is a fundamental form of sequence analysis for inferring protein structure and function. In this study, we improve amino acid substitution matrices for identifying distantly related proteins. We systematically sampled and benchmarked substitution matrices generated from a principal component analysis (PCA) subspace built from a set of typical existing matrices. Based on the benchmark results, we used kernel density estimation (KDE) to identify a region of highly sensitive matrices in the PCA subspace. Using the PCA subspace, we deduced a novel sensitive matrix, called MIQS, which detects distantly related proteins better than existing matrices do. This approach to deriving an efficient amino acid substitution matrix might influence many fields of protein sequence analysis. MIQS is available at http://csas.cbrc.jp/Ssearch/.
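The workflow summarized above (PCA over existing matrices, sampling in the subspace, benchmarking, then KDE over the best performers) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the authors' implementation: the input matrices are random placeholders, the two-component subspace, sampling range, top-10% cutoff, and bandwidth are arbitrary, and `toy_benchmark` is a hypothetical stand-in for a real homology-detection benchmark.

```python
import numpy as np

N_AA = 20
IU = np.triu_indices(N_AA)  # a symmetric 20x20 matrix has 210 unique entries


def fit_pca(flat_matrices, n_components):
    """Return the mean vector and the leading principal axes (rows)."""
    mean = flat_matrices.mean(axis=0)
    _, _, vt = np.linalg.svd(flat_matrices - mean, full_matrices=False)
    return mean, vt[:n_components]


def coords_to_matrix(coords, mean, components):
    """Map a point in the PCA subspace back to a symmetric 20x20 matrix."""
    flat = mean + coords @ components
    upper = np.zeros((N_AA, N_AA))
    upper[IU] = flat
    return upper + upper.T - np.diag(np.diag(upper))


def kde(points, query, bandwidth):
    """Unnormalized Gaussian kernel density of `points`, evaluated at `query`."""
    diff = query[:, None, :] - points[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    return np.exp(-0.5 * sq_dist / bandwidth ** 2).sum(axis=1)


def toy_benchmark(coords):
    """Hypothetical score: higher is better, peaking at an arbitrary point."""
    target = np.array([1.0, -0.5])
    return -np.linalg.norm(coords - target, axis=1)


rng = np.random.default_rng(0)

# Placeholder for the flattened upper triangles of existing matrices.
existing = rng.normal(size=(12, IU[0].size))
mean, components = fit_pca(existing, n_components=2)

# Sample candidate matrices as points in the PCA subspace and score them.
samples = rng.uniform(-3.0, 3.0, size=(500, 2))
scores = toy_benchmark(samples)

# Keep the top 10% and locate the densest part of that region with KDE.
top = samples[scores >= np.quantile(scores, 0.9)]
density = kde(top, top, bandwidth=0.5)
best_coords = top[np.argmax(density)]

# Reconstruct the candidate substitution matrix at the density peak.
candidate = coords_to_matrix(best_coords, mean, components)
```

In a real setting, the placeholder array would be replaced by actual substitution matrices, and the toy score by a benchmark of remote homology detection run for each sampled matrix.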
Tomii, K., & Yamada, K. (2016). Systematic exploration of an efficient amino acid substitution matrix: MIQS. In Methods in Molecular Biology (Vol. 1415, pp. 211–223). Humana Press Inc. https://doi.org/10.1007/978-1-4939-3572-7_11