As a most important task in protein sequence analysis, protein remote homology detection has been extensively studied for decades. Currently, the profile-based methods show the state-of-the-art performance. Position-Specific Frequency Matrix (PSFM) is a widely used profile. The reason is that this profile contains evolutionary information, which is critical for protein sequence analysis. However, there exists noise information in the profiles introduced by the amino acids with low frequencies, which are not likely to occur in the corresponding sequence positions during evolutionary process. In this study, we propose one method to remove the noise information in the PSFM by removing the amino acids with low frequencies and two a profile can be generated, called Top frequency profile (TFP). Autocross covariance (ACC) transformation is performed on the profile to convert them into fixed length feature vectors. Combined with Support Vector Machines (SVMs), the predictor is constructed. Evaluated on a benchmark dataset, experimental results show that the proposed method outperforms other state-of-the-art predictors for protein remote homology detection, indicating that the proposed method is useful tools for protein sequence analysis. Because the profiles generated from multiple sequence alignments are important for protein structure and function prediction, the TFP will has many potential applications.
CITATION STYLE
Liao, Q., Guo, M., & Liu, B. (2019). Protein Remote Homology Detection Based on Profiles. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11465 LNBI, pp. 261–268). Springer Verlag. https://doi.org/10.1007/978-3-030-17938-0_24
Mendeley helps you to discover research relevant for your work.