Protein post-translational modification (PTM) site prediction is a fundamental task in bioinformatics. Several computational methods have been developed to predict PTM sites. However, existing methods ignore the structure information and merely utilize protein sequences. Furthermore, designing a more fine-grained structure representation learning method is urgently needed as PTM is a biological event that occurs at the atom granularity. In this paper, we propose a PTM site prediction method by Coupling of Multi-Granularity structure and Multi-Scale sequence representation, PTM-CMGMS for brevity. Specifically, multi-granularity structure-aware representation learning is designed to learn neighborhood structure representations at the amino acid, atom, and whole protein granularity from AlphaFold predicted structures, followed by utilizing contrastive learning to optimize the structure representations. Additionally, multi-scale sequence representation learning is used to extract context sequence information, and motif generated by aligning all context sequences of PTM sites assists the prediction. Extensive experiments on three datasets show that PTM-CMGMS outperforms the state-of-the-art methods. Source code can be found at https://github.com/LZYHZAU/PTM-CMGMS.
CITATION STYLE
Li, Z., Li, M., Zhu, L., & Zhang, W. (2024). Improving PTM Site Prediction by Coupling of Multi-Granularity Structure and Multi-Scale Sequence Representation. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, pp. 188–196). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v38i1.27770
Mendeley helps you to discover research relevant for your work.