SIPF: Sampling Method for Inverse Protein Folding

Abstract

Protein engineering has important applications in drug discovery. Within protein design, inverse protein folding is a fundamental task that aims to generate a protein's amino acid sequence given a 3D graph structure. However, most existing methods for inverse protein folding are based on sequential generative models and are therefore limited in uncertainty quantification and in their ability to explore the entire protein space. To address these issues, we propose a sampling method for inverse protein folding (SIPF). Specifically, we formulate inverse protein folding as a sampling problem and design two pretrained neural networks as the Markov Chain Monte Carlo (MCMC) proposal distribution. To ensure sampling efficiency, we further design (i) an adaptive sampling scheme to select variables for sampling and (ii) an approximate target distribution as a surrogate for the unavailable target distribution. Empirical studies validate the effectiveness of SIPF, which achieves a 7.4% relative improvement in recovery rate and a 6.4% relative reduction in perplexity compared to the best baseline.
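To make the sampling formulation concrete, the following is a minimal Metropolis-Hastings sketch over amino acid sequences. It is not the SIPF implementation: the abstract's two pretrained neural-network proposals, its adaptive variable selection, and its approximate target distribution are all replaced here by simple stand-ins (a uniform symmetric proposal, uniform site selection, and a toy score that rewards agreement with a reference sequence). The names `toy_log_target`, `mh_sample`, and `recovery_rate` are hypothetical and chosen only for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_log_target(seq, reference, weight=5.0):
    """Assumed stand-in for a surrogate target: unnormalized log-probability
    that adds `weight` for every position matching the reference."""
    return sum(weight for a, b in zip(seq, reference) if a == b)


def mh_sample(reference, n_steps=5000, seed=0):
    """Metropolis-Hastings with single-site mutations.

    Site and residue are both drawn uniformly, so the proposal is symmetric
    and the acceptance probability reduces to the target ratio.
    """
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in reference]
    log_p = toy_log_target(seq, reference)
    for _ in range(n_steps):
        pos = rng.randrange(len(seq))      # uniform site choice (not adaptive)
        candidate = seq.copy()
        candidate[pos] = rng.choice(AMINO_ACIDS)
        log_p_new = toy_log_target(candidate, reference)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            seq, log_p = candidate, log_p_new
    return "".join(seq)


def recovery_rate(seq, reference):
    """Fraction of positions where the sampled residue equals the reference."""
    return sum(a == b for a, b in zip(seq, reference)) / len(reference)


if __name__ == "__main__":
    native = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"
    sample = mh_sample(native)
    print(sample, recovery_rate(sample, native))
```

Because the chain returns samples rather than a single greedy decode, running it with different seeds gives a set of candidate sequences whose spread can serve as a rough uncertainty estimate, which is the motivation the abstract gives for a sampling formulation.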

Cite

Fu, T., & Sun, J. (2022). SIPF: Sampling Method for Inverse Protein Folding. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 378–388). Association for Computing Machinery. https://doi.org/10.1145/3534678.3539284
