Sequence-Based Prediction of Cysteine Reactivity Using Machine Learning

This article is free to access.

Abstract

As one of the most intrinsically reactive amino acids, cysteine carries out a variety of important biochemical functions, including catalysis and redox regulation. Discovering and characterizing cysteines with heightened reactivity will help annotate protein functions. Chemical proteomic methods have been used to quantitatively profile cysteine reactivity in native proteomes, showing a strong correlation between the chemical reactivity of a cysteine and its functionality; however, the relationship between cysteine reactivity and local sequence has not yet been systematically explored. Herein, we report a machine learning method, sbPCR (sequence-based prediction of cysteine reactivity), which combines the basic local alignment search tool (BLAST), truncated composition of k-spaced amino acid pairs (CKSAAP) analysis, and a support vector machine (SVM) to predict hyper-reactive cysteines from local sequence features alone. On a benchmark set compiled from hyper-reactive cysteines in human proteomes, our method achieves a prediction accuracy of 98%, a precision of 95%, and a recall of 89%. We used the governing features of these local sequence motifs to extend the prediction to potential hyper-reactive cysteines in other proteomes deposited in the UniProt database. We validated our predictions in Escherichia coli by activity-based protein profiling and discovered a hyper-reactive cysteine in a functionally uncharacterized protein, YecH. Biochemical analysis suggests that this hyper-reactive cysteine might be involved in metal binding. Our computational method provides a large inventory of potential hyper-reactive cysteines in proteomes and is highly complementary to experimental approaches for guiding the systematic annotation of protein functions in the postgenome era.
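
To make the feature step more concrete, the following is a minimal sketch of a generic composition of k-spaced amino acid pairs encoding for the sequence window around a cysteine. It is an illustration under stated assumptions, not the sbPCR implementation: the function names, the flank size, the padding character, and the maximum spacing k_max are all hypothetical, and the truncation step used by the authors is not described in the abstract and is therefore omitted.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered pairs built from them.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def extract_window(sequence, cys_index, flank=5, pad="X"):
    """Return the residues around a cysteine, padded at the termini.

    flank and pad are illustrative choices, not values from the paper.
    """
    if sequence[cys_index] != "C":
        raise ValueError("cys_index does not point at a cysteine")
    padded = pad * flank + sequence + pad * flank
    center = cys_index + flank
    return padded[center - flank:center + flank + 1]


def cksaap(window, k_max=2):
    """Encode a window as pair frequencies for spacings k = 0..k_max.

    For each k, count pairs of residues separated by k positions and
    normalize by the number of counted pairs. Pairs containing a
    non-standard symbol (such as the padding character) are skipped.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features


if __name__ == "__main__":
    # Toy sequence, made up for illustration only.
    window = extract_window("MKTCAYC", 3, flank=3)
    vector = cksaap(window, k_max=2)
    print(window, len(vector))
```

A vector of this kind (400 values per spacing) is the sort of input that could then be passed to a support vector machine classifier; how the authors select or truncate features, and how BLAST enters the pipeline, would need to be taken from the full paper.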

Cite

Wang, H., Chen, X., Li, C., Liu, Y., Yang, F., & Wang, C. (2018). Sequence-Based Prediction of Cysteine Reactivity Using Machine Learning. Biochemistry, 57(4), 451–460. https://doi.org/10.1021/acs.biochem.7b00897
