In the postgenome era, biologists have sought to measure the complete complement of proteins, an effort termed proteomics. Currently, the most effective way to measure the proteome is shotgun, or bottom-up, proteomics, in which the proteome is digested into peptides that are identified and then used for protein inference. Despite continuous improvements to all steps of the shotgun proteomics workflow, observed proteome coverage is often low; some proteins are identified by a single peptide sequence. Complete proteome sequence coverage would allow comprehensive characterization of RNA splicing variants and all posttranslational modifications, which would drastically improve the accuracy of biological models. There are many reasons for the sequence coverage deficit, but ultimately peptide length determines sequence observability. Peptides that are too short are lost because they match many protein sequences, so their true origin is ambiguous. The maximum observable peptide length is limited by several analytical challenges. This paper explores computationally how the peptide lengths produced by several common proteome digestion methods limit observable proteome coverage. Iterative proteome cleavage strategies are also explored. These simulations reveal that proteome coverage can be maximized with an iterative digestion protocol involving multiple proteases and chemical cleavages, which theoretically allows 92.9% proteome coverage.
Meyer, J. G. (2014). In Silico Proteome Cleavage Reveals Iterative Digestion Strategy for High Sequence Coverage. ISRN Computational Biology, 2014, 1–7. https://doi.org/10.1155/2014/960902
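
The idea that peptide length bounds observable coverage can be illustrated with a minimal sketch. It is not the paper's code. It assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P), and the function names `digest` and `coverage`, the length window of 7 to 35 residues, and the toy sequence are all hypothetical choices made for illustration. A simple minimum length stands in for the uniqueness problem of short peptides, which a real analysis would check against the whole proteome.

```python
import re

# Assumed trypsin rule: cut after K or R, but not when the next residue is P.
TRYPSIN = r"(?<=[KR])(?!P)"


def digest(sequence, pattern=TRYPSIN):
    """Return (start, end) spans of the peptides from a complete digestion."""
    # Each zero-width match marks a cut position between two residues.
    cuts = [m.start() for m in re.finditer(pattern, sequence)]
    bounds = [0] + [c for c in cuts if 0 < c < len(sequence)] + [len(sequence)]
    return list(zip(bounds, bounds[1:]))


def coverage(sequence, min_len=7, max_len=35, pattern=TRYPSIN):
    """Fraction of residues lying in peptides within the length window.

    min_len and max_len are illustrative assumptions, not values
    taken from the paper.
    """
    if not sequence:
        return 0.0
    covered = sum(
        end - start
        for start, end in digest(sequence, pattern)
        if min_len <= end - start <= max_len
    )
    return covered / len(sequence)


if __name__ == "__main__":
    toy_protein = "AAAKPAAARGGGGGGGKCC"  # hypothetical sequence
    print(digest(toy_protein))
    print(round(coverage(toy_protein), 3))
```

Passing a different `pattern` models another cleavage rule. An iterative strategy of the kind the abstract describes could be approximated by re-digesting the peptides that fall outside the length window with a second rule, although the order and choice of reagents here would again be assumptions.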