Generating peptide candidates from amino-acid sequence databases for protein identification via mass spectrometry

Nathan Edwards; Ross Lippert

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2002), Vol. 2452, pp. 68–81

DOI: 10.1007/3-540-45784-4_6

Abstract

Protein identification via mass spectrometry forms the foundation of high-throughput proteomics. Tandem mass spectrometry, when applied to a complex mixture of peptides, selects and fragments each peptide to reveal its amino-acid sequence structure. The successful analysis of such an experiment typically relies on amino-acid sequence databases to provide a set of biologically relevant peptides to examine. A key subproblem, then, for amino-acid sequence database search engines that analyze tandem mass spectra is to efficiently generate all the peptide candidates from a sequence database with mass equal to one of a large set of observed peptide masses. We demonstrate that to solve the problem efficiently, we must deal with substring redundancy in the amino-acid sequence database and focus our attention on looking up the observed peptide masses quickly. We show that it is possible, with some preprocessing and memory overhead, to solve the peptide candidate generation problem in time asymptotically proportional to the size of the sequence database and the number of peptide candidates output.
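
To make the candidate generation problem concrete, the following is a minimal baseline sketch, not the algorithm from the paper. It assumes nominal integer residue masses, ignores the mass of water and any modifications, and requires an exact mass match; the function name `peptide_candidates` and the table `RESIDUE_MASS` are hypothetical names introduced for illustration. It scans every substring whose mass does not exceed the largest observed mass, looks each mass up in a hash set, and collapses repeated substrings by collecting results in a set.

```python
from itertools import accumulate

# Nominal (integer) residue masses in daltons, for illustration only.
# Real search engines use monoisotopic or average masses with a tolerance.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99,
    "T": 101, "C": 103, "L": 113, "I": 113, "N": 114,
    "D": 115, "Q": 128, "K": 128, "E": 129, "M": 131,
    "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def peptide_candidates(sequences, observed_masses):
    """Return the set of (mass, peptide) pairs whose mass is in observed_masses.

    Assumption: a peptide's mass is the plain sum of its residue masses.
    """
    targets = set(observed_masses)  # constant-time mass lookup
    if not targets:
        return set()
    max_mass = max(targets)
    candidates = set()  # a set removes repeated substrings
    for seq in sequences:
        # prefix[i] is the total mass of seq[:i], so the mass of
        # seq[start:end] is prefix[end] - prefix[start].
        prefix = [0] + list(accumulate(RESIDUE_MASS[aa] for aa in seq))
        for start in range(len(seq)):
            for end in range(start + 1, len(seq) + 1):
                mass = prefix[end] - prefix[start]
                if mass > max_mass:
                    break  # residue masses are positive, so longer is heavier
                if mass in targets:
                    candidates.add((mass, seq[start:end]))
    return candidates


if __name__ == "__main__":
    for mass, peptide in sorted(peptide_candidates(["GASGA", "AGAS"], [128, 215])):
        print(mass, peptide)
```

This baseline still revisits every repeated substring in the database before discarding it, so its running time grows with the redundancy of the input; avoiding that repeated work is the kind of cost the abstract says must be addressed.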

Citation (APA)

Edwards, N., & Lippert, R. (2002). Generating peptide candidates from amino-acid sequence databases for protein identification via mass spectrometry. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2452, pp. 68–81). Springer Verlag. https://doi.org/10.1007/3-540-45784-4_6
