Peptide arrays measure the binding intensity of a specific protein to thousands of amino acid peptides. By using peptides that cover all k-mers, a comprehensive picture of the binding spectrum is obtained. Researchers would like to measure binding to the longest k-mer possible, but are constrained by the number of peptides that can fit into a single microarray. A key challenge is designing a minimum number of peptides that cover all k-mers. Here, we suggest a novel idea to reduce the length of the sequence covering all k-mers by utilizing a unique property of the peptide synthesis process. Since the synthesis can start from both ends of the peptide template, it is enough to cover each k-mer or its reverse, and use the same template twice: in forward and reverse. Then, the computational problem is to generate a minimum length sequence that for each k-mer either contains it or its reverse. We developed an algorithm ReverseCAKE to generate such a sequence. ReverseCAKE runs in time linear in the output size and is guaranteed to produce a sequence that is longer by at most Θ(nlogn) characters compared to the optimum n. The obtained saving factor by ReverseCAKE approaches the theoretical lower bound as k increases. In addition, we formulated the problem as an integer linear program and empirically observed that the solutions obtained by ReverseCAKE are near-optimal. Through this work we enable more effective design of peptide microarrays.
CITATION STYLE
Orenstein, Y. (2018). Reverse de Bruijn: Utilizing Reverse Peptide Synthesis to Cover All Amino Acid k-mers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10812 LNBI, pp. 154–166). Springer Verlag. https://doi.org/10.1007/978-3-319-89929-9_10
Mendeley helps you to discover research relevant for your work.