Phage display empowered the development of proteins with new function and ligands for clinically relevant targets. In this report, we use next-generation sequencing to analyze phage-displayed libraries and uncover a strong bias induced by amplification preferences of phage in bacteria. This bias favors fast-growing sequences that collectively constitute <0.01% of the available diversity. Specifically, a library of 10⁹ random 7-mer peptides (Ph.D.-7) includes a few thousand sequences that grow quickly (the 'parasites'), which are the sequences that are typically identified in phage display screens published to date. A similar collapse was observed in other libraries. Using Illumina and Ion Torrent sequencing and multiple biological replicates of amplification of the Ph.D.-7 library, we identified a focused population of 770 'parasites'. In all, 197 sequences from this population have been identified in literature reports that used the Ph.D.-7 library. Many of these enriched sequences have confirmed function (e.g. target binding capacity). The bias in the literature, thus, can be viewed as a selection with two different selection pressures: (i) target-binding selection, and (ii) amplification-induced selection. Enrichment of parasitic sequences could be minimized if amplification bias is removed. Here, we demonstrate that emulsion amplification in libraries of ∼10⁶ diverse clones prevents the biased selection of parasitic clones. © The Author(s) 2013. Published by Oxford University Press.
Matochko, W. L., Cory Li, S., Tang, S. K. Y., & Derda, R. (2014). Prospective identification of parasitic sequences in phage display screens. Nucleic Acids Research, 42(3), 1784–1798. https://doi.org/10.1093/nar/gkt1104
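The idea of flagging sequences that are enriched by amplification alone, consistently across biological replicates, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`fold_changes`, `candidate_parasites`), the pseudocount, the 2-fold threshold, and the toy read counts are all assumptions chosen for illustration, and the 7-mer strings are arbitrary placeholders rather than sequences reported in the paper.

```python
from collections import Counter


def fold_changes(naive, amplified, pseudocount=0.5):
    """Return {sequence: frequency after amplification / frequency before}.

    A pseudocount (an assumption of this sketch) keeps the ratio finite
    for sequences that are missing from one of the two samples.
    """
    naive_total = sum(naive.values())
    amp_total = sum(amplified.values())
    folds = {}
    for seq in set(naive) | set(amplified):
        before = (naive.get(seq, 0) + pseudocount) / naive_total
        after = (amplified.get(seq, 0) + pseudocount) / amp_total
        folds[seq] = after / before
    return folds


def candidate_parasites(naive, replicates, min_fold=2.0):
    """Sequences enriched at least min_fold in every amplification replicate."""
    enriched_sets = [
        {seq for seq, fold in fold_changes(naive, rep).items() if fold >= min_fold}
        for rep in replicates
    ]
    return set.intersection(*enriched_sets) if enriched_sets else set()


# Toy read counts (hypothetical): a naive library and two amplified replicates.
naive = Counter({"ACDEFGH": 20, "KLMNPQR": 20, "STVWYAC": 20,
                 "DEFGHIK": 20, "LMNPQRS": 20})
rep1 = Counter({"ACDEFGH": 45, "KLMNPQR": 45, "STVWYAC": 4,
                "DEFGHIK": 3, "LMNPQRS": 3})
rep2 = Counter({"ACDEFGH": 50, "KLMNPQR": 5, "STVWYAC": 42, "DEFGHIK": 3})

print(candidate_parasites(naive, [rep1, rep2]))  # {'ACDEFGH'}
```

Requiring enrichment in every replicate is one simple way to separate reproducible amplification advantage from sampling noise: in the toy data, two sequences pass the threshold in a single replicate each, and only one passes in both.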