Restriction-site associated DNA sequencing (RAD) has emerged as a powerful marker system for studying genome-wide DNA polymorphisms using next-generation sequencing. A recent technical refinement of RAD is double-digest RAD (ddRAD), which uses two restriction enzymes for library preparation. The more flexible and balanced ddRAD allows analysis of genomic loci in hundreds of individuals. However, in contrast to paired-end sequencing of traditional RAD libraries, PCR duplicates cannot be detected with ddRAD. This is a concern because duplicates can contribute substantially to read coverage and erroneously inflate the proportion of homozygous loci (allele dropout). Allele dropout can bias population genetic parameter inference and complicate the detection of outlier loci under selection. Here we outline a straightforward approach to detecting PCR duplicates in ddRAD libraries. Our approach introduces a degenerate base region (DBR; 12,288 unique combinations) into the sequencing adapter. Simulations demonstrate the high efficiency and low false-positive rate of this approach. In addition, we performed a pilot study to test the approach on six aquatic invertebrates, sequenced on a HiSeq 2500 sequencer. PCR duplicates made up 33.48% of the reads of the ddRAD libraries and were distributed across 19.40% of the loci. A disproportionate number of PCR duplicates was detected in only 4.66% of the loci. While this should not be a concern for general parameter inference, outlier loci detection in particular would be improved by the DBR technique. Given that the technique is also easy to apply in other RAD protocols, we suggest that DBRs should generally be included in PCR-based RAD studies.
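
The idea behind DBR-based duplicate detection can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. It assumes that each read has already been assigned to a locus and that its DBR has been extracted from the adapter; the `(locus_id, dbr, sequence)` tuple layout and both function names are hypothetical. Reads that share locus, DBR, and sequence are treated as copies of one original fragment. The second function estimates how often independent fragments would share a DBR purely by chance, assuming all 12,288 DBR sequences are equally likely, which is a simplification.

```python
from collections import Counter

# Number of unique DBR sequences, as stated in the abstract.
DBR_COMBINATIONS = 12_288


def flag_pcr_duplicates(reads):
    """Split reads into unique fragments and presumed PCR duplicates.

    reads: iterable of (locus_id, dbr, sequence) tuples (hypothetical layout).
    Reads with identical locus, DBR, and sequence are assumed to derive from
    the same template molecule; the first is kept, the rest are counted.
    """
    seen = Counter()
    unique = []
    for read in reads:
        seen[read] += 1
        if seen[read] == 1:
            unique.append(read)
    duplicates = sum(count - 1 for count in seen.values())
    return unique, duplicates


def chance_collision_probability(n_fragments, n_dbr=DBR_COMBINATIONS):
    """Probability that at least two of n independent fragments share a DBR.

    Birthday-problem calculation under the assumption that every DBR
    sequence is equally likely.
    """
    p_all_distinct = 1.0
    for i in range(n_fragments):
        p_all_distinct *= (n_dbr - i) / n_dbr
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    example_reads = [
        ("locus1", "ACGTAC", "TTGACA"),
        ("locus1", "ACGTAC", "TTGACA"),  # same DBR and sequence: presumed duplicate
        ("locus1", "GGTCAA", "TTGACA"),  # different DBR: independent fragment
        ("locus2", "ACGTAC", "CCGTTA"),  # different locus: independent fragment
    ]
    unique, duplicates = flag_pcr_duplicates(example_reads)
    print(len(unique), "unique reads,", duplicates, "presumed PCR duplicates")
    print("chance collision among 20 fragments:", chance_collision_probability(20))
```

Keying on the full tuple is the simplest choice but would miss duplicates that differ only by a sequencing error; a real workflow might instead group by locus and DBR alone and then reconcile the sequences within each group.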