Polymerase chain reaction (PCR) and the various barcoding methods commonly used for plant identification from metagenomic samples are based on the amplification of a limited number of pre-selected barcoding regions. These methods are often inapplicable because of DNA degradation, low amplification success or the low species-discriminative power of the selected genomic regions. Here we introduce a method for the rapid identification of plant taxon-specific k-mers that can be used for the fast detection of plant taxa directly from raw sequencing reads, without aligning, mapping or assembling the reads. Using the developed method, we identified more than 800 Solanum lycopersicum-specific k-mers (32 nucleotides in length) from 42 different chloroplast genome regions. We demonstrated that the identified k-mers are also detectable in raw whole-genome sequencing reads from S. lycopersicum. We also demonstrated the usability of taxon-specific k-mers in artificial mixtures of sequences from closely related species. The developed method offers a novel strategy for the fast identification of taxon-specific genome regions and opens new perspectives for the detection of plant taxa directly from raw sequencing reads.
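The general idea of taxon-specific k-mers can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`canonical_kmers`, `specific_kmers`, `count_reads_with_hits`) are hypothetical, the use of canonical (strand-independent) k-mers is an assumption, and plain in-memory sets stand in for whatever k-mer tooling the published method relies on. The sketch keeps k-mers that occur in the target sequences but in none of the non-target sequences, then scans raw reads for exact matches without any alignment or assembly.

```python
# Illustrative sketch only; names and design choices are assumptions,
# not the published method's code.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k=32):
    """Return the set of canonical k-mers of a sequence.

    A canonical k-mer is the lexicographically smaller of a k-mer and
    its reverse complement, so both strands map to the same key.
    K-mers containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    result = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            result.add(min(kmer, reverse_complement(kmer)))
    return result


def specific_kmers(target_seqs, background_seqs, k=32):
    """K-mers found in any target sequence and in no background sequence."""
    target = set()
    for seq in target_seqs:
        target |= canonical_kmers(seq, k)
    for seq in background_seqs:
        target -= canonical_kmers(seq, k)
    return target


def count_reads_with_hits(reads, kmer_set, k=32):
    """Count reads that contain at least one k-mer from kmer_set."""
    return sum(1 for read in reads if canonical_kmers(read, k) & kmer_set)
```

In practice the target sequences would be chloroplast genomes of the taxon of interest and the background would be chloroplast genomes of other taxa. For whole-genome data, the in-memory sets used here would likely need to be replaced by a dedicated k-mer counting tool.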
CITATION STYLE
Raime, K., & Remm, M. (2018). Method for the identification of taxon-specific k-mers from chloroplast genome: A case study on tomato plant (Solanum lycopersicum). Frontiers in Plant Science, 9. https://doi.org/10.3389/fpls.2018.00006