A DNA Barcoding system integrating multigene sequence data

  • Chesters D
  • Zheng W
  • Zhu C


* A number of systems have been developed for taxonomic identification of DNA sequence data. However, in eukaryotes, these systems are largely based on single predefined genes, and thus are vulnerable to biases from limited character sampling, and are not able to identify most sequences of genomic origin.

* We here demonstrate an implementation for multigene DNA barcoding. First, a reference framework is built of frequently sequenced loci. Query sequence data are then organized by excising sequences homologous to references and assigning species names where the level of sequence similarity between query and reference falls within the (gene-appropriate) level of intraspecific variation usually observed. The approach is compared to some existing methods including ‘bagpipe_phylo’, a re-implementation for taxonomic assignment on phylogenies.

* Seventy-eight per cent of the species and 94% of the genera known to be present in arthropod test queries were correctly inferred by the proposed multigene system. Most critically, the rate of species identification was increased over using a COI-only approach. Twenty-four per cent of species in the queries were found only in non-COI genes, with no clear reduction in the accuracy of species assignment at many of these other loci. Similarly, additional species assignments were made for a pooled metagenomic data set using non-COI columns. On a smaller query data set of 273 bee sequences, the accuracy of species assignment using modified calculation of distances was indistinguishable from phylogeny-based taxonomic identification.

* Standardized single fragment DNA barcoding remains an invaluable tool in species identification for PCR-generated sequence data. The approach developed here supplements the established species-dense DNA barcode backbone with other genomic data, reducing error via integration of independent genetic loci and permitting additional identifications for non-barcode fragments. The latter will be particularly relevant in monitoring of community genomics using next-generation sequencing platforms.
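
The assignment rule described in the abstract (name a query when its similarity to a reference falls within the intraspecific variation typical of that gene) can be illustrated with a minimal sketch. This is not the authors' implementation. The threshold values, the `INTRASPECIFIC_MAX` table, the function names, and the toy sequences are all hypothetical, and an uncorrected p-distance on pre-aligned sequences stands in for whatever distance calculation the paper actually uses.

```python
# Hypothetical per-gene cut-offs: largest uncorrected p-distance still
# treated as intraspecific. Real values would be estimated per locus.
INTRASPECIFIC_MAX = {"COI": 0.02, "28S": 0.005}

# Toy reference framework: (species, gene, aligned sequence). Invented data.
REFERENCES = [
    ("Apis mellifera", "COI", "ACGTACGTACGTACGTACGT"),
    ("Bombus terrestris", "COI", "ACGTTCGAACGTTCGAACGT"),
    ("Apis mellifera", "28S", "GGGGCCCCGGGGCCCCGGGG"),
]


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Only positions where both sequences carry A, C, G or T are compared;
    gaps and ambiguity codes are skipped. Returns None if nothing overlaps.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                diffs += 1
    if compared == 0:
        return None
    return diffs / compared


def assign_species(query, gene, references=REFERENCES):
    """Return (species, distance) for the closest same-gene reference whose
    distance is within the gene's cut-off, or None if no reference qualifies."""
    threshold = INTRASPECIFIC_MAX.get(gene)
    if threshold is None:
        return None  # no cut-off known for this locus
    best = None
    for species, ref_gene, ref_seq in references:
        if ref_gene != gene:
            continue
        d = p_distance(query, ref_seq)
        if d is None or d > threshold:
            continue
        if best is None or d < best[1]:
            best = (species, d)
    return best
```

Calling `assign_species(query, "COI")` on an aligned query returns the best-matching species and its distance, or `None` when every reference for that gene lies outside the cut-off. Keeping one threshold per gene is what lets non-COI fragments be named by the same rule as COI barcodes.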

Author-supplied keywords

  • Biodiversity monitoring
  • Metagenomics
  • Species clustering

Authors


  • Douglas Chesters

  • Wei Min Zheng

  • Chao Dong Zhu
