Abstract
Summary: Identifying distinctive taxa for microbiome-related diseases is considered key to establishing diagnosis and therapy options in precision medicine, and it places high demands on the accuracy of microbiome analysis techniques. We propose an alignment- and reference-free, subsequence-based analysis of 16S rRNA data as a new paradigm for microbiome phenotype and biomarker detection. Our method, called DiTaxa, replaces standard operational taxonomic unit (OTU) clustering by segmenting 16S rRNA reads into the most frequent variable-length subsequences. We compared the performance of DiTaxa with state-of-the-art methods in phenotype and biomarker detection, using human-associated 16S rRNA samples for periodontal disease, rheumatoid arthritis and inflammatory bowel diseases, as well as a synthetic benchmark dataset. DiTaxa performed competitively with the k-mer-based state-of-the-art approach in phenotype prediction, while outperforming the OTU-based state-of-the-art approach in finding biomarkers, in both resolution and coverage, as evaluated over known links from the literature and synthetic benchmark datasets.
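The idea of segmenting reads into frequent variable-length subsequences can be illustrated with a generic byte-pair-encoding style procedure applied to nucleotides. The sketch below is not the DiTaxa implementation: the function names (`learn_merges`, `merge_pair`, `segment`), the tie-breaking rule, and the toy reads are all assumptions made for illustration, and a real pipeline would train on a large sample of reads and handle quality filtering separately.

```python
from collections import Counter


def merge_pair(tokens, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged token."""
    out = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def learn_merges(reads, num_merges):
    """Learn up to `num_merges` merge rules, most frequent adjacent pair first.

    Hypothetical helper: starts from single nucleotides and repeatedly
    merges the most frequent adjacent token pair across all reads.
    """
    seqs = [list(r) for r in reads]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for s in seqs:
            pairs.update(zip(s, s[1:]))
        if not pairs:
            break
        # Assumed tie-break: highest count, then lexicographic order of the pair.
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        seqs = [merge_pair(s, best) for s in seqs]
    return merges


def segment(read, merges):
    """Segment one read by applying the learned merges in training order."""
    tokens = list(read)
    for pair in merges:
        tokens = merge_pair(tokens, pair)
    return tokens


# Toy reads, for illustration only.
toy_reads = ["ACGTACGT", "ACGTTT"]
toy_merges = learn_merges(toy_reads, 2)
toy_segments = segment("ACGTAC", toy_merges)
```

Concatenating the returned segments always reproduces the input read, so the segmentation is lossless. The resulting variable-length tokens could then be counted per sample and used as features, in place of OTU abundances, for a downstream classifier.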
Asgari, E., Münch, P. C., Lesker, T. R., McHardy, A. C., & Mofrad, M. R. K. (2019). DiTaxa: Nucleotide-pair encoding of 16S rRNA for host phenotype and biomarker detection. Bioinformatics, 35(14), 2498–2500. https://doi.org/10.1093/bioinformatics/bty954