Abstract
Summary: Removing duplicate and near-duplicate reads generated by high-throughput sequencing technologies can reduce the computational resources required by downstream applications. Here we develop minirmd, a de novo tool that removes duplicate reads via multiple rounds of clustering using minimizers of different lengths. Experiments demonstrate that minirmd removes more near-duplicate reads than existing clustering approaches and is faster than existing multi-core tools. To the best of our knowledge, minirmd is the first tool to remove near-duplicates on the reverse-complementary strand.
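The idea of clustering by minimizers over several rounds can be illustrated with a minimal Python sketch. This is not minirmd's implementation: the function names, the lexicographic minimizer ordering, the Hamming-distance test, and the parameters `ks` and `max_mismatches` are all assumptions chosen for illustration. Reads are bucketed by a strand-independent minimizer, compared only within a bucket, and the survivors are re-bucketed with a different minimizer length so that near-duplicates whose mismatch fell inside the earlier minimizer can still meet.

```python
# Illustrative sketch only; names and parameters are hypothetical, not minirmd's API.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse-complementary strand of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def minimizer(seq, k):
    """Lexicographically smallest k-mer of seq (seq must be at least k long)."""
    return min(seq[i:i + k] for i in range(len(seq) - k + 1))


def canonical_minimizer(seq, k):
    """Smallest k-mer over both strands, so a read and its reverse
    complement always receive the same bucket key."""
    return min(minimizer(seq, k), minimizer(reverse_complement(seq), k))


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def remove_near_duplicates(reads, ks, max_mismatches=1):
    """Keep the first read of each group of near-duplicates.

    One clustering round is run per minimizer length in ks (assumed values).
    Within a round, a read is compared only with kept reads in its bucket,
    on both the forward and the reverse-complementary strand.
    """
    kept = list(reads)
    for k in ks:
        buckets = {}
        survivors = []
        for read in kept:
            rc = reverse_complement(read)
            bucket = buckets.setdefault(canonical_minimizer(read, k), [])
            is_duplicate = any(
                len(other) == len(read)
                and min(hamming(read, other), hamming(rc, other)) <= max_mismatches
                for other in bucket
            )
            if not is_duplicate:
                bucket.append(read)
                survivors.append(read)
        kept = survivors
    return kept
```

In this sketch a single round can miss a near-duplicate whose mismatch alters the minimizer itself, because the two reads then land in different buckets. A second round with a different minimizer length gives such pairs another chance to share a bucket.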
Cite
Liu, Y., Zhang, X., Zou, Q., & Zeng, X. (2021). Minirmd: accurate and fast duplicate removal tool for short reads via multiple minimizers. Bioinformatics, 37(11), 1604–1606. https://doi.org/10.1093/bioinformatics/btaa915