Sigmoni: classification of nanopore signal with a compressed pangenome index

Vikram S. Shivakumar; Omar Y. Ahmed; Sam Kovaka; Mohsen Zakeri; Ben Langmead

Journal Article, Open Access

Bioinformatics (2024) 40 i287-i296

DOI: 10.1093/bioinformatics/btae213

Abstract

Summary: Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. However, past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni, a rapid, multiclass classification method based on the r-index that scales to references of hundreds of gigabase pairs (Gbp). Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges. It performs rapid, approximate matching using matching statistics, classifying reads based on distributions of picoamp matching statistics and co-linearity statistics, all in linear query time without the need for seed-chain-extend. Sigmoni is 10–100× faster than previous methods for adaptive sampling in host depletion experiments, with improved accuracy, and can query reads against large microbial or human pangenomes. Sigmoni is the first signal-based tool to scale to a complete human genome and pangenome while remaining fast enough for adaptive sampling applications.
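The two ideas named in the summary, quantizing signal into picoamp bins and scoring a read by its matching statistics, can be illustrated with a minimal sketch. This is not Sigmoni's implementation: the function names, the bin range (60 to 120 pA), and the bin count are hypothetical, and the matching statistics are computed by brute-force substring search rather than with an r-index, so the sketch does not have the linear query time described above.

```python
def quantize(signal_pa, lo=60.0, hi=120.0, n_bins=6):
    """Map each picoamp value to an integer bin in [0, n_bins).

    lo, hi and n_bins are illustrative assumptions, not Sigmoni's settings.
    Values outside [lo, hi) are clamped to the first or last bin.
    """
    width = (hi - lo) / n_bins
    bins = []
    for x in signal_pa:
        b = int((x - lo) // width)
        bins.append(min(max(b, 0), n_bins - 1))
    return bins


def _encode(symbols):
    # Turn a list of small integers into a string so that Python's
    # substring search can be used (supports up to 26 bins).
    return "".join(chr(ord("a") + s) for s in symbols)


def matching_statistics(query, reference):
    """MS[i] = length of the longest prefix of query[i:] that occurs
    anywhere in reference. Naive version for illustration only."""
    q, ref = _encode(query), _encode(reference)
    ms = []
    for i in range(len(q)):
        length = 0
        while i + length < len(q) and q[i:i + length + 1] in ref:
            length += 1
        ms.append(length)
    return ms


if __name__ == "__main__":
    # Hypothetical picoamp traces for a reference and a read.
    reference_bins = quantize([62.0, 71.5, 88.0, 93.0, 104.0, 117.0])
    read_bins = quantize([72.0, 89.0, 91.0, 61.0, 75.0])
    print(reference_bins)
    print(read_bins)
    print(matching_statistics(read_bins, reference_bins))
```

Long matching statistics indicate that stretches of the quantized read also occur in the quantized reference. A classifier in this spirit would compare the distribution of these lengths across candidate classes.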

Cite

Shivakumar, V. S., Ahmed, O. Y., Kovaka, S., Zakeri, M., & Langmead, B. (2024). Sigmoni: classification of nanopore signal with a compressed pangenome index. Bioinformatics, 40, i287–i296. https://doi.org/10.1093/bioinformatics/btae213
