Streamlining remote nanopore data access with slow5curl

2Citations
Citations of this article
10Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: As adoption of nanopore sequencing technology continues to advance, the need to maintain large volumes of raw current signal data for reanalysis with updated algorithms is a growing challenge. Here we introduce slow5curl, a software package designed to streamline nanopore data sharing, accessibility, and reanalysis. Results: Slow5curl allows a user to fetch a specified read or group of reads from a raw nanopore dataset stored on a remote server, such as a public data repository, without downloading the entire file. Slow5curl uses an index to quickly fetch specific reads from a large dataset in SLOW5/BLOW5 format and highly parallelized data access requests to maximize download speeds. Using all public nanopore data from the Human Pangenome Reference Consortium (>22 TB), we demonstrate how slow5curl can be used to quickly fetch and reanalyze raw signal reads corresponding to a set of target genes from each individual in large cohort dataset (n = 91), minimizing the time, egress costs, and local storage requirements for their reanalysis. Conclusions: We provide slow5curl as a free, open-source package that will reduce frictions in data sharing for the nanopore community: https://github.com/BonsonW/slow5curl.

Cite

CITATION STYLE

APA

Wong, B., Ferguson, J. M., Do, J. Y., Gamaarachchi, H., & Deveson, I. W. (2024). Streamlining remote nanopore data access with slow5curl. GigaScience, 13. https://doi.org/10.1093/gigascience/giae016

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free