KARAJ: An Efficient Adaptive Multi-Processor Tool to Streamline Genomic and Transcriptomic Sequence Data Acquisition

Mahdieh Labani; Amin Beheshti; Nigel H. Lovell; Hamid Alinejad-Rokny; Ali Afrasiabi

Journal Article (Open Access)

International Journal of Molecular Sciences (2022) 23(22)

DOI: 10.3390/ijms232214418

Abstract

We developed KARAJ, a fast and flexible Linux command-line tool that automates the end-to-end process of querying and downloading a wide range of genomic and transcriptomic sequence data types. KARAJ takes as input a list of PMCIDs, publication URLs, or various types of accession numbers, and automates four tasks. First, it provides a summary list of the accessible datasets generated by or used in the corresponding scientific articles, enabling users to select appropriate datasets. Second, it calculates the size of the files that users want to download and confirms that adequate space is available on the local disk. Third, it generates a metadata table containing sample information and the experimental design of the corresponding study. Finally, it enables users to download supplementary data tables attached to publications. In addition, KARAJ provides a parallel downloading framework powered by Aspera Connect, which significantly reduces download time.
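The disk-space check and the parallel download step described in the abstract can be illustrated with a minimal Python sketch. This is not KARAJ's code or interface: the function names `has_enough_space` and `fetch_all`, the `free_bytes` override, and the use of a thread pool in place of Aspera Connect are all assumptions made for illustration.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor


def has_enough_space(file_sizes_bytes, target_dir=".", free_bytes=None):
    """Compare the total size of the requested files with free disk space.

    Hypothetical helper, not part of KARAJ. `free_bytes` can be passed
    explicitly (e.g. in tests); otherwise it is read from the filesystem
    that holds `target_dir`.
    Returns a tuple (ok, required_bytes, free_bytes).
    """
    required = sum(file_sizes_bytes)
    if free_bytes is None:
        free_bytes = shutil.disk_usage(target_dir).free
    return required <= free_bytes, required, free_bytes


def fetch_all(accessions, fetch_one, max_workers=4):
    """Run `fetch_one` on every accession concurrently.

    Hypothetical helper: `fetch_one` stands in for whatever performs a
    single download (KARAJ itself relies on Aspera Connect; a plain
    thread pool is swapped in here). Returns {accession: result}.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each accession
        # with its own result.
        return dict(zip(accessions, pool.map(fetch_one, accessions)))
```

In a workflow of this shape, the space check would run before any transfer starts, so that a download which cannot fit on the local disk is rejected up front rather than failing partway through.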

Cite

Labani, M., Beheshti, A., Lovell, N. H., Alinejad-Rokny, H., & Afrasiabi, A. (2022). KARAJ: An Efficient Adaptive Multi-Processor Tool to Streamline Genomic and Transcriptomic Sequence Data Acquisition. International Journal of Molecular Sciences, 23(22). https://doi.org/10.3390/ijms232214418
