High-throughput sequencing is becoming a popular research tool, but it carries considerable costs in computation time, data storage and bandwidth. Meanwhile, some research applications that focus on individual genes or pathways do not require processing a full sequencing dataset. It is therefore desirable to partition a large dataset into smaller, manageable and relevant pieces. We present a toolkit for partitioning raw sequencing data that includes a method for extracting reads that are likely to map onto pre-defined regions of interest. We show that the method can be used to extract information about genes of interest from DNA or RNA sequencing samples in a fraction of the time and disk space required to process and store a full dataset. We report speedup factors between 2.6 and 96, depending on the settings and samples used. The software is available at http://www.sourceforge.net/projects/triagetools/. © 2013 The Author(s).
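To make the idea of extracting reads that are likely to map onto regions of interest more concrete, the following is a minimal sketch of one possible approach. The abstract does not state the selection mechanism, so the k-mer sharing criterion used here is an assumption, and every name (`build_index`, `triage`, `min_hits`) is hypothetical rather than part of the TriageTools interface.

```python
from typing import Iterable, Iterator, Set, Tuple

Read = Tuple[str, str]  # (read name, nucleotide sequence)

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(regions: Iterable[str], k: int) -> Set[str]:
    """Collect k-mers from both strands of every region of interest."""
    index: Set[str] = set()
    for region in regions:
        index |= kmers(region, k)
        index |= kmers(reverse_complement(region), k)
    return index


def triage(reads: Iterable[Read], index: Set[str], k: int,
           min_hits: int = 1) -> Iterator[Read]:
    """Yield reads sharing at least min_hits distinct k-mers with the index.

    Assumption: k-mer sharing stands in for "likely to map"; no alignment
    is performed, so some false positives and negatives are expected.
    """
    for name, seq in reads:
        if len(kmers(seq, k) & index) >= min_hits:
            yield name, seq
```

In such a scheme, only the reads yielded by `triage` would be passed to a full aligner, which is where savings in time and disk space could arise. Larger `k` or `min_hits` values would make the filter stricter at the cost of missing reads that carry variants or sequencing errors.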
CITATION
Fimereli, D., Detours, V., & Konopka, T. (2013). TriageTools: Tools for partitioning and prioritizing analysis of high-throughput sequencing data. Nucleic Acids Research, 41(7). https://doi.org/10.1093/nar/gkt094