Processing genome scale tabular data with wormtable

5Citations
Citations of this article
33Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: Modern biological science generates a vast amount of data, the analysis of which presents a major challenge to researchers. Data are commonly represented in tables stored as plain text files and require line-by-line parsing for analysis, which is time consuming and error prone. Furthermore, there is no simple means of indexing these files so that rows containing particular values can be quickly found. Results: We introduce a new data format and software library called wormtable, which provides efficient access to tabular data in Python. Wormtable stores data in a compact binary format, provides random access to rows, and enables sophisticated indexing on columns within these tables. Files written in existing formats can be easily converted to wormtable format, and we provide conversion utilities for the VCF and GTF formats.Conclusions: Wormtable's simple API allows users to process large tables orders of magnitude more quickly than is possible when parsing text. Furthermore, the indexing facilities provide efficient access to subsets of the data along with providing useful methods of summarising columns. Since third-party libraries or custom code are no longer needed to parse complex plain text formats, analysis code can also be substantially simpler as well as being uniform across different data formats. These benefits of reduced code complexity and greatly increased performance allow users much greater freedom to explore their data. © 2013 Kelleher et al.; licensee BioMed Central Ltd.

Cite

CITATION STYLE

APA

Kelleher, J., Ness, R. W., & Halligan, D. L. (2013). Processing genome scale tabular data with wormtable. BMC Bioinformatics, 14(1). https://doi.org/10.1186/1471-2105-14-356

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free