Efficiently scaling genomic variant search indexes to thousands of samples is computationally challenging because multiple coordinate systems are needed to avoid reference bias. We present VariantStore, a system that indexes genomic variants from multiple samples using a variation graph and supports variant queries in any sample-specific coordinate system. We demonstrate the scalability of VariantStore by indexing genomic variants from the TCGA project in 4 h and from the 1000 Genomes Project in 3 h. Querying for variants in a gene takes between 0.002 and 3 s and uses memory equal to only 10% of the size of the full representation.
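To make the idea of a sample-specific coordinate system concrete, here is a minimal Python sketch. It is not VariantStore's implementation: the abstract describes a variation graph, whereas this illustration assumes a flat list of VCF-style variants. The names `Variant`, `ref_to_sample_pos`, and `variants_in_region` are hypothetical. The sketch shows only how insertions and deletions carried by a sample shift its coordinates relative to the reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical VCF-style record (not VariantStore's data model)."""
    ref_pos: int        # 0-based start of the variant on the reference
    ref: str            # reference allele (at least one base)
    alt: str            # alternate allele
    samples: frozenset  # names of the samples carrying this variant


def ref_to_sample_pos(variants, sample, ref_pos):
    """Map a reference position to the sample's own coordinate.

    Each variant carried by the sample that ends at or before ref_pos
    shifts later positions by len(alt) - len(ref).
    """
    offset = 0
    for v in variants:
        if sample in v.samples and v.ref_pos + len(v.ref) <= ref_pos:
            offset += len(v.alt) - len(v.ref)
    return ref_pos + offset


def variants_in_region(variants, sample, start, end):
    """Return the sample's variants overlapping [start, end) on the reference."""
    return [
        v for v in variants
        if sample in v.samples
        and v.ref_pos < end
        and v.ref_pos + len(v.ref) > start
    ]


# Toy data: one insertion, one deletion, one substitution.
insertion = Variant(2, "A", "ATT", frozenset({"s1"}))
deletion = Variant(5, "GCC", "G", frozenset({"s2"}))
substitution = Variant(8, "T", "C", frozenset({"s1", "s2"}))
toy_variants = [insertion, deletion, substitution]

pos_s1 = ref_to_sample_pos(toy_variants, "s1", 10)  # shifted right by the insertion
pos_s2 = ref_to_sample_pos(toy_variants, "s2", 10)  # shifted left by the deletion
hits_s1 = variants_in_region(toy_variants, "s1", 0, 6)
```

A linear scan like this costs time proportional to the number of variants for every lookup, so it would not scale to thousands of samples. It is meant only to illustrate why one reference position corresponds to different positions in different samples.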
Pandey, P., Gao, Y., & Kingsford, C. (2021). VariantStore: an index for large-scale genomic variant search. Genome Biology, 22(1). https://doi.org/10.1186/s13059-021-02442-8