BGData - A suite of R packages for genomic analysis with big data


Abstract

We created a suite of R packages that enables the analysis of extremely large genomic data sets (potentially millions of individuals and millions of molecular markers) within the R environment. The suite offers: a matrix-like interface for .bed files (PLINK’s binary format for genotype data); a novel class of linked arrays that links data stored in multiple files to form a single array accessible from the R computing environment; methods for parallel computing that carry out computations on very large data sets without loading the entire data set into memory; and a basic set of methods for statistical genetic analyses. The packages are accessible through CRAN and GitHub. In this note, we describe the classes and methods implemented in each of the packages that make up the suite and illustrate their use with data from the UK Biobank.
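The two central ideas in the abstract, linking several blocks of data into one array and computing over it in chunks so that only a slice is handled at a time, can be illustrated with a small conceptual sketch. It is written in plain Python rather than R and does not reproduce the API of the packages: the names `ColumnLinkedArray` and `chunked_column_means` are hypothetical, small in-memory lists stand in for file-backed genotype blocks, and `None` marks a missing genotype.

```python
from bisect import bisect_right
from itertools import accumulate


class ColumnLinkedArray:
    """Hypothetical sketch: present several column blocks as one array.

    Each block is a list of rows; all blocks must have the same number
    of rows (individuals) and may differ in number of columns (markers).
    """

    def __init__(self, *blocks):
        row_counts = {len(block) for block in blocks}
        if len(row_counts) != 1:
            raise ValueError("all blocks must have the same number of rows")
        self.blocks = blocks
        self.n_rows = row_counts.pop()
        widths = [len(block[0]) for block in blocks]
        # offsets[i] is the global index of the first column of block i
        self.offsets = [0] + list(accumulate(widths))
        self.n_cols = self.offsets[-1]

    def column(self, j):
        """Return global column j, fetched from whichever block holds it."""
        if not 0 <= j < self.n_cols:
            raise IndexError("column index out of range")
        k = bisect_right(self.offsets, j) - 1
        local = j - self.offsets[k]
        return [row[local] for row in self.blocks[k]]


def chunked_column_means(array, chunk_size):
    """Compute per-column means, visiting chunk_size columns at a time."""
    means = []
    for start in range(0, array.n_cols, chunk_size):
        stop = min(start + chunk_size, array.n_cols)
        for j in range(start, stop):
            observed = [g for g in array.column(j) if g is not None]
            means.append(sum(observed) / len(observed) if observed else None)
    return means


# Two toy genotype blocks (3 individuals each), standing in for two files.
block_a = [[0, 1], [2, 1], [1, None]]
block_b = [[2, 0, 1], [2, 0, 1], [2, 1, 1]]

linked = ColumnLinkedArray(block_a, block_b)
means = chunked_column_means(linked, chunk_size=2)
```

In this sketch the chunks are processed sequentially; because each chunk is independent of the others, the same loop could in principle be distributed across workers, which is the kind of computation the abstract attributes to the suite's parallel methods.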

Grueneberg, A., & de los Campos, G. (2019). BGData - A suite of R packages for genomic analysis with big data. G3: Genes, Genomes, Genetics, 9(5), 1377–1383. https://doi.org/10.1534/g3.119.400018
