MultiDataSet: An R package for encapsulating multiple data sets with application to omic data integration

This article is free to access.

Abstract

Background: The falling cost of genomic assays has generated large amounts of biomedical data. As a result, current studies often perform multiple experiments on the same subjects. While the methods and classes implemented in different Bioconductor packages manage individual experiments, there is no standard class for managing different omic data sets from the same subjects. In addition, most R/Bioconductor packages designed to integrate and visualize biological data rely on basic data structures that lack clear general methods, such as subsetting features or selecting samples.

Results: To meet this need, we have developed MultiDataSet, a new R class based on Bioconductor standards and designed to encapsulate multiple data sets. MultiDataSet handles the usual difficulties of managing multiple and incomplete data sets while offering a simple and general way of subsetting features and selecting samples. We illustrate the use of MultiDataSet in three common situations: 1) performing integration analysis with third-party packages; 2) creating new methods and functions for omic data integration; 3) encapsulating new, not yet implemented data from any biological experiment.

Conclusions: MultiDataSet is a suitable class for data integration within the R and Bioconductor framework.
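To make the idea of selecting samples across incomplete data sets concrete, the following is a minimal base-R sketch. It is not the MultiDataSet class or its API, which the abstract does not show. A plain named list stands in for the container, and the toy matrices and the helper names `common_samples` and `select_samples` are hypothetical, introduced only for illustration.

```r
# Hypothetical toy data: two omic tables measured on overlapping subjects.
# Rows are features and columns are samples.
expr <- matrix(c(5.1, 6.3, 4.8, 7.2, 5.5, 6.1), nrow = 2,
               dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))
methy <- matrix(c(0.2, 0.8, 0.4, 0.6, 0.9, 0.1), nrow = 2,
                dimnames = list(c("cpg1", "cpg2"), c("s2", "s3", "s4")))

# A plain named list stands in for the container (assumption: this is
# NOT the MultiDataSet class, only an illustration of the concept).
omics <- list(expression = expr, methylation = methy)

# Keep only the samples present in every table, in a shared order.
common_samples <- function(tables) {
  ids <- Reduce(intersect, lapply(tables, colnames))
  lapply(tables, function(m) m[, ids, drop = FALSE])
}

# Select the requested samples in each table, tolerating samples that
# are missing from some tables (the "incomplete data" case).
select_samples <- function(tables, ids) {
  lapply(tables, function(m) m[, intersect(ids, colnames(m)), drop = FALSE])
}

complete <- common_samples(omics)
picked <- select_samples(omics, c("s1", "s2"))
```

Here `complete` holds both tables restricted to the subjects measured in every experiment, while `picked` keeps whichever of the requested subjects each table actually contains. A dedicated class would add validation and feature annotation on top of this kind of bookkeeping.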

Cite

Hernandez-Ferrer, C., Ruiz-Arenas, C., Beltran-Gomila, A., & González, J. R. (2017). MultiDataSet: An R package for encapsulating multiple data sets with application to omic data integration. BMC Bioinformatics, 18(1). https://doi.org/10.1186/s12859-016-1455-1
