Efficient data reduction for large-scale genetic mapping


Abstract

We present a fast and accurate algorithm for reducing large-scale genetic marker data to a smaller, less noisy, and more complete set of bins, each representing a uniquely identifiable location on a chromosome. Our experimental results on real and synthetic data show that the algorithm runs in near-linear time, allowing for the analysis of millions of markers. It reduces the problem scale while preserving accuracy, making it feasible to use existing genetic mapping tools without resorting to complex, time-intensive pre-processing methods to filter or sample the original data set. Our approach also decreases the uncertainty in genotype calls, improving the quality of the data. Preliminary results demonstrate that existing marker-ordering methods designed for small-scale settings perform with equivalent accuracy when given our reduced bin set as input.
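To make the idea of a "bin" concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes each marker is a string of genotype calls (one character per individual, with `-` marking a missing call), greedily groups markers whose calls agree with a bin's consensus, and reports one majority-vote consensus per bin. The names `bin_markers`, `consensus`, `hamming`, and the `max_mismatch` parameter are hypothetical, and this naive version compares every marker against every bin, so it does not reproduce the near-linear running time reported above.

```python
from collections import Counter

MISSING = "-"  # assumed symbol for a missing genotype call


def hamming(a, b):
    """Count positions where both calls are present and disagree."""
    return sum(1 for x, y in zip(a, b)
               if x != MISSING and y != MISSING and x != y)


def consensus(rows):
    """Majority vote per individual, ignoring missing calls."""
    out = []
    for column in zip(*rows):
        calls = [c for c in column if c != MISSING]
        out.append(Counter(calls).most_common(1)[0][0] if calls else MISSING)
    return "".join(out)


def bin_markers(markers, max_mismatch=0):
    """Greedy single pass: each marker joins the first bin whose
    consensus it matches within max_mismatch, else starts a new bin."""
    bins = []  # each bin is a list of genotype strings
    for genotype in markers:
        for members in bins:
            if hamming(genotype, consensus(members)) <= max_mismatch:
                members.append(genotype)
                break
        else:
            bins.append([genotype])
    return [consensus(members) for members in bins]


if __name__ == "__main__":
    markers = ["AAB-", "AABB", "A-BB", "BBAA", "BB-A"]
    print(bin_markers(markers))
```

In this toy input, five markers with missing calls collapse into two bins whose consensus strings have no missing calls, which illustrates how binning can yield a smaller and more complete data set. Raising `max_mismatch` lets a marker with an isolated conflicting call join a bin and be outvoted by the majority.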

Cite

Strnadová-Neeley, V., Buluç, A., Chapman, J., Gilbert, J. R., Gonzalez, J., & Oliker, L. (2015). Efficient data reduction for large-scale genetic mapping. In BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 126–135). Association for Computing Machinery, Inc. https://doi.org/10.1145/2808719.2808732
