methyLImp2: faster missing value estimation for DNA methylation data

Anna Plaksienko; Pietro Di Lena; Christine Nardini; Claudia Angelini

Journal Article, Open Access

Bioinformatics (2024) 40(1)

DOI: 10.1093/bioinformatics/btae001

Abstract

Motivation: methyLImp, a method we recently introduced for missing value estimation in DNA methylation data, has demonstrated competitive imputation performance compared with existing general-purpose approaches. However, its running time was considerable, making it infeasible for large datasets with numerous missing values.

Results: methyLImp2 makes possible computations that were previously infeasible. We achieved this by introducing two modifications that significantly reduce the original running time without sacrificing prediction performance. First, we implemented a chromosome-wise parallel version of methyLImp. This parallelization reduced the runtime by a factor of several tens in our experiments. Second, to handle large datasets, we introduced a mini-batch approach that uses only a subset of the samples for the imputation. This further reduces the running time from days to hours or even minutes on large datasets.
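The two ideas named in the abstract, chromosome-wise parallelism and mini-batching over samples, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`impute_chromosome`, `impute_by_chromosome`), the data layout (a list of samples, each a list of beta values with `None` for missing entries) and the parameters are all hypothetical. The per-CpG fill rule is a deliberately simple stand-in (the mean over a random mini-batch of observed samples), used only so that the control flow is runnable; the actual estimator in methyLImp is not reproduced here.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def impute_chromosome(block, batch_size, seed):
    """Impute one chromosome's CpG columns (hypothetical helper).

    block: list of samples; each sample is a list of beta values,
    with None marking a missing entry.
    Stand-in rule (an assumption, not the published estimator): fill each
    missing entry with the mean of a random mini-batch of observed samples.
    """
    rng = random.Random(seed)
    out = [list(row) for row in block]
    n_samples = len(block)
    n_cpgs = len(block[0]) if block else 0
    for j in range(n_cpgs):
        observed = [i for i in range(n_samples) if block[i][j] is not None]
        if not observed or len(observed) == n_samples:
            continue  # nothing to learn from, or nothing to impute
        # Mini-batch: use at most batch_size of the observed samples.
        batch = rng.sample(observed, min(batch_size, len(observed)))
        fill = sum(block[i][j] for i in batch) / len(batch)
        for i in range(n_samples):
            if block[i][j] is None:
                out[i][j] = fill
    return out


def impute_by_chromosome(data, cpg_chrom, batch_size=50, workers=2, seed=0):
    """Split CpG columns by chromosome and impute each group independently."""
    groups = {}
    for j, chrom in enumerate(cpg_chrom):
        groups.setdefault(chrom, []).append(j)

    def run(cols):
        block = [[row[j] for j in cols] for row in data]
        return cols, impute_chromosome(block, batch_size, seed)

    result = [list(row) for row in data]
    # Threads keep the sketch simple; CPU-bound work would normally
    # use a process pool instead.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for cols, block in pool.map(run, groups.values()):
            for i, row in enumerate(block):
                for k, j in enumerate(cols):
                    result[i][j] = row[k]
    return result
```

Because each chromosome's columns are processed independently, the groups can be dispatched to separate workers, and capping the number of samples per CpG bounds the cost of each fit regardless of how many samples the dataset contains.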

Citation (APA)

Plaksienko, A., Di Lena, P., Nardini, C., & Angelini, C. (2024). methyLImp2: faster missing value estimation for DNA methylation data. Bioinformatics, 40(1). https://doi.org/10.1093/bioinformatics/btae001
