Machine learning-driven noise separation in high variation genomics sequencing datasets

Milko Krachunov; Maria Nisheva; Dimitar Vassilev

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11089 LNAI 173-185

DOI: 10.1007/978-3-319-99344-7_16

Abstract

Genomics studies increasingly have to deal with datasets containing high variation between the sequenced nucleotide chains. This is most common in metagenomics and polyploid studies, where the biological nature of the studied samples requires analysis of multiple variants of nearly identical sequences. The high variation makes it more difficult to determine the correct nucleotide sequences and to distinguish signal from noise, producing digital results with higher error rates than those achievable in samples with low variation. This paper presents an original, purely machine learning-based approach for detecting and potentially correcting those errors. It uses a generic machine learning-based model that can be applied to different types of sequencing data with minor modifications. As presented in a separate part of this work, these models can be combined with a data-specific selection of error candidates to which they are applied, for refined error discovery; as shown here, they can also be used independently.

Krachunov, M., Nisheva, M., & Vassilev, D. (2018). Machine learning-driven noise separation in high variation genomics sequencing datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11089 LNAI, pp. 173–185). Springer Verlag. https://doi.org/10.1007/978-3-319-99344-7_16
