Using a novel negative selection inspired anomaly detection algorithm to identify corrupted ribo-seq and RNA-seq samples

Abstract

RNA-seq and Ribo-seq are popular techniques for quantifying cellular transcription and translation. These experiments use next-generation sequencing to produce genome-wide, high-resolution snapshots of the total populations of mRNAs and translating ribosomes within the investigated samples. When performed in concert, these experiments yield valuable information about protein synthesis rates and translational efficiency. Due to their intricate experimental protocols and demanding data processing requirements, quality control and analysis of such experiments are often challenging. Therefore, methods for accurately assessing data quality, and for identifying contaminated samples, are greatly needed. Here we use a novel negative-selection-inspired algorithm, Boundary Detection Using Nearest Neighbors (BDUNN), to identify corrupted samples. Our algorithm constructs a detector set and a reduced training set that together define the boundaries between normal data points and potential anomalies. Subsequently, a nearest neighbor algorithm is used to classify unseen observations. We compare the performance of BDUNN with other popular negative selection and one-class classification algorithms, and show that BDUNN is capable of accurately and efficiently detecting anomalies in standard anomaly detection datasets and in simulated RNA-seq and Ribo-seq datasets. Furthermore, we have implemented our method within an existing R Shiny platform for analyzing RNA-seq and Ribo-seq datasets, which permits downstream analysis of anomalous samples.
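
The abstract gives no implementation details for BDUNN, so the following is only a minimal sketch of the general idea it describes: detectors are generated by negative selection, and unseen points are labeled by their nearest neighbor. It is not the authors' algorithm. The function names, the fixed `radius`, the uniform candidate sampling, and the toy 2-D data are all assumptions made for illustration, and the reduced training set mentioned in the abstract is omitted.

```python
import math
import random


def generate_detectors(normal, n_candidates, radius, bounds, rng):
    """Negative selection (illustrative): keep random candidates that lie
    farther than `radius` from every normal training point."""
    detectors = []
    for _ in range(n_candidates):
        candidate = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        if all(math.dist(candidate, p) > radius for p in normal):
            detectors.append(candidate)
    return detectors


def is_anomaly(x, normal, detectors):
    """1-nearest-neighbor decision: x is flagged as anomalous when its
    closest labeled point is a detector rather than a normal sample."""
    nearest_normal = min(math.dist(x, p) for p in normal)
    nearest_detector = min((math.dist(x, d) for d in detectors), default=math.inf)
    return nearest_detector < nearest_normal


# Toy data (hypothetical): normal samples cluster around (0.5, 0.5).
rng = random.Random(0)
normal = [(rng.gauss(0.5, 0.03), rng.gauss(0.5, 0.03)) for _ in range(200)]
detectors = generate_detectors(
    normal, n_candidates=2000, radius=0.1, bounds=[(0.0, 1.0), (0.0, 1.0)], rng=rng
)
```

In this sketch, a point taken from the normal cluster is never flagged, because every detector is more than `radius` away from every training point, while a point far from the cluster will typically have a detector as its nearest neighbor. For real sequencing data, each sample would first have to be summarized as a feature vector; how that is done is not stated in the abstract.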

Perkins, P., & Heber, S. (2019). Using a novel negative selection inspired anomaly detection algorithm to identify corrupted ribo-seq and RNA-seq samples. In ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (pp. 457–465). Association for Computing Machinery, Inc. https://doi.org/10.1145/3307339.3342169