Recognition of CRISPR/Cas9 off-target sites through ensemble learning of uneven mismatch distributions


Abstract

Motivation: CRISPR/Cas9 is driving a broad range of innovative applications from basic biology to biotechnology and medicine. One of its outstanding problems is off-target editing, which must be resolved and, ideally, avoided entirely when the system is used.

Results: We developed an ensemble learning method to detect the off-target sites of a single guide RNA (sgRNA) from among its thousands of genome-wide candidates. Nucleotide mismatches between on-target and off-target sites have been studied recently. We confirm that there is strong mismatch enrichment and preference in the regions close to the 5′ end of the off-target sequences. Compared with the on-target sites, sequences of no-editing sites can also be characterized by GC composition changes and position-specific binary mismatch features. In this novel feature space, an ensemble strategy was applied to train a prediction model. The model achieved a mean area under the receiver operating characteristic curve (AUROC) of 0.99 and a mean area under the precision-recall curve (AUPRC) of 0.45 in cross-validation on large datasets, outperforming state-of-the-art methods in various test scenarios. Our predicted off-target sites also correspond very well to those detected by high-throughput sequencing techniques. In particular, two case studies on selecting sgRNAs to cure hearing loss and retinal degeneration partly demonstrate the effectiveness of our method.

Availability and implementation: The Python and MATLAB versions of the source code for detecting off-target sites of a given sgRNA, together with the supplementary files, are freely available at https://github.com/penn-hui/OfftargetPredict.

Supplementary information: Supplementary data are available at Bioinformatics online.
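
To make the feature description in the abstract concrete, the following is a minimal sketch of how position-specific binary mismatch features and a GC composition change could be computed for one on-target/candidate pair. It is not taken from the authors' repository: the function names, the choice to append the GC change as a single extra value, and the two 20-nt toy sequences are all assumptions made for illustration.

```python
from typing import List


def mismatch_features(on_target: str, candidate: str) -> List[int]:
    """Return one 0/1 flag per position: 1 where the candidate differs.

    Hypothetical helper; assumes both sequences are aligned and equal length.
    """
    if len(on_target) != len(candidate):
        raise ValueError("sequences must have the same length")
    return [int(a != b) for a, b in zip(on_target.upper(), candidate.upper())]


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc_change(on_target: str, candidate: str) -> float:
    """GC content of the candidate minus GC content of the on-target site."""
    return gc_content(candidate) - gc_content(on_target)


def encode_pair(on_target: str, candidate: str) -> List[float]:
    """Concatenate the binary mismatch flags with the GC change.

    The layout of this vector is an assumption, not the paper's encoding.
    """
    flags = [float(x) for x in mismatch_features(on_target, candidate)]
    return flags + [gc_change(on_target, candidate)]


# Toy 20-nt sequences (made up for illustration, not real sgRNA targets).
ON_TARGET = "GACGTTACCGGATCAAGCTT"
CANDIDATE = "TACGATACCGGATCAAGCTT"  # differs at positions 0 and 4 (5' end)

features = encode_pair(ON_TARGET, CANDIDATE)
```

A vector of this kind, computed for every genome-wide candidate of an sgRNA, is the sort of input an ensemble classifier could be trained on; the actual feature set and model are described in the paper and its source code.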

Citation (APA)

Peng, H., Zheng, Y., Zhao, Z., Liu, T., & Li, J. (2018). Recognition of CRISPR/Cas9 off-target sites through ensemble learning of uneven mismatch distributions. In Bioinformatics (Vol. 34, pp. i757–i765). Oxford University Press. https://doi.org/10.1093/bioinformatics/bty558
