Deep sampling of gRNA in the human genome and deep-learning-informed prediction of gRNA activities

23Citations
Citations of this article
34Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Life science studies involving clustered regularly interspaced short palindromic repeat (CRISPR) editing generally apply the best-performing guide RNA (gRNA) for a gene of interest. Computational models are combined with massive experimental quantification on synthetic gRNA-target libraries to accurately predict gRNA activity and mutational patterns. However, the measurements are inconsistent between studies due to differences in the designs of the gRNA-target pair constructs, and there has not yet been an integrated investigation that concurrently focuses on multiple facets of gRNA capacity. In this study, we analyzed the DNA double-strand break (DSB)-induced repair outcomes and measured SpCas9/gRNA activities at both matched and mismatched locations using 926,476 gRNAs covering 19,111 protein-coding genes and 20,268 non-coding genes. We developed machine learning models to forecast the on-target cleavage efficiency (AIdit_ON), off-target cleavage specificity (AIdit_OFF), and mutational profiles (AIdit_DSB) of SpCas9/gRNA from a uniformly collected and processed dataset by deep sampling and massively quantifying gRNA capabilities in K562 cells. Each of these models exhibited superlative performance in predicting SpCas9/gRNA activities on independent datasets when benchmarked with previous models. A previous unknown parameter was also empirically determined regarding the “sweet spot” in the size of datasets used to establish an effective model to predict gRNA capabilities at a manageable experimental scale. In addition, we observed cell type-specific mutational profiles and were able to link nucleotidylexotransferase as the key factor driving these outcomes. These massive datasets and deep learning algorithms have been implemented into the user-friendly web service http://crispr-aidit.com to evaluate and rank gRNAs for life science studies.

Cite

CITATION STYLE

APA

Zhang, H., Yan, J., Lu, Z., Zhou, Y., Zhang, Q., Cui, T., … Ma, L. (2023). Deep sampling of gRNA in the human genome and deep-learning-informed prediction of gRNA activities. Cell Discovery, 9(1). https://doi.org/10.1038/s41421-023-00549-9

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free