Statistical issues in binding site identification through CLIP-seq

Abstract

With the advent and development of CLIP-seq technologies, a growing number of CLIP-seq experiments are being performed to identify the targets of RNA-binding proteins and to understand the regulatory mechanisms of these proteins. Although broad similarities exist between CLIP-seq and ChIP-seq, statistical methods developed to identify binding sites from ChIP-seq data are not directly applicable to CLIP-seq data because the two technologies differ in several respects. First, transcript abundance has a large impact on CLIP-seq results and needs to be accounted for when analyzing CLIP-seq data. Second, mutations near the binding sites in CLIP-seq data offer valuable information that can be incorporated into the analysis. Other differences arise from the ability of RNA to form complex secondary structures and from many other technical aspects of the two purification protocols. To date, no systematic studies have been conducted to investigate the general statistical properties of CLIP-seq data, the merits of including RNA-seq as a matching control, and the performance of different binding site identification methods for CLIP-seq data. In this study, we performed a comprehensive evaluation of various statistical issues in using CLIP-seq data to identify RNA-protein binding sites. We demonstrate the value of RNA-seq data in background estimation and peak calling. We show that the large dispersion in CLIP-seq data compared to ChIP-seq data is the main reason peak calling is more difficult for CLIP-seq. Using both real and simulated data, we also show the importance of biological/technical replicates and of combining mutation and peak analysis to accurately identify binding sites from CLIP-seq data.

Citation (APA)

Chen, X., Chung, D., Stefani, G., Slack, F. J., & Zhao, H. (2015). Statistical issues in binding site identification through CLIP-seq. Statistics and Its Interface, 8(4), 419–436. https://doi.org/10.4310/SII.2015.v8.n4.a2
