Structure-Based Whole Genome Realignment Reveals Many Novel Non-coding RNAs

  • Will S
  • Yu M
  • Berger B

Abstract

Recent genome-wide computational screens that search for conservation of RNA secondary structure in whole genome alignments (WGAs) have predicted thousands of structural non-coding RNAs (ncRNAs). The sensitivity of such approaches, however, is limited due to their reliance on sequence-based whole-genome aligners, which regularly misalign structural ncRNAs. This suggests that many more structural ncRNAs may remain undetected. Structure-based alignment, which could increase the sensitivity, has been prohibitive for genome-wide screens due to its extreme computational costs. Breaking this barrier, we present the pipeline REAPR (RE-Alignment for de novo Prediction of structural ncRNA) that realigns whole genomes based on RNA sequence and structure and then evaluates the realignments for potential structural ncRNAs with a ncRNA predictor such as RNAz 2.0. For efficiency of the pipeline, we develop a novel banding realignment algorithm for the RNA multiple alignment tool LocARNA. This allows us to perform very fast structure-based realignment within limited deviation of the original multiple alignment from the WGA. We apply REAPR to the complete twelve Drosophila WGAs to predict ncRNAs across all these Drosophila species. Compared to direct prediction from the original WGA at the same False Discovery Rate (FDR), we predict twice as many high-confidence ncRNA candidates in D. melanogaster while less than doubling the run-time. As a novelty in ncRNA prediction, we control the FDR, going beyond the usual a posteriori FDR estimation. Applying the sequence-based alignment tool Muscle for realignment, we demonstrate that structure-based methods are necessary for effective prediction of originally misaligned ncRNAs. Comparing to recent screens of Drosophila and ENCODE we show that REAPR outperforms the widely-used de novo predictors RNAz, EvoFold, and CMfinder. Finally, we reveal, with high confidence, a putative structural motif in the long ncRNA roX1 of D. melanogaster, known to regulate X chromosome dosage compensation in male flies. Interestingly, we recapitulate the Drosophila phylogeny, based on co-predicted ncRNAs across all genomes.
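
The abstract describes realignment "within limited deviation of the original multiple alignment". The following is a minimal illustrative sketch of that banding idea only, not the REAPR or LocARNA implementation. It makes several simplifying assumptions: it is pairwise rather than multiple, it scores sequence only (no structure), and the band definition, the function names `reference_band` and `banded_score`, and the scoring parameters are all hypothetical choices made for this example.

```python
NEG_INF = float("-inf")


def reference_band(ref_a, ref_b, delta):
    """For each row i of the DP matrix, return the column range (lo, hi)
    lying within `delta` columns of the reference alignment path.

    ref_a and ref_b are the two gapped rows of the reference alignment
    ('-' marks a gap). The band definition is an assumption of this sketch.
    """
    i = j = 0
    lo, hi = [0], [0]  # the path starts in cell (0, 0)
    for x, y in zip(ref_a, ref_b):
        if x != "-":
            i += 1
        if y != "-":
            j += 1
        if i == len(lo):  # the path entered a new row
            lo.append(j)
            hi.append(j)
        else:  # the path moved right within the current row
            hi[i] = j
    m = j  # length of the ungapped second sequence
    return [(max(0, l - delta), min(m, h + delta)) for l, h in zip(lo, hi)]


def banded_score(a, b, band, match=2, mismatch=-1, gap=-2):
    """Global alignment score of a and b, filling only cells inside `band`.

    Sequence-only, linear-gap scoring (hypothetical parameters).
    """
    prev = {}
    for i in range(len(a) + 1):
        lo, hi = band[i]
        cur = {}
        for j in range(lo, hi + 1):
            if i == 0 and j == 0:
                cur[j] = 0
                continue
            best = NEG_INF
            if i > 0 and (j - 1) in prev:  # diagonal: align a[i-1] with b[j-1]
                s = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, prev[j - 1] + s)
            if i > 0 and j in prev:  # vertical: gap in b
                best = max(best, prev[j] + gap)
            if (j - 1) in cur:  # horizontal: gap in a
                best = max(best, cur[j - 1] + gap)
            cur[j] = best
        prev = cur
    return prev[len(b)]


# Toy reference that badly misaligns two identical sequences.
ref_a = "AAACCC---"
ref_b = "---AAACCC"
a = ref_a.replace("-", "")
b = ref_b.replace("-", "")
scores = {d: banded_score(a, b, reference_band(ref_a, ref_b, d)) for d in (0, 2, 3)}
```

In this toy case a deviation of 0 reproduces the reference alignment's score (-15), while a deviation of 3 is wide enough to recover the gap-free optimum (12). Only the cells inside the band are filled, which is where the run-time saving comes from: the work grows with the band width rather than with the full product of the sequence lengths.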

Cite

Will, S., Yu, M., & Berger, B. (2012). Structure-Based Whole Genome Realignment Reveals Many Novel Non-coding RNAs (pp. 341–341). https://doi.org/10.1007/978-3-642-29627-7_35
