Motivation: The reliable identification of genes is a major challenge in genome research, as further analysis depends on the correctness of this initial step. With high-throughput RNA-Seq data reflecting currently expressed genes, a particularly meaningful source of information has become commonly available for gene finding. However, practical application in automated gene identification is still not the standard case. A particular challenge in including RNA-Seq data is the difficult handling of ambiguously mapped reads.Results: We present GIIRA (Gene Identification Incorporating RNA-Seq data and Ambiguous reads), a novel prokaryotic and eukaryotic gene finder that is exclusively based on a RNA-Seq mapping and inherently includes ambiguously mapped reads. GIIRA extracts candidate regions supported by a sufficient number of mappings and reassigns ambiguous reads to their most likely origin using a maximum-flow approach. This avoids the exclusion of genes that are predominantly supported by ambiguous mappings. Evaluation on simulated and real data and comparison with existing methods incorporating RNA-Seq information highlight the accuracy of GIIRA in identifying the expressed genes. © 2013 The Author 2013. Published by Oxford University Press. All rights reserved.
CITATION STYLE
Zickmann, F., Lindner, M. S., & Renard, B. Y. (2014). GIIRA - RNA-Seq driven gene finding incorporating ambiguous reads. Bioinformatics, 30(5), 606–613. https://doi.org/10.1093/bioinformatics/btt577
Mendeley helps you to discover research relevant for your work.