Identification of protein coding regions in RNA transcripts

Shiyuyun Tang; Alexandre Lomsadze; Mark Borodovsky

Journal Article, Open Access

Nucleic Acids Research (2015) 43(12)

DOI: 10.1093/nar/gkv227

Abstract

Massively parallel sequencing of RNA transcripts by next-generation technology (RNA-Seq) generates critically important data for eukaryotic gene discovery. Gene finding in transcripts can be done by statistical (alignment-free) methods as well as by alignment-based methods. We describe a new tool, GeneMarkS-T, for ab initio identification of protein-coding regions in RNA transcripts. The algorithm parameters are estimated by unsupervised training, which makes manual curation of training sets unnecessary. We demonstrate that (i) the unsupervised training is robust to the presence of transcript assembly errors and (ii) the accuracy of GeneMarkS-T in identifying protein-coding regions and, particularly, in predicting translation initiation sites in modelled as well as in assembled transcripts compares favourably to other existing methods.
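
To make the task concrete, the following is a minimal sketch of the simplest possible baseline for locating a candidate protein-coding region in a transcript: scan all six reading frames and report the longest open reading frame (ORF) that runs from an ATG to an in-frame stop codon. This is not the GeneMarkS-T algorithm, which the abstract describes as relying on statistical models with parameters estimated by unsupervised training. The function names, the ATG-only start rule, and the standard-genetic-code stop codons are all assumptions made for illustration.

```python
# Illustrative baseline only: a naive longest-ORF scan.
# This is NOT how GeneMarkS-T works; it uses no statistical model or training.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
START_CODON = "ATG"                  # assumption: canonical start codon only


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def orfs_in_strand(seq):
    """Yield (start, end) for each complete ORF in the three frames of seq.

    start is the index of the first ATG after the previous stop in that frame;
    end is exclusive and includes the stop codon. ORFs that lack an in-frame
    stop codon are skipped.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                yield start, i + 3
                start = None


def longest_orf(transcript):
    """Return (strand, start, end) of the longest complete ORF, or None.

    Coordinates on the "-" strand refer to the reverse-complemented sequence.
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or DNA letters
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for start, end in orfs_in_strand(s):
            if best is None or end - start > best[2] - best[1]:
                best = (strand, start, end)
    return best
```

A call such as `longest_orf("CCATGAAATTTGGGTAACC")` returns the strand and coordinates of the single ATG-to-TAA frame in that toy sequence. A length-based rule like this has no principled way to choose among alternative in-frame start codons and can miss coding regions in truncated or misassembled transcripts, which are the problems a trained statistical gene finder is meant to address.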

Citation (APA)

Tang, S., Lomsadze, A., & Borodovsky, M. (2015). Identification of protein coding regions in RNA transcripts. Nucleic Acids Research, 43(12). https://doi.org/10.1093/nar/gkv227
