CNCTDISCRIMINATOR: CODING AND NONCODING TRANSCRIPT DISCRIMINATOR — AN EXCURSION THROUGH HYPOTHESIS LEARNING AND ENSEMBLE LEARNING APPROACHES

  • BISWAS A
  • ZHANG B
  • WU X
 et al. 

Abstract

Statistics on open reading frames (ORFs), base composition, and the properties of predicted secondary structures can help discriminate coding from noncoding transcripts. In addition, the next-generation sequencing platform RNA-seq provides abundant data from which transcript expression profiles can be extracted, which motivated us to add a new set of dimensions to this classification task. In this paper, we propose CNCTDiscriminator, a coding and noncoding transcript discriminating system that integrates these four categories of transcript features. Feature integration was performed using both hypothesis learning and feature-specific ensemble learning approaches. When applied to an independent benchmark dataset, the CNCTDiscriminator model trained with composition and ORF features outperforms (precision 83.86%, recall 82.01%) three other popular methods: CPC (precision 98.31%, recall 25.95%), CPAT (precision 97.74%, recall 52.50%) and PORTRAIT (precision 84.37%, recall 73.2%). The CNCTDiscriminator model trained using the ensemble approach shows comparable performance (precision 89.85%, recall 71.08%).
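
The abstract does not state which single metric underlies the comparison, and CPC and CPAT both report higher precision than CNCTDiscriminator. One common way to weigh precision against recall is the F1 score (their harmonic mean). The sketch below computes it from the figures quoted above. Using F1 here is an assumption made for illustration, not necessarily the authors' criterion, and the names `f1_score`, `REPORTED`, `f1_by_model` and `ranking` are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) exactly as quoted in the abstract.
REPORTED = {
    "CNCTDiscriminator (composition + ORF)": (83.86, 82.01),
    "CNCTDiscriminator (ensemble)": (89.85, 71.08),
    "CPC": (98.31, 25.95),
    "CPAT": (97.74, 52.50),
    "PORTRAIT": (84.37, 73.2),
}

# F1 per method. This is an illustrative summary, not a figure from the paper.
f1_by_model = {name: f1_score(p, r) for name, (p, r) in REPORTED.items()}

# Method names ordered from highest to lowest F1.
ranking = sorted(f1_by_model, key=f1_by_model.get, reverse=True)

for name in ranking:
    print(f"{name}: F1 = {f1_by_model[name]:.2f}")
```

Under this assumed metric, the composition + ORF model ranks first because its recall is far higher than that of CPC or CPAT, which outweighs their precision advantage.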

Authors

  • ASHIS KUMER BISWAS

  • BAOJU ZHANG

  • XIAOYONG WU

  • JEAN X. GAO
