GMM-Based Classification of Genome Sequence

  • Akhtar M
  • Ambikairajah E

Many digital signal processing (DSP) based techniques are currently available for predicting protein coding regions in genomic sequences. However, accurately identifying these regions at the level of individual nucleotides remains a challenge. In this paper, we propose the novel use of a multi-dimensional feature and Gaussian mixture models (GMMs) to classify nucleotides as protein coding or non-coding. Employing signal processing based time-domain and frequency-domain features, the system is shown to produce identification accuracies of more than 75% and 79%, respectively, for protein coding and non-coding nucleotides when evaluated on the GENSCAN data set.
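The two-class GMM decision mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one diagonal-covariance GMM per class and a simple log-likelihood comparison, and the function names, the 2-D feature vector, and all model parameters below are hypothetical values chosen only for illustration. In practice the mixture parameters would be estimated from labelled training data.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x | GMM), computed with log-sum-exp over the components."""
    terms = [
        math.log(weight) + log_gaussian_diag(x, mean, var)
        for weight, mean, var in gmm
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify(x, coding_gmm, noncoding_gmm):
    """Label one nucleotide's feature vector by comparing class likelihoods."""
    score = gmm_log_likelihood(x, coding_gmm) - gmm_log_likelihood(x, noncoding_gmm)
    return "coding" if score > 0.0 else "non-coding"


# Hypothetical models (assumption, not from the paper): each component is
# (weight, mean vector, variance vector) over a 2-D feature.
CODING_GMM = [(0.6, [2.0, 1.5], [0.5, 0.4]), (0.4, [3.0, 2.5], [0.6, 0.5])]
NONCODING_GMM = [(0.7, [0.5, 0.3], [0.4, 0.3]), (0.3, [1.0, 0.8], [0.5, 0.4])]

print(classify([2.4, 1.8], CODING_GMM, NONCODING_GMM))
print(classify([0.6, 0.4], CODING_GMM, NONCODING_GMM))
```

With these illustrative parameters, the first vector lies near the coding components and the second near the non-coding ones, so they receive different labels. A decision threshold other than zero could be used to trade off the two error types.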


  • Mahmood Akhtar

  • Eliathamby Ambikairajah

  • Julien Epps
