A central aim of bioinformatics is to relate microarray data to biological processes as closely as possible, which motivates the development and application of data mining techniques. Microarray datasets are voluminous and contain a very large number of genes, most of which are irrelevant to cancer classification. These irrelevant genes should be filtered out of the dataset before it is used in a cancer classification system. In this paper, a clustering algorithm is used to group genes whose similar expression patterns suggest that they may be co-regulated. Once the clusters are obtained, the biological knowledge associated with the genes in each cluster is investigated. A quality-based partition is determined by the co-expressed genes that share similar biological knowledge. Gene Ontology (GO) annotations are linked to the clusters to identify the biologically meaningful genes within them. In the next phase, the fold-change method is used to pick out the differentially expressed genes from among the selected biologically meaningful genes. These selected genes are termed informative genes. The efficiency of the method is investigated on publicly accessible microarray data using several popular classifiers.
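The fold-change step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the threshold, the use of a log2 ratio, or any pseudocount, so the function names, the `threshold` and `pseudocount` parameters, and the toy expression values below are all assumptions made for illustration. The candidate list stands in for the genes retained by the earlier clustering and GO steps.

```python
import math
from statistics import mean


def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """Log2 ratio of mean expression in class A over class B.

    The pseudocount (an assumption, not from the paper) avoids division by zero.
    """
    return math.log2((mean(expr_a) + pseudocount) / (mean(expr_b) + pseudocount))


def select_by_fold_change(expression, labels, candidates, threshold=1.0):
    """Return (gene, log2 fold change) pairs with |log2 fold change| >= threshold.

    expression: dict mapping gene name -> list of expression values, one per sample
    labels:     list of class labels (1 or 0), one per sample
    candidates: genes kept by the earlier filtering steps (hypothetical input)
    threshold:  illustrative cut-off; 1.0 corresponds to a two-fold change
    """
    selected = []
    for gene in candidates:
        values = expression[gene]
        class_a = [v for v, y in zip(values, labels) if y == 1]
        class_b = [v for v, y in zip(values, labels) if y == 0]
        lfc = log2_fold_change(class_a, class_b)
        if abs(lfc) >= threshold:
            selected.append((gene, lfc))
    # Strongest changes first, regardless of direction.
    return sorted(selected, key=lambda item: abs(item[1]), reverse=True)


# Toy data: three samples per class, three hypothetical candidate genes.
expression = {
    "g1": [100, 120, 110, 20, 25, 30],  # higher in class 1
    "g2": [50, 55, 52, 51, 49, 53],     # roughly unchanged
    "g3": [10, 12, 8, 90, 85, 95],      # higher in class 0
}
labels = [1, 1, 1, 0, 0, 0]

result = select_by_fold_change(expression, labels, ["g1", "g2", "g3"])
informative_genes = [gene for gene, _ in result]
```

With these toy values, `g2` is dropped because its mean expression barely differs between the two classes, while `g1` and `g3` are retained as candidate informative genes.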
Pati, S. K., Mallick, S., Chakraborty, A., & Das, A. (2019). Informative gene selection using clustering and gene ontology. In Advances in Intelligent Systems and Computing (Vol. 813, pp. 417–427). Springer Verlag. https://doi.org/10.1007/978-981-13-1498-8_37