A gene expression dataset is a complex association of gene patterns measured over tens or hundreds of samples. Extracting biologically relevant information from such data for different tasks is tedious. Text mining approaches such as classification and clustering are used in the literature to discover relevant aspects of a dataset for many biological applications. Gene expression data also contain irrelevant information, known as noise. In this paper, a powerful dimension reduction technique is presented as a preprocessing step to improve clustering results and to cluster the gene expression samples into relevant classes. The study uses nonnegative matrix factorization (NNMF) and non-smooth nonnegative matrix factorization (NS-NNMF), an extension of the basic NNMF algorithm that yields a sparser factorization, and observes the differences between the two factorizations. The performance and accuracy of K-means, NNMF, and NS-NNMF are then compared, and NS-NNMF shows the highest accuracy.
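To make the two factorizations concrete, here is a minimal sketch of how a genes-by-samples matrix V might be factored and its samples assigned to clusters. It rests on several assumptions that the abstract does not confirm: it uses Frobenius-norm multiplicative updates, it models the non-smooth variant as V ≈ W S H with smoothing matrix S = (1 - θ)I + (θ/k)11ᵀ, and it runs on synthetic data. The names `factorize`, `sample_labels`, and `theta` are hypothetical, and this is not presented as the authors' implementation.

```python
import numpy as np

def factorize(V, k, theta=0.0, n_iter=500, seed=0, eps=1e-9):
    """Approximate V (genes x samples) as W @ S @ H with nonnegative W and H.

    theta = 0 gives S = I, i.e. plain NNMF. A theta in (0, 1] blends in a
    uniform matrix, the smoothing used by non-smooth NNMF to push W and H
    toward sparser solutions. Frobenius-norm multiplicative updates are an
    assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    S = (1.0 - theta) * np.eye(k) + (theta / k) * np.ones((k, k))
    for _ in range(n_iter):
        WS = W @ S
        H *= (WS.T @ V) / (WS.T @ WS @ H + eps)    # update H with W fixed
        SH = S @ H
        W *= (V @ SH.T) / (W @ (SH @ SH.T) + eps)  # update W with H fixed
    return W, S, H

def sample_labels(W, H):
    """Assign each sample to the factor with the largest weighted coefficient.

    Weighting H by the column norms of W is a heuristic that reduces the
    effect of the scale ambiguity between W and H.
    """
    scale = np.linalg.norm(W, axis=0)
    return np.argmax(H * scale[:, None], axis=0)

# Synthetic stand-in for expression data: 40 genes, 12 samples, two groups.
rng = np.random.default_rng(42)
V = 0.1 * rng.random((40, 12))   # low-level nonnegative noise
V[:20, :6] += 5.0                # genes 0-19 are active in samples 0-5
V[20:, 6:] += 5.0                # genes 20-39 are active in samples 6-11

W0, S0, H0 = factorize(V, k=2, theta=0.0)   # plain NNMF
W1, S1, H1 = factorize(V, k=2, theta=0.5)   # non-smooth variant
labels_nnmf = sample_labels(W0, H0)
labels_ns = sample_labels(W1, H1)
```

On real expression data, the cluster labels from each factorization and from K-means on the same samples could be scored against known class labels, which is the kind of comparison the abstract describes.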
CITATION STYLE
Kherwa, P., Bansal, P., Singh, S., & Gupta, T. (2020). Efficient Clustering Using Nonnegative Matrix Factorization for Gene Expression Dataset. In Advances in Intelligent Systems and Computing (Vol. 1082, pp. 179–190). Springer. https://doi.org/10.1007/978-981-15-1081-6_15