Abstract
Clustering is commonly applied to gene expression data to help understand how genes are affected by different disease conditions in a biological system. In this paper, we cast the clustering problem in graph-theoretic terms and apply a novel graph partitioning model, Isoperimetric Graph Partitioning (IGP), to group biological samples from gene expression data. The IGP algorithm has several advantages over the well-established Spectral Graph Partitioning (SGP) model. First, IGP requires only the solution of a sparse system of linear equations rather than the eigenproblem that SGP must solve. Second, IGP avoids degenerate cases that the spectral approach can produce, yielding partitions with higher accuracy. Moreover, we integrate unsupervised gene selection into the proposed approach through two-way ordering of the gene expression data, so that irrelevant or redundant genes are eliminated and the clustering result improves. We evaluate our approach on several well-known problems involving gene expression profiles of colon cancer and leukemia subtypes. Our experimental results demonstrate that IGP consistently outperforms SGP and produces results closer to the original labeling of the sample sets provided by domain experts. Furthermore, clustering accuracy improves significantly when IGP is integrated with unsupervised gene (feature) selection. ©2007 IEEE.
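To make the contrast with the spectral approach concrete, the following is a minimal sketch of an isoperimetric bipartition on a small weighted graph. It is not taken from the paper: the function name `isoperimetric_bipartition`, the choice of the maximum-degree node as the grounded node, the threshold sweep, and the toy affinity matrix are all assumptions made for illustration, and a dense solver stands in for the sparse solver a real implementation would likely use. The point it shows is that the partition comes from one linear solve on a reduced Laplacian rather than from an eigendecomposition.

```python
import numpy as np

def isoperimetric_bipartition(W):
    """Split a connected weighted graph into two groups (illustrative sketch).

    W: symmetric (n, n) non-negative affinity matrix with zero diagonal.
    Returns (labels, ratio): a 0/1 label per node and the isoperimetric
    ratio cut(S) / min(vol(S), vol(complement)) of the chosen split.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    d = W.sum(axis=1)                  # node degrees
    L = np.diag(d) - W                 # graph Laplacian

    # Assumption: ground the node with the largest degree.
    ground = int(np.argmax(d))
    keep = np.array([i for i in range(n) if i != ground])

    # Dropping the grounded row/column makes the Laplacian nonsingular
    # for a connected graph, so a single linear solve gives the potentials.
    x = np.zeros(n)
    x[keep] = np.linalg.solve(L[np.ix_(keep, keep)], d[keep])

    # Sweep thresholds over the sorted potentials and keep the best ratio.
    order = np.argsort(x)
    total_vol = d.sum()
    best_ratio, best_labels = np.inf, None
    for k in range(1, n):
        in_set = np.zeros(n, dtype=bool)
        in_set[order[:k]] = True
        cut = W[in_set][:, ~in_set].sum()
        vol = min(d[in_set].sum(), total_vol - d[in_set].sum())
        ratio = cut / vol
        if ratio < best_ratio:
            best_ratio, best_labels = ratio, in_set.astype(int)
    return best_labels, best_ratio

# Toy example (hypothetical data): two tight groups of three samples
# joined by one weak edge between nodes 2 and 3.
W_toy = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W_toy[i, j] = 1.0
W_toy[2, 3] = W_toy[3, 2] = 0.1

labels, ratio = isoperimetric_bipartition(W_toy)
print(labels, ratio)
```

In a gene expression setting the nodes would be biological samples and `W` a sample-to-sample similarity matrix; how that matrix is built, and how the gene selection step interacts with it, is not specified in the abstract and is left out of the sketch.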
Cite
Chen, Y., Dong, M., & Rege, M. (2007). Gene expression clustering: A novel graph partitioning approach. In IEEE International Conference on Neural Networks - Conference Proceedings (pp. 1542–1547). https://doi.org/10.1109/IJCNN.2007.4371187