An empirical comparison of dimensionality reduction methods for classifying gene and protein expression datasets

George Lee; Carlos Rodriguez; Anant Madabhushi

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007) 4463 LNBI 170-181

DOI: 10.1007/978-3-540-72031-7_16

14 Citations

22 Readers

Abstract

The recent explosion in availability of gene and protein expression data for cancer detection has necessitated the development of sophisticated machine learning tools for high dimensional data analysis. Previous attempts at gene expression analysis have typically used a linear dimensionality reduction method such as Principal Components Analysis (PCA). Linear dimensionality reduction methods do not, however, account for the inherent nonlinearity within the data. The motivation behind this work is to demonstrate that nonlinear dimensionality reduction methods are more adept than linear methods at capturing the nonlinearity within the data, and hence would result in better classification and potentially aid in the visualization and identification of new data classes. Consequently, in this paper, we empirically compare the performance of three commonly used linear and three nonlinear dimensionality reduction techniques from the perspective of (a) distinguishing objects belonging to cancer and non-cancer classes and (b) new class discovery in high dimensional gene and protein expression studies for different types of cancer. Quantitative evaluation using a support vector machine and a decision tree classifier revealed a statistically significant improvement in classification accuracy when using nonlinear dimensionality reduction methods compared to linear methods. © Springer-Verlag Berlin Heidelberg 2007.

Cite

Lee, G., Rodriguez, C., & Madabhushi, A. (2007). An empirical comparison of dimensionality reduction methods for classifying gene and protein expression datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4463 LNBI, pp. 170–181). Springer Verlag. https://doi.org/10.1007/978-3-540-72031-7_16
