Genome-wide association (GWA) studies provide large amounts of high-dimensional data. GWA studies aim to identify variables that increase the risk for a given phenotype. Univariate examinations have provided some insights, but it appears that most diseases are affected by interactions of multiple factors, which can only be identified through a multivariate analysis. However, multivariate analysis on the discrete, high-dimensional and low-sample-size GWA data is made more difficult by the presence of random effects and nonspecific coupling. In this work, we investigate the suitability of three standard techniques (p-values, SVM, PCA) for analyzing GWA data on several simulated datasets. We compare these standard techniques against a sparse coding approach; we demonstrate that sparse coding clearly outperforms the other approaches and can identify interacting factors in far higher-dimensional datasets than the other three approaches. © 2010 Springer-Verlag Berlin Heidelberg.
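The claim that interacting factors can escape univariate examination can be illustrated with a small toy calculation. The sketch below is not taken from the paper: the cohort, the two binary variant indicators `snp_a` and `snp_b`, the XOR disease model and the helper `chi_square` are all hypothetical and chosen only to make the effect visible. It computes a Pearson chi-square statistic for each variant alone and for the two variants considered jointly.

```python
from collections import Counter
from itertools import product


def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    stat = 0.0
    for x, y in product(row, col):
        expected = row[x] * col[y] / n
        stat += (joint[(x, y)] - expected) ** 2 / expected
    return stat


# Hypothetical toy cohort: every combination of two binary variant
# indicators occurs 25 times (100 samples in total).
snp_a, snp_b = [], []
for a, b in product([0, 1], repeat=2):
    snp_a += [a] * 25
    snp_b += [b] * 25

# Assumed disease model (pure interaction): a sample is affected only
# when exactly one of the two variants is present.
phenotype = [a ^ b for a, b in zip(snp_a, snp_b)]

# Univariate view: each variant tested on its own.
marginal_a = chi_square(snp_a, phenotype)
marginal_b = chi_square(snp_b, phenotype)

# Multivariate view: the pair of variants tested as one joint genotype.
joint_ab = chi_square(list(zip(snp_a, snp_b)), phenotype)

print(marginal_a, marginal_b, joint_ab)
```

In this constructed case both single-variant statistics are zero while the joint statistic is large, so a purely univariate screen would discard both variants. Real GWA data are noisier and far higher-dimensional, which is why exhaustive joint testing of this kind does not scale and feature-selection methods are needed.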
CITATION STYLE
Brænne, I., Labusch, K., & Madany Mamlouk, A. (2010). Sparse coding for feature selection on genome-wide association data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6352 LNCS, pp. 337–346). https://doi.org/10.1007/978-3-642-15819-3_44