Pathway Activity Score Learning for Dimensionality Reduction of Gene Expression Data



Abstract

Molecular gene-expression datasets consist of samples with tens of thousands of measured quantities (i.e., high-dimensional data). However, there exist lower-dimensional representations that retain the useful information. We present a novel algorithm for such dimensionality reduction called Pathway Activity Score Learning (PASL). The major novelty of PASL is that the constructed features directly correspond to known molecular pathways and can be interpreted as pathway activity scores. Hence, unlike PCA and similar methods, PASL’s latent space has a relatively straightforward biological interpretation. As a use case, PASL is applied to two collections of breast cancer and leukemia gene expression datasets. We show that PASL retains the predictive information for disease classification on new, unseen datasets and that it outperforms PLIER, a recently proposed competing method. We also show that differential activation pathway analysis provides complementary information to standard gene set enrichment analysis. The code is available at https://github.com/mensxmachina/PASL.
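To make the idea of pathway-aligned features concrete, the following is a minimal sketch and not the PASL algorithm itself (the abstract does not specify its optimization). It assumes a simple stand-in: for each pathway, take the first principal direction of the expression sub-matrix restricted to that pathway's genes, so that every latent feature has non-zero loadings only on genes of one known pathway. The function name `pathway_scores`, the toy data, and the pathway names are hypothetical.

```python
import numpy as np

def pathway_scores(X, pathways):
    """Illustrative stand-in, NOT the PASL method: pathway-restricted PCA.

    X        : (n_samples, n_genes) expression matrix
    pathways : dict mapping pathway name -> list of gene column indices
    Returns (scores, loadings), where scores is (n_samples, n_pathways) and
    loadings is (n_pathways, n_genes), zero outside each pathway's genes.
    """
    Xc = X - X.mean(axis=0)  # center each gene across samples
    names = list(pathways)
    loadings = np.zeros((len(names), X.shape[1]))
    for k, name in enumerate(names):
        idx = np.asarray(pathways[name])
        # First right singular vector of the pathway sub-matrix
        # is the first principal direction over that pathway's genes.
        _, _, vt = np.linalg.svd(Xc[:, idx], full_matrices=False)
        loadings[k, idx] = vt[0]
    # Each column of scores is one sample-wise "activity score" per pathway.
    scores = Xc @ loadings.T
    return scores, loadings

# Toy data (assumption): 20 samples, 50 genes, two made-up pathways.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
pathways = {"pathway_A": [0, 1, 2, 3, 4], "pathway_B": [10, 11, 12]}
scores, loadings = pathway_scores(X, pathways)
print(scores.shape)
```

Because each loading vector is supported on a single gene set, a downstream classifier trained on `scores` can be read in terms of pathways rather than anonymous components, which is the kind of interpretability the abstract contrasts with plain PCA.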

Cite

Karagiannaki, I., Pantazis, Y., Chatzaki, E., & Tsamardinos, I. (2020). Pathway Activity Score Learning for Dimensionality Reduction of Gene Expression Data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12323 LNAI, pp. 246–261). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61527-7_17
