We introduce a disentangled representation for cellular identity that constructs a latent cellular state from a linear combination of condition specific basis vectors that are then decoded into gene expression levels. The basis vectors are learned with a deep autoencoder model from single-cell RNA-seq data. Linear arithmetic in the disentangled representation successfully predicts nonlinear gene expression interactions between biological pathways in unobserved treatment conditions. We are able to recover the mean gene expression profiles of unobserved conditions with an average Pearson r = 0.73, which outperforms two linear baselines, one with an average r = 0.43 and another with an average r = 0.19. Disentangled representations hold the promise to provide new explanatory power for the interaction of biological pathways and the prediction of effects of unobserved conditions for applications such as combinatorial therapy and cellular reprogramming. Our work is motivated by recent advances in deep generative models that have enabled synthesis of images and natural language with desired properties from interpolation in a “latent representation” of the data.
CITATION STYLE
Wang, Z., Yeo, G. H. T., Sherwood, R., & Gifford, D. (2019). Disentangled Representations of Cellular Identity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11467 LNBI, pp. 256–271). Springer Verlag. https://doi.org/10.1007/978-3-030-17083-7_16
Mendeley helps you to discover research relevant for your work.