Unsupervised learning for medical data: A review of probabilistic factorization methods

Abstract

We review popular unsupervised learning methods for the analysis of high-dimensional data encountered in, for example, genomics, medical imaging, cohort studies, and biobanks. We show that four commonly used methods (principal component analysis, K-means clustering, nonnegative matrix factorization, and latent Dirichlet allocation) can be written as probabilistic models underpinned by a low-rank matrix factorization. In addition to highlighting their similarities, this formulation clarifies the assumptions and restrictions of each approach, which helps applied medical researchers identify the appropriate method for a specific application. We also touch upon the most important aspects of inference and model selection when applying these methods to health data.
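
As an illustration of the shared low-rank structure the abstract describes, the following minimal sketch writes two of the four methods, principal component analysis and K-means clustering, as an approximation X ≈ WH. It is not taken from the article: the function names, the synthetic data, and the choice of truncated SVD and Lloyd's algorithm are assumptions made for this example, and the probabilistic formulation discussed in the review is not reproduced here.

```python
import numpy as np

def pca_factorization(X, rank):
    """Rank-`rank` PCA via truncated SVD, so that X - mean ≈ W @ H."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = U[:, :rank] * s[:rank]  # real-valued scores, shape (n, rank)
    H = Vt[:rank]               # orthonormal loadings, shape (rank, d)
    return W, H, mean

def kmeans_factorization(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm, returned as one-hot W and centroid matrix H."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (an illustrative choice).
    H = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - H[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                H[j] = X[labels == j].mean(axis=0)
    # Final assignment so that W is consistent with the returned centroids.
    d2 = ((X[:, None, :] - H[None, :, :]) ** 2).sum(axis=2)
    W = np.eye(k)[d2.argmin(axis=1)]  # each row has a single 1
    return W, H

# Synthetic data standing in for a samples-by-features matrix (hypothetical).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
k = 3

W_pca, H_pca, mean = pca_factorization(X, rank=k)
W_km, H_km = kmeans_factorization(X, k=k)

# Squared reconstruction error of each factorization.
err_pca = np.sum((X - (W_pca @ H_pca + mean)) ** 2)
err_km = np.sum((X - W_km @ H_km) ** 2)
print(f"PCA error: {err_pca:.2f}, K-means error: {err_km:.2f}")
```

In both cases the data matrix is approximated by a product of two small matrices; the methods differ in the constraints placed on W, which is real-valued for PCA and one-hot for K-means.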

Cite

Neijzen, D., & Lunter, G. (2023, December 30). Unsupervised learning for medical data: A review of probabilistic factorization methods. Statistics in Medicine. John Wiley and Sons Ltd. https://doi.org/10.1002/sim.9924
