Automatic Phenotyping by a Seed-guided Topic Model

14Citations
Citations of this article
19Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Electronic health records (EHRs) provide rich clinical information and the opportunities to extract epidemiological patterns to understand and predict patient disease risks with suitable machine learning methods such as topic models. However, existing topic models do not generate identifiable topics each predicting a unique phenotype. One promising direction is to use known phenotype concepts to guide topic inference. We present a seed-guided Bayesian topic model called MixEHR-Seed with 3 contributions: (1) for each phenotype, we infer a dual-form of topic distribution: a seed-topic distribution over a small set of key EHR codes and a regular topic distribution over the entire EHR vocabulary; (2) we model age-dependent disease progression as Markovian dynamic topic priors; (3) we infer seed-guided multi-modal topics over distinct EHR data types. For inference, we developed a variational inference algorithm. Using MixEHR-Seed, we inferred 1569 PheCode-guided phenotype topics from an EHR database in Quebec, Canada covering 1.3 million patients for up to 20-year follow-up with 122 million records for 8539 and 1126 unique diagnostic and drug codes, respectively. We observed (1) accurate phenotype prediction by the guided topics, (2) clinically relevant PheCode-guided disease topics, (3) meaningful age-dependent disease prevalence. Source code is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-Seed.

Cite

CITATION STYLE

APA

Song, Z., Hu, Y., Verma, A., Buckeridge, D. L., & Li, Y. (2022). Automatic Phenotyping by a Seed-guided Topic Model. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 4713–4723). Association for Computing Machinery. https://doi.org/10.1145/3534678.3542675

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free