Machine Learning for Feature Selection and Cluster Analysis in Drug Utilisation Research

Sara Khalid; Daniel Prieto-Alhambra

Journal Article (Open Access)


Current Epidemiology Reports (2019) 6(3) 364-372

DOI: 10.1007/s40471-019-00211-7


Abstract

Machine learning methods are increasingly used in health data mining. We describe current unsupervised learning methods for phenotyping and discovery and illustrate their application to detecting features and sub-groups related to drug use within a population. Patient representation, or phenotyping and discovery, is one of the main branches of health data analysis. Phenotyping concerns identifying, from raw patient data, features that are representative of the population. Discovery involves analysing these features, for example to identify patterns in the population such as sub-groups, and to predict outcomes. Most studies use unsupervised learning methods for phenotyping because they are well suited to data-driven feature extraction. We describe some of the commonly used methods and demonstrate their use in feature selection followed by cluster analysis. Unsupervised learning methods can be used to extract the features of specific populations and to identify sub-groups within them. We demonstrate the potential of these methods and highlight the associated challenges, which researchers may find useful in judging the suitability of these methods for analysing health data.
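The workflow named in the abstract, feature selection followed by cluster analysis, can be illustrated with a minimal sketch. The abstract does not say which methods the authors use, so everything below is an assumption chosen for illustration: a variance threshold as the unsupervised feature-selection step, plain k-means as the clustering step, and a small synthetic "patient" table whose columns (prescriptions per year, mean daily dose, a constant flag) are hypothetical.

```python
import random
from statistics import pvariance


def select_features(rows, min_variance):
    """Return indices of columns whose population variance is >= min_variance."""
    n_cols = len(rows[0])
    return [j for j in range(n_cols)
            if pvariance([row[j] for row in rows]) >= min_variance]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Plain k-means (Lloyd's algorithm) with a deterministic farthest-point start."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next starting centroid: the point farthest from all chosen centroids.
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids)))
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda i: squared_distance(p, centroids[i]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic data (hypothetical): two sub-groups of 20 patients each.
# Columns: prescriptions per year, mean daily dose, a flag that never varies.
random.seed(0)
low_use = [[random.gauss(2, 0.5), random.gauss(10, 1), 1.0] for _ in range(20)]
high_use = [[random.gauss(12, 0.5), random.gauss(40, 1), 1.0] for _ in range(20)]
rows = low_use + high_use

# Step 1: feature selection drops the uninformative constant column.
selected = select_features(rows, min_variance=0.01)
reduced = [[row[j] for j in selected] for row in rows]

# Step 2: cluster analysis on the retained features.
labels, centroids = kmeans(reduced, k=2)
```

On real data the retained features would usually be standardised before clustering, and the number of clusters would need to be chosen and validated rather than fixed in advance; the sketch skips both for brevity.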

Citation (APA)

Khalid, S., & Prieto-Alhambra, D. (2019). Machine Learning for Feature Selection and Cluster Analysis in Drug Utilisation Research. Current Epidemiology Reports, 6(3), 364–372. https://doi.org/10.1007/s40471-019-00211-7
