Identification of disease-distinct complex biomarker patterns by means of unsupervised machine-learning using an interactive R toolbox (Umatrix)

  • Lötsch J
  • Lerch F
  • Djaldetti R
  • et al.

This article is free to access.

Abstract

Unsupervised machine-learned analysis of cluster structures using emergent self-organizing feature maps (ESOM) combined with the unified distance matrix (U-matrix) has been shown to provide an unbiased method for identifying true clusters. It outperforms classical hierarchical clustering algorithms, which carry a considerable tendency to produce erroneous results. To facilitate the application of the ESOM/U-matrix method in biomedical research, we introduce the interactive R-based bioinformatics tool “Umatrix”, which enables valid identification of a biologically meaningful cluster structure in the data by training a Kohonen-type self-organizing map followed by interface-guided interactive clustering on the emergent U-matrix map. The ability to detect clinically relevant subgroups was tested on a data set comprising plasma concentrations of d = 25 lipid markers, including endocannabinoids, lysophosphatidic acids, ceramides and sphingolipids, acquired from n = 100 patients with Parkinson's disease and n = 100 controls. Following ESOM training, clear data structures in the high-dimensional data space were observed on the U-matrix, allowing almost perfect separation of patients from controls. When the data structure was destroyed by Monte-Carlo random resampling, the U-matrix became unstructured and patients and controls were mixed. The results obtained are biologically plausible and supported by empirical evidence of a regulation of several classes of lipids in Parkinson's disease. Sophisticated analysis of structures in biomedical data provides a basis for the mechanistic interpretation of the observations and facilitates subsequent analyses focusing on hypothesis testing. The freely available R library “Umatrix” provides an interactive tool for broader application of unsupervised machine learning on complex biomedical data.
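
The two steps named in the abstract, training a Kohonen-type self-organizing map and reading cluster borders off the U-matrix, can be illustrated with a minimal sketch. It is written in plain Python rather than R and is not the Umatrix package's API: every function name, the toy data, the small planar 6 x 6 grid and the column-wise permutation used as a stand-in for the Monte-Carlo resampling are assumptions made for illustration. Emergent SOMs are normally far larger than this toy map.

```python
import math
import random


def make_two_clusters(n_per_group=20, dim=3, shift=5.0, seed=1):
    """Hypothetical toy data: two Gaussian groups, `shift` apart in every dimension."""
    rng = random.Random(seed)
    group_a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_per_group)]
    group_b = [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n_per_group)]
    return group_a + group_b


def best_matching_unit(weights, x):
    """Grid position (row, col) of the neuron whose weight vector is closest to x."""
    best, best_dist = (0, 0), float("inf")
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            d = math.dist(w, x)
            if d < best_dist:
                best, best_dist = (i, j), d
    return best


def train_som(data, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Online Kohonen training with linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2
    # Initialise each neuron with a copy of a randomly chosen data point.
    weights = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        radius = max(1.0, radius0 * (1.0 - frac))
        order = list(data)
        rng.shuffle(order)
        for x in order:
            bi, bj = best_matching_unit(weights, x)
            for i in range(rows):
                for j in range(cols):
                    d2 = (i - bi) ** 2 + (j - bj) ** 2
                    if d2 <= radius ** 2:
                        # Gaussian neighbourhood: neurons near the winner move most.
                        h = math.exp(-d2 / (2.0 * radius ** 2))
                        w = weights[i][j]
                        for k in range(dim):
                            w[k] += lr * h * (x[k] - w[k])
    return weights


def u_matrix(weights):
    """U-height per neuron: mean distance to its grid neighbours (planar grid)."""
    rows, cols = len(weights), len(weights[0])
    u = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dists = [
                math.dist(weights[i][j], weights[ni][nj])
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                if 0 <= ni < rows and 0 <= nj < cols
            ]
            u[i][j] = sum(dists) / len(dists)
    return u


def permute_columns(data, seed=2):
    """Shuffle each feature independently across samples (assumed null model)."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*data)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


data = make_two_clusters()
u = u_matrix(train_som(data, rows=6, cols=6))
u_null = u_matrix(train_som(permute_columns(data), rows=6, cols=6))
print("U-height range, original data:", min(map(min, u)), max(map(max, u)))
print("U-height range, permuted data:", min(map(min, u_null)), max(map(max, u_null)))
```

High U-heights mark neurons whose neighbours are far away in data space, which is where borders between clusters would be expected to appear. Comparing the map from the original data with one from permuted data mirrors, in spirit only, the resampling control described in the abstract.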

Cite

Lötsch, J., Lerch, F., Djaldetti, R., Tegeder, I., & Ultsch, A. (2018). Identification of disease-distinct complex biomarker patterns by means of unsupervised machine-learning using an interactive R toolbox (Umatrix). Big Data Analytics, 3(1). https://doi.org/10.1186/s41044-018-0032-1
