LAK: Lasso and K-Means Based Single-Cell RNA-Seq Data Clustering Analysis

This article is free to access.

Abstract

Single-cell RNA sequencing provides a way to obtain marker genes of different cells, which lays the foundation for discovering new cell types. The general strategy for achieving this goal is to build a clustering pipeline and derive differentially expressed genes, followed by cell type enrichment analysis and driving force analysis. Throughout the analysis, choosing a clustering model and an appropriate dimension reduction method are two vital and challenging tasks. In this study, we present LAK, a novel computational pipeline for single-cell RNA-seq data clustering analysis that uses a Lasso and K-means based feature selection method to select candidate genes. To deal with sparse, high-dimensional data, we integrated a Lasso penalty into the clustering method as the feature selection step, which extracts the genes that actually affect the clustering. We also improved the parameter selection algorithm so that it searches for appropriate parameters automatically by binary search according to the size of the data. Compared with other computational approaches, LAK achieves better reliability, stability, convenience, and accuracy on real datasets, simulated data, and datasets with a large number of dropout events.
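The idea of combining a Lasso penalty with K-means for gene selection can be illustrated with a minimal sketch. The abstract does not give LAK's exact formulation, so this is an assumption: the code follows a generic sparse K-means scheme in which each gene receives a non-negative weight, the weight vector has unit L2 norm, and its L1 norm is capped by a bound `s`. The function names (`sparse_kmeans`, `lasso_weights`), the bound `s`, and the toy data are all hypothetical. The binary search here only locates the soft threshold that satisfies the L1 bound; how LAK searches its parameters according to data size is not described in the abstract and may differ.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns one integer label per row (cell)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # leave an empty cluster's center unchanged
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def between_cluster_ss(X, labels):
    """Per-gene between-cluster sum of squares (total SS minus within SS)."""
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for c in np.unique(labels):
        Xc = X[labels == c]
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return np.maximum(total - within, 0.0)  # clip tiny negative rounding errors

def lasso_weights(bcss, s, n_steps=60):
    """Gene weights with unit L2 norm and L1 norm <= s (assumes s >= 1)."""
    w = bcss / np.linalg.norm(bcss)
    if w.sum() <= s:
        return w  # the L1 bound is already satisfied without thresholding
    lo, hi, best = 0.0, float(bcss.max()), None
    for _ in range(n_steps):  # binary search on the soft threshold
        mid = (lo + hi) / 2
        w = np.maximum(bcss - mid, 0.0)  # soft-thresholding (Lasso step)
        norm = np.linalg.norm(w)
        if norm == 0.0:
            hi = mid
            continue
        w = w / norm
        if w.sum() > s:
            lo = mid  # still too dense: raise the threshold
        else:
            hi, best = mid, w  # feasible: try a smaller threshold
    if best is None:  # fallback: keep only the single strongest gene
        best = np.zeros_like(bcss)
        best[np.argmax(bcss)] = 1.0
    return best

def sparse_kmeans(X, k, s, n_rounds=5, seed=0):
    """Alternate K-means on weighted genes with Lasso-style weight updates."""
    n_genes = X.shape[1]
    w = np.full(n_genes, 1.0 / np.sqrt(n_genes))
    for _ in range(n_rounds):
        labels = kmeans(X * np.sqrt(w), k, seed=seed)
        w = lasso_weights(between_cluster_ss(X, labels), s)
    return labels, w

# Toy matrix (not real scRNA-seq data): 60 cells x 50 genes,
# where only the first 5 genes differ between the 3 groups.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 50))
truth = np.repeat([0, 1, 2], 20)
X[:, :5] += 6.0 * truth[:, None]

labels, w = sparse_kmeans(X, k=3, s=2.0)
selected_genes = np.flatnonzero(w)  # genes that keep a non-zero weight
```

Genes whose weight is driven to zero drop out of the distance computation in the next round, which is how the penalty acts as feature selection. A smaller `s` keeps fewer genes, so in practice this bound would itself need tuning.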

Hua, J., Liu, H., Zhang, B., & Jin, S. (2020). LAK: Lasso and K-Means Based Single-Cell RNA-Seq Data Clustering Analysis. IEEE Access, 8, 129679–129688. https://doi.org/10.1109/ACCESS.2020.3008681
