LAK: Lasso and K-Means Based Single-Cell RNA-Seq Data Clustering Analysis

Jiao Hua; Hongkun Liu; Boyang Zhang; Shuilin Jin

Journal Article, Open Access

IEEE Access (2020), vol. 8, pp. 129679-129688

DOI: 10.1109/ACCESS.2020.3008681

Abstract

Single-cell RNA sequencing provides a way to obtain marker genes for different cells, which lays the foundation for discovering new cell types. The general strategy for achieving this goal is to build a clustering pipeline and derive differentially expressed genes, followed by cell type enrichment analysis and driving force analysis. Throughout this analysis, choosing a clustering model and choosing an appropriate dimension reduction method are two vital and challenging tasks. In this study, we present LAK, a computational pipeline for single-cell RNA-seq clustering analysis that uses a Lasso and K-means based feature selection method to select candidate genes. To deal with sparse, high-dimensional data, we integrate a Lasso penalty into the clustering method as the feature selection step, which extracts the genes that actually affect the clustering. We also improve the parameter selection algorithm so that appropriate parameters are found automatically by binary search according to the size of the data. Compared with other computational approaches, LAK performs better in reliability, stability, convenience and accuracy on real datasets, simulated data, and datasets with a large number of dropout events.
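The abstract does not give LAK's exact formulation, so the following is only a minimal sketch of the general idea of putting a Lasso (L1) constraint on per-gene weights inside K-means, in the style of sparse K-means. It is not the authors' implementation. All function names, the L1 bound `s`, and the farthest-point initialisation are assumptions made for illustration. The binary search here finds the soft-threshold level for the weights and does not reproduce LAK's own data-size-dependent parameter search.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means with a farthest-point initialisation (an assumption)."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    while len(centers) < k:
        # distance of every cell to its nearest chosen centre
        d = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        new = np.array([X[labels == j].mean(0) if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def per_feature_bcss(X, labels):
    """Between-cluster sum of squares of each gene (total minus within-cluster)."""
    total = ((X - X.mean(0)) ** 2).sum(0)
    within = np.zeros(X.shape[1])
    for j in np.unique(labels):
        Xj = X[labels == j]
        within += ((Xj - Xj.mean(0)) ** 2).sum(0)
    return total - within

def lasso_weights(bcss, s, n_steps=50):
    """Weights w >= 0 with ||w||_2 = 1 and ||w||_1 <= s (s >= 1), via soft-thresholding."""
    a = np.maximum(bcss, 0.0)
    if not a.any():                       # no signal at all: keep uniform weights
        return np.full(a.size, 1.0 / np.sqrt(a.size))
    w = a / np.linalg.norm(a)
    if w.sum() <= s:                      # L1 constraint already satisfied
        return w
    best = np.zeros_like(a)
    best[a.argmax()] = 1.0                # single-gene fallback, feasible for s >= 1
    lo, hi = 0.0, a.max()
    for _ in range(n_steps):              # binary search on the threshold level
        delta = 0.5 * (lo + hi)
        w = np.maximum(a - delta, 0.0)    # soft-threshold (a is non-negative)
        norm = np.linalg.norm(w)
        if norm == 0.0:
            hi = delta
            continue
        w = w / norm
        if w.sum() > s:
            lo = delta                    # still too dense: threshold harder
        else:
            best, hi = w, delta           # feasible: try a smaller threshold
    return best

def sparse_kmeans(X, k, s, n_outer=10, seed=0):
    """Alternate K-means on weighted genes with an L1-constrained weight update."""
    p = X.shape[1]
    w = np.full(p, 1.0 / np.sqrt(p))
    for _ in range(n_outer):
        labels = kmeans(X * np.sqrt(w), k, seed=seed)
        w_new = lasso_weights(per_feature_bcss(X, labels), s)
        converged = np.abs(w_new - w).sum() < 1e-4 * np.abs(w).sum()
        w = w_new
        if converged:
            break
    return labels, w
```

Calling `sparse_kmeans(X, k=2, s=2.0)` on a cells-by-genes matrix `X` returns the cluster labels from the last K-means run and a weight per gene. Genes whose weight is zero are the ones the L1 constraint has dropped, which is the sense in which the penalty acts as feature selection. A smaller `s` keeps fewer genes.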

Cite

Hua, J., Liu, H., Zhang, B., & Jin, S. (2020). LAK: Lasso and K-Means Based Single-Cell RNA-Seq Data Clustering Analysis. IEEE Access, 8, 129679–129688. https://doi.org/10.1109/ACCESS.2020.3008681
