Hi-LASSO: High-performance Python and Apache Spark packages for feature selection with high-dimensional data

Abstract

High-dimensional LASSO (Hi-LASSO) is a powerful feature selection tool for high-dimensional data. Our previous study showed that Hi-LASSO outperformed other state-of-the-art LASSO methods. However, the substantial cost of bootstrapping and the lack of experiments on a parametric statistical test for feature selection have impeded the use of Hi-LASSO in practical applications. In this paper, we present a Python package and its Spark library, both designed to run efficiently in parallel on real-world problems and to provide parametric statistical tests for feature selection on high-dimensional data. We demonstrate Hi-LASSO's superior performance in a range of intensive, practically oriented experiments. With these packages, Hi-LASSO can be applied to feature selection efficiently and easily. The Hi-LASSO packages are publicly available at https://github.com/dataxlab/Hi-LASSO under the MIT license. They can be installed with Python's pip, and additional documentation is available at https://pypi.org/project/hi-lasso and https://pypi.org/project/Hi-LASSO-spark.

Cite

Jo, J., Jung, S., Park, J., Kim, Y., & Kang, M. (2022). Hi-LASSO: High-performance Python and Apache Spark packages for feature selection with high-dimensional data. PLoS ONE, 17(12). https://doi.org/10.1371/journal.pone.0278570
