Isolation-based feature selection for unsupervised outlier detection


Abstract

For high-dimensional datasets, redundant features and complex interactions between features may increase computational costs and make outlier detection algorithms inefficient. Most feature selection methods are designed for supervised classification and regression; comparatively little work has addressed unsupervised outlier detection. This paper proposes a novel isolation-based feature selection (IBFS) method for unsupervised outlier detection, built on the training process of isolation forest (IFOR). The effectiveness of the proposed method is demonstrated on a simulated dataset and benchmarked against variance, Laplacian score and kurtosis. The evaluation results show that IBFS is insensitive to feature scaling. The performance of the proposed method is further benchmarked using one-class support vector machine (OCSVM), IFOR and local outlier factor (LOF) on several real-world datasets. The results demonstrate that the proposed method can improve the performance of IFOR. IBFS performs comparably to, and in some cases better than, kurtosis, a well-known outlier indicator, and outperforms variance and Laplacian score. Additionally, IBFS achieves good performance with only a few high-scoring features, whereas the other feature selection methods require more.
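The abstract states only that IBFS is derived from the training process of an isolation forest; the exact scoring rule is not given here. The sketch below illustrates the general idea under an assumed heuristic: features are credited each time they are used in a split, weighted by how strongly that split isolates points (imbalanced partitions isolate faster). The function name isolation_feature_scores and the imbalance weighting are illustrative assumptions, not the authors' formulation.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def isolation_feature_scores(X, n_estimators=100, random_state=0):
    """Hypothetical isolation-based feature scoring (not the paper's exact IBFS).

    Each feature is credited for the splits it provides in the isolation
    forest, weighted by how unevenly the split partitions the node's samples.
    """
    forest = IsolationForest(n_estimators=n_estimators,
                             random_state=random_state).fit(X)
    scores = np.zeros(X.shape[1])
    for est, feat_idx in zip(forest.estimators_, forest.estimators_features_):
        tree = est.tree_
        for node in range(tree.node_count):
            f = tree.feature[node]
            if f < 0:  # leaf node, no split
                continue
            left = tree.children_left[node]
            right = tree.children_right[node]
            n_left = tree.n_node_samples[left]
            n_right = tree.n_node_samples[right]
            n_total = tree.n_node_samples[node]
            # Splits that peel off a small group isolate points quickly,
            # so reward imbalanced partitions more than balanced ones.
            scores[feat_idx[f]] += abs(n_left - n_right) / n_total
    return scores / scores.sum()

# Example usage: keep only the top-k scoring features before running
# OCSVM, IFOR or LOF on the reduced data.
# scores = isolation_feature_scores(X)
# top_k = np.argsort(scores)[::-1][:5]
# X_reduced = X[:, top_k]
```

Because the scores depend only on which features the trees split on, not on the raw magnitudes of the feature values, a scheme of this kind is naturally insensitive to feature scaling, consistent with the behavior reported for IBFS.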

Citation (APA)

Yang, Q., Singh, J., & Lee, J. (2019). Isolation-based feature selection for unsupervised outlier detection. In Proceedings of the Annual Conference of the Prognostics and Health Management Society, PHM (Vol. 11). Prognostics and Health Management Society. https://doi.org/10.36001/phmconf.2019.v11i1.824
