This chapter addresses one research issue in outlier detection: the dimensionality of the data. Specifically, it focuses on detecting outliers embedded in subspaces of high-dimensional categorical data. To this end, it presents algorithms for unsupervised feature subset selection in the categorical data domain, together with a detailed discussion of measures for assessing the relevance and redundancy of categorical attributes (features). An experimental study on benchmark categorical data sets evaluates the efficacy of these algorithms for outlier detection.
CITATION STYLE
Ranga Suri, N. N. R., Murty, M. N., & Athithan, G. (2019). Outliers in high dimensional data. In Intelligent Systems Reference Library (Vol. 155, pp. 95–111). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-05127-3_6