The scatter plot is a useful method for visualising clusters and outliers in continuous data. However, it cannot be applied directly to nominal data, because nominal values have no natural ordering and no notion of ‘distance’. One solution is to map the multi-dimensional nominal data to a numeric space, and then draw a scatter plot of the data points using the first two principal components of that space. This paper reports a study on how such plots can be generated using three types of mapping: (a) Binary Input Mapping (BImap), (b) Attribute Value Frequency Mapping (AVFmap), and (c) BImap combined with AVFmap. Results show that the combined method draws upon the complementary strengths of BImap and AVFmap, generating meaningful scatter plots for visualising categorical outliers and achieving the highest information gain among the methods tested.
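The pipeline described above (map nominal values to numbers, then project onto the first two principal components) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: it assumes BImap amounts to one-hot encoding, AVFmap replaces each value with its relative frequency within its attribute, and the combined mapping is a plain column-wise concatenation of the two. The function names `bimap`, `avfmap`, and `first_two_pcs`, and the toy data, are hypothetical.

```python
import numpy as np

def bimap(rows):
    """Assumed reading of BImap: one binary column per distinct attribute value."""
    blocks = []
    for col in zip(*rows):
        values = sorted(set(col))
        blocks.append(np.array([[1.0 if v == u else 0.0 for u in values] for v in col]))
    return np.hstack(blocks)

def avfmap(rows):
    """Assumed reading of AVFmap: each value becomes its relative frequency in its attribute."""
    n = len(rows)
    columns = []
    for col in zip(*rows):
        counts = {}
        for v in col:
            counts[v] = counts.get(v, 0) + 1
        columns.append([counts[v] / n for v in col])
    return np.array(columns).T

def first_two_pcs(X):
    """Project the mean-centred rows of X onto the first two principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

# Toy nominal data (made up for illustration): the last record uses rare values.
rows = [
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("green", "huge"),
]

# Assumed combination: concatenate the two mappings column-wise.
combined = np.hstack([bimap(rows), avfmap(rows)])
points = first_two_pcs(combined)  # one (x, y) pair per record, ready for a scatter plot
```

The resulting `points` array can be passed to any plotting library to draw the scatter plot; records built from rare attribute values would be expected to sit away from the main cluster, although how clearly they separate depends on the data and on the mapping chosen.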
Tan, S. C. (2014). Visualising Outliers in Nominal Data. In Springer Proceedings in Complexity (pp. 347–358). Springer. https://doi.org/10.1007/978-94-007-7287-8_28