The self-organizing map (SOM) methodology performs vector quantization and clustering on a dataset, then projects the resulting clusters onto a lower-dimensional space, such as a 2D map, by placing similar clusters at locations that are spatially close in that space. This makes SOM an effective tool for data visualization. However, in a world where information mined from big data has to be available immediately, SOM becomes an unattractive tool because of its time complexity. In this paper, we propose an alternative visualization methodology for large datasets that emulates the SOM methodology without the speed constraints inherent to SOM. To demonstrate the efficiency and potential of the proposed scheme as a fast visualization tool, the methodology is used to cluster and project the 3,823 image samples of handwritten digits in the Optical Recognition of Handwritten Digits dataset. Although the dataset is by no means large, it is sufficient to demonstrate the speed-up that can be achieved with the proposed SOM emulation procedure.
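For orientation, the conventional SOM procedure that the abstract refers to can be sketched as follows. This is a minimal illustration of a classic online SOM, not the fast emulation proposed in the paper. The function names (`train_som`, `project`), the grid size, the linear decay schedules, and the Gaussian neighborhood are all assumptions chosen for brevity.

```python
import numpy as np

def train_som(data, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a classic online SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    n, dim = data.shape
    # Initialize node weights from randomly chosen samples (one common choice).
    picks = rng.integers(0, n, size=rows * cols)
    weights = data[picks].reshape(rows, cols, dim).astype(float)
    # Grid coordinates of every node, used by the neighborhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(n):
            x = data[idx]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed linear decay with a floor
            # Vector quantization step: find the best matching unit (BMU).
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Pull the BMU and its grid neighbors toward the sample, so that
            # similar clusters end up at nearby map locations.
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def project(data, weights):
    """Map each sample to the (row, col) grid position of its BMU."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    bmu = np.argmin(d, axis=1)
    return np.stack(np.unravel_index(bmu, weights.shape[:2]), axis=1)
```

Calling `project(data, train_som(data))` yields one 2D grid coordinate per sample. In this sketch every sample is compared against every node in every epoch, so training time grows with the product of samples, nodes, and epochs, which is the kind of cost the abstract identifies as a constraint on large datasets.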
Cordel, M. O., & Azcarraga, A. P. (2015). Fast emulation of self-organizing maps for large datasets. In Procedia Computer Science (Vol. 52, pp. 381–388). Elsevier B.V. https://doi.org/10.1016/j.procs.2015.05.002