Outlier detection is an important problem in many industrial and financial applications. Most outlier detection methods suffer from two problems: first, they require parameter tuning based on domain knowledge; second, they do not scale to high-dimensional spaces. In this paper, we propose a distance-based outlier definition and a detection algorithm, ODDC (Distribution Clustering Outlier Detection). We reformulate the problem by clustering in the distribution difference space rather than in the original feature space. As a result, the new algorithm is stable across different inputs and scales with dimensionality. Experiments on both synthetic and real datasets show that ODDC outperforms its counterpart in both effectiveness and efficiency.
CITATION STYLE
Niu, K., Huang, C., Zhang, S., & Chen, J. (2007). Emerging Technologies in Knowledge Discovery and Data Mining. Emerging Technologies in Knowledge Discovery and Data Mining (Vol. 4819, pp. 332–343). Retrieved from http://www.springerlink.com/index/10.1007/978-3-540-77018-3