Research in anomaly detection suffers from a lack of realistic, publicly available datasets. As a result, most published experiments validate their algorithms on application-specific case studies or on benchmark datasets of the researchers' own construction. This makes it difficult to compare methods or to measure progress in the field. It also limits our ability to understand the factors that determine the performance of anomaly detection algorithms. This article proposes a new methodology for the empirical analysis and evaluation of anomaly detection algorithms. The methodology generates thousands of benchmark datasets by transforming existing supervised learning benchmark datasets and manipulating properties relevant to anomaly detection. The article identifies and validates four important dimensions: (a) point difficulty, (b) relative frequency of anomalies, (c) clusteredness of anomalies, and (d) relevance of features. We use the generated datasets to analyze several leading anomaly detection algorithms. The evaluation verifies the importance of these dimensions and shows that, while some algorithms are clearly superior to others, anomaly detection accuracy is determined more by variation in the four dimensions than by the choice of algorithm.
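To make the idea of transforming a supervised dataset concrete, the following is a minimal sketch of one such transformation. It is not the article's actual procedure: it only controls dimension (b), the relative frequency of anomalies, and leaves point difficulty, clusteredness, and feature relevance untouched. The function name `make_benchmark` and its parameters are hypothetical, as is the assumption that anomalies are simply the points belonging to designated classes.

```python
import random


def make_benchmark(points, labels, anomaly_classes, anomaly_rate, seed=0):
    """Build one anomaly detection benchmark from a labeled dataset.

    Assumption of this sketch: points whose label is in `anomaly_classes`
    are the anomaly candidates; all other points are kept as normal points.
    `anomaly_rate` is the target fraction of anomalies in the output.
    Returns (benchmark_points, is_anomaly) in shuffled order.
    """
    if not 0.0 < anomaly_rate < 1.0:
        raise ValueError("anomaly_rate must lie strictly between 0 and 1")

    rng = random.Random(seed)
    normals = [p for p, y in zip(points, labels) if y not in anomaly_classes]
    candidates = [p for p, y in zip(points, labels) if y in anomaly_classes]

    # Choose k so that k / (len(normals) + k) is close to anomaly_rate.
    k = round(anomaly_rate * len(normals) / (1.0 - anomaly_rate))
    # If too few candidates exist, the achieved rate falls below the target.
    k = min(k, len(candidates))

    chosen = rng.sample(candidates, k)
    rows = [(p, False) for p in normals] + [(p, True) for p in chosen]
    rng.shuffle(rows)
    return [p for p, _ in rows], [flag for _, flag in rows]
```

Calling such a function repeatedly with different seeds, anomaly classes, and target rates would yield a family of benchmarks from a single source dataset, which is the general spirit of the approach described above.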