Anomaly detection is a commonly used approach for constructing intrusion detection systems. A key requirement is that the data used for building the resource profile are indeed attack-free, but this issue is often skipped or taken for granted. In this work we consider the problem of corruption in the learning data, with respect to a specific detection system, i.e., a web site integrity checker. We used corrupted learning sets and observed their impact on performance (in terms of false positives and false negatives). This analysis enabled us to gain important insights into this rather unexplored issue. Based on this analysis we also present a procedure for detecting whether a learning set is corrupted. We evaluated the performance of our proposal and obtained very good results up to a corruption rate close to 50%. Our experiments are based on collections of real data and consider three different flavors of anomaly detection. © Springer-Verlag Berlin Heidelberg 2007.
CITATION STYLE
Medvet, E., & Bartoli, A. (2007). On the effects of learning set corruption in anomaly-based detection of web defacements. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4579 LNCS, pp. 60–78). Springer Verlag. https://doi.org/10.1007/978-3-540-73614-1_4