Unsupervised anomaly detection

David Guthrie; Louise Guthrie; Ben Allison; Yorick Wilks

Conference Proceedings, Open Access

IJCAI International Joint Conference on Artificial Intelligence (2007) 1624-1628

DOI: 10.5121/csit.2024.140210

32 Citations

75 Readers

Abstract

This paper describes work on the detection of anomalous material in text. We show several variants of an automatic technique for identifying an 'unusual' segment within a document, and consider texts which are unusual because of author, genre [Biber, 1998], topic or emotional tone. We evaluate the technique using many experiments over large document collections, created to contain randomly inserted anomalous segments. To identify anomalies in text, we define more than 200 stylistic features to characterize writing, some of which are well-established stylistic determiners, but many of which are novel. Using these features with each of our methods, we examine the effect of segment size on our ability to detect anomalies, allowing segments of size 100 words, 500 words and 1000 words. We show substantial improvements over a baseline in all cases for all methods, and identify the method variant which performs consistently better than others.
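The abstract does not name its features or its ranking methods, so the following is only a minimal sketch of the general idea: describe each segment with a stylistic feature vector, then rank segments by how far they sit from the rest of the document. The four features, the z-score distance, and the function names `features` and `rank_segments` are all assumptions made for illustration, not the authors' method.

```python
import math
import re
from statistics import mean, pstdev


def features(text):
    """Return a small, illustrative stylistic feature vector for one segment.

    These four features are hypothetical stand-ins for the 200+ features
    mentioned in the abstract, which are not listed there.
    """
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n,            # average word length
        len({w.lower() for w in words}) / n,       # type-token ratio
        n / max(len(sentences), 1),                # average sentence length in words
        sum(text.count(c) for c in ",;:") / n,     # punctuation marks per word
    ]


def rank_segments(segments):
    """Return segment indices ordered from most to least unusual.

    Assumed scoring: standardise each feature across all segments, then use
    the Euclidean norm of a segment's z-scores as its anomaly score.
    """
    vectors = [features(s) for s in segments]
    columns = list(zip(*vectors))
    mus = [mean(c) for c in columns]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in columns]
    scores = []
    for v in vectors:
        z = [(x - m) / s for x, m, s in zip(v, mus, sigmas)]
        scores.append(math.sqrt(sum(t * t for t in z)))
    return sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
```

Calling `rank_segments` on a list of fixed-size segments (for example 100, 500 or 1000 words each, as in the abstract) returns indices with the most unusual segment first. No labelled training data is needed, which is what makes the setup unsupervised.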

Citation (APA)

Guthrie, D., Guthrie, L., Allison, B., & Wilks, Y. (2007). Unsupervised anomaly detection. In IJCAI International Joint Conference on Artificial Intelligence (pp. 1624–1628). https://doi.org/10.5121/csit.2024.140210
