Learning from others: User anomaly detection using anomalous samples from other users

Youngja Park; Ian M. Molloy; Suresh N. Chari; Zenglin Xu; Chris Gates; Ninghui Li

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9327 396-414

DOI: 10.1007/978-3-319-24177-7_20

Abstract

Machine learning (ML) is increasingly used as a key technique for solving security problems such as botnet detection, transactional fraud, and insider threat. One of the key challenges to the widespread application of ML in security is the lack of labeled samples from real applications. For known or common attacks, labeled samples are available, so supervised techniques such as multi-class classification can be used. In many security applications, however, labeled samples are difficult to obtain because each attack can be unique. To detect novel, unseen attacks, researchers have used unsupervised outlier detection or one-class classification approaches, which treat all existing samples as benign. These methods, however, yield high false positive rates, which prevents their adoption in real applications. This paper presents a local outlier factor (LOF)-based method to automatically generate both benign and malicious training samples from unlabeled data. Our method is designed for applications with multiple users such as insider threat, fraud detection, and social network analysis. For each target user, we compute LOF scores of all samples with respect to the target user’s samples. This allows us to identify (1) other users’ samples that lie in the boundary regions and (2) outliers from the target user’s samples that can distort the decision boundary. We use the samples from other users as malicious samples, and use the target user’s samples as benign samples after removing the outliers. We validate the effectiveness of our method using several datasets including access logs for valuable corporate resources, DBLP paper titles, and behavioral biometrics of user typing behavior. The evaluation of our method on these datasets confirms that, in almost all cases, our technique performs significantly better than both one-class classification methods and prior two-class classification methods. Further, our method is a general technique that can be used for many security applications.
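
The sample-generation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`lof_scores`, `build_training_set`), the neighborhood size `k`, the outlier threshold, and the LOF range used to define the "boundary region" are all hypothetical choices made for illustration, since the abstract does not specify them. The sketch uses the standard LOF definition, computed only against the target user's samples, in pure Python.

```python
import math

def _knn(q, ref, k, skip=None):
    """Indices of the k reference points nearest to q (optionally skipping one index)."""
    order = sorted((i for i in range(len(ref)) if i != skip),
                   key=lambda i: math.dist(q, ref[i]))
    return order[:k]

def lof_scores(target, others, k=3):
    """LOF of target and other-user samples, both measured against `target` only.

    Assumes len(target) > k; this sketch does not handle smaller sets.
    """
    # Neighbors and k-distance of every target sample (a sample is not its own neighbor).
    nbrs = [_knn(p, target, k, skip=i) for i, p in enumerate(target)]
    kdist = [math.dist(p, target[n[-1]]) for p, n in zip(target, nbrs)]

    def lrd(q, nn):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(kdist[o], math.dist(q, target[o])) for o in nn]
        return len(nn) / max(sum(reach), 1e-12)

    lrd_t = [lrd(p, n) for p, n in zip(target, nbrs)]

    def score(q, nn):
        # LOF: mean density of the neighbors divided by the point's own density.
        return (sum(lrd_t[o] for o in nn) / len(nn)) / lrd(q, nn)

    t_scores = [score(p, n) for p, n in zip(target, nbrs)]
    o_scores = [score(q, _knn(q, target, k)) for q in others]
    return t_scores, o_scores

def build_training_set(target, others, k=3, outlier_thr=1.5, boundary=(1.5, 5.0)):
    """Return (benign, malicious) samples for one target user.

    outlier_thr and boundary are hypothetical thresholds, not values from the paper.
    """
    t_scores, o_scores = lof_scores(target, others, k)
    # Benign: the target user's own samples with the outliers removed.
    benign = [p for p, s in zip(target, t_scores) if s <= outlier_thr]
    # Malicious: other users' samples whose LOF falls in the assumed boundary range.
    low, high = boundary
    malicious = [q for q, s in zip(others, o_scores) if low <= s <= high]
    return benign, malicious
```

Under these assumptions, calling `build_training_set(target, others)` for each user in turn yields a per-user two-class training set. The resulting `benign` and `malicious` lists could then be passed to any binary classifier. Treating "boundary region" as a band of moderately elevated LOF scores is one interpretation of the abstract, not a detail it confirms.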

Cite

Park, Y., Molloy, I. M., Chari, S. N., Xu, Z., Gates, C., & Li, N. (2015). Learning from others: User anomaly detection using anomalous samples from other users. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9327, pp. 396–414). Springer Verlag. https://doi.org/10.1007/978-3-319-24177-7_20
