Data Stream Clustering for Real-Time Anomaly Detection: An Application to Insider Threats

Diana Haidar; Mohamed Medhat Gaber

Book Chapter

DOI: 10.1007/978-3-319-97864-2_6

Abstract

Insider threat detection is an emerging concern for academia, industry, and governments due to the growing number of insider incidents in recent years. The continuous streaming of unbounded data from various sources in an organisation, typically at high velocity, leads to a typical Big Data computational problem. A malicious insider threat manifests as anomalous behaviours (outliers) that deviate from the normal baseline of a data stream. The absence of previously logged activities executed by users shapes insider threat detection into an unsupervised anomaly detection problem over a data stream. A common shortcoming of existing data mining approaches to insider threat detection is the high number of false alarms, or false positives (FPs). To handle the Big Data issue and to address this shortcoming, we propose a streaming anomaly detection approach, namely Ensemble of Random subspace Anomaly detectors In Data Streams (E-RAIDS), for insider threat detection. E-RAIDS learns an ensemble of p established outlier detection techniques (Micro-cluster-based Continuous Outlier Detection, MCOD, or Anytime Outlier Detection, AnyOut), which employ clustering over continuous data streams. Each of the p models learns from a random feature subspace to detect local outliers, which might not be detected over the whole feature space. E-RAIDS introduces an aggregate component that combines the results from the p feature subspaces in order to decide whether to generate an alarm at each window iteration. The merit of E-RAIDS is that it defines a survival factor and a vote factor to address the high number of FPs. Experiments on E-RAIDS-MCOD and E-RAIDS-AnyOut are carried out on synthetic data sets including malicious insider threat scenarios generated at Carnegie Mellon University, to test the effectiveness of voting feature subspaces and the capability to detect (more than one)-behaviour-all-threat in real time. The results show that E-RAIDS
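
The ensemble-and-vote idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration only: the base detector is a simple per-feature z-score test standing in for MCOD or AnyOut, the survival factor is read as the number of consecutive windows in which a subspace keeps flagging the same instance, and the vote factor is read as the number of subspaces that must agree before an alarm is raised. The names `flag_outliers`, `SubspaceEnsemble`, and `process_window` are hypothetical.

```python
import random
import statistics


def flag_outliers(window, features, threshold=2.5):
    """Stand-in base detector (NOT MCOD or AnyOut): flag an instance when
    its z-score on any feature of the subspace exceeds `threshold`."""
    flagged = set()
    for f in features:
        values = [vec[f] for _, vec in window]
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        if std == 0:
            continue  # constant feature in this window, nothing to flag
        for inst_id, vec in window:
            if abs(vec[f] - mean) / std > threshold:
                flagged.add(inst_id)
    return flagged


class SubspaceEnsemble:
    """p detectors, each restricted to a random feature subspace."""

    def __init__(self, n_features, p, subspace_size,
                 survival_factor, vote_factor, seed=0):
        rng = random.Random(seed)
        self.subspaces = [sorted(rng.sample(range(n_features), subspace_size))
                          for _ in range(p)]
        self.survival_factor = survival_factor  # assumed meaning: windows survived
        self.vote_factor = vote_factor          # assumed meaning: subspaces agreeing
        # Per subspace: instance id -> consecutive windows flagged so far.
        self.survived = [dict() for _ in range(p)]

    def process_window(self, window):
        """Return True if an alarm is raised for this window."""
        votes = 0
        for i, features in enumerate(self.subspaces):
            flagged = flag_outliers(window, features)
            # Instances not flagged again lose their survival count.
            self.survived[i] = {inst_id: self.survived[i].get(inst_id, 0) + 1
                                for inst_id in flagged}
            if any(count >= self.survival_factor
                   for count in self.survived[i].values()):
                votes += 1
        return votes >= self.vote_factor


# Toy stream: 30 instances with 4 features; instance 15 is anomalous.
stream = [(i, [(i * 7 % 5) * 0.1 + j for j in range(4)]) for i in range(30)]
stream[15] = (15, [100.0, 100.0, 100.0, 100.0])

ensemble = SubspaceEnsemble(n_features=4, p=3, subspace_size=2,
                            survival_factor=2, vote_factor=2)
# Sliding windows of 10 instances, advancing one instance at a time.
alarms = [ensemble.process_window(stream[s:s + 10]) for s in range(21)]
```

In this toy run the anomalous instance first enters the window at index 6, but no alarm is raised until the next window, because each subspace must see it survive for two consecutive windows. That delay is how a survival requirement can suppress one-off flags, at the cost of slightly later detection.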

Citation (APA)

Haidar, D., & Gaber, M. M. (2019). Data Stream Clustering for Real-Time Anomaly Detection: An Application to Insider Threats (pp. 115–144). https://doi.org/10.1007/978-3-319-97864-2_6
