Hybrid subspace mixture models for prediction and anomaly detection in high dimensions

Abstract

Robust learning of mixture models in high dimensions remains an open challenge, especially in the current big data era. This paper investigates twelve variants of hybrid mixture models that combine G-means clustering, Gaussian, and Student t-distribution mixture models for high-dimensional predictive modeling and anomaly detection. High-dimensional data is first reduced to a lower-dimensional subspace using whitened principal component analysis. For real-time data processing in batch mode, a technique based on the Gram-Schmidt orthogonalization process is proposed and demonstrated to update the reduced dimensions so that they remain relevant to the task objectives. In addition, a model-adaptation technique is proposed and demonstrated for big data incremental learning by statistically matching the mixture components’ mean and variance vectors; the adapted parameters are computed as a weighted average that takes into account the sample sizes of the new and older statistics, with a parameter to scale down the influence of older statistics in each iterative computation. The hybrid models’ performance is evaluated using simulation and empirical studies. Results show that simple hybrid models without the Expectation-Maximization training step can achieve performance in high dimensions comparable to that of the more sophisticated models. For unsupervised anomaly detection, the hybrid models achieve a detection rate ≳ 90% with injected anomalies from 1% to 60% using the KDD Cup 1999 network intrusion dataset.
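
The dimensionality-reduction step mentioned in the abstract can be illustrated with a minimal sketch of whitened principal component analysis. This is a generic textbook formulation using NumPy, not the authors' implementation; the function name `whitened_pca`, the synthetic data, and the choice of five components are assumptions made purely for illustration.

```python
import numpy as np

def whitened_pca(X, n_components):
    """Project X (n_samples, n_features) onto its leading principal
    components and rescale each projected dimension to unit variance.

    Hypothetical helper for illustration; not taken from the paper.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                              # center the data
    # Thin SVD of the centered data: Xc = U @ diag(S) @ Vt
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]             # principal directions (rows)
    # Standard deviation of the data along each retained direction
    scale = S[:n_components] / np.sqrt(X.shape[0] - 1)
    Z = (Xc @ components.T) / scale            # whitened low-dimensional scores
    return Z, mean, components, scale

# Synthetic correlated data (assumed for the example): 500 samples, 50 features
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 50)) @ rng.normal(size=(50, 50))

Z, mean, components, scale = whitened_pca(X, n_components=5)
```

After whitening, the retained dimensions are uncorrelated and have unit sample variance, which is the property that makes the reduced subspace convenient for fitting mixture components with simple covariance structure. New samples would be projected with the stored `mean`, `components`, and `scale`.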

Cite

Ong, J. B., & Ng, W. K. (2017). Hybrid subspace mixture models for prediction and anomaly detection in high dimensions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10604 LNAI, pp. 326–339). Springer Verlag. https://doi.org/10.1007/978-3-319-69179-4_23