Fast One-class Classification using Class Boundary-preserving Random Projections

Abstract

Several applications, such as malicious URL detection and web spam detection, require classification on very high-dimensional data. In such cases, anomalous data is hard to find but normal data is easily available. As a result, it is increasingly common to use a one-class classifier (OCC). Unfortunately, most OCC algorithms cannot scale to datasets with extremely high dimensions. In this paper, we present Fast Random projection-based One-Class Classification (FROCC), an extremely efficient, scalable and easily parallelizable method for one-class classification with provable theoretical guarantees. Our method is based on the simple idea of transforming the training data by projecting it onto a set of random unit vectors that are chosen uniformly and independently from the unit sphere, and bounding the regions based on separation of the data. FROCC can be naturally extended with kernels. We provide a new theoretical framework to prove that FROCC generalizes well, in the sense that it is stable and has low bias for some parameter settings. We then develop a fast, scalable approximation of FROCC using vectorization, exploiting data sparsity and parallelism to develop a new implementation called ParDFROCC. ParDFROCC achieves up to 2 percentage points better ROC than the next best baseline, with up to 12× speedup in training and test times over a range of state-of-the-art benchmarks for the OCC task.
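
The projection-and-bounding idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation or ParDFROCC: the function names, the gap parameter eps, and the rule that a point must fall inside an interval along every direction are assumptions made for illustration, based only on the description above.

```python
import numpy as np

def sample_unit_vectors(m, d, rng):
    # Normalizing i.i.d. Gaussian vectors gives directions that are
    # uniformly distributed on the unit sphere.
    w = rng.standard_normal((m, d))
    return w / np.linalg.norm(w, axis=1, keepdims=True)

def fit_intervals(X, W, eps):
    # Hypothetical training step: along each direction, sort the projected
    # training points and start a new interval wherever two neighbours are
    # more than eps apart (eps is an assumed separation parameter).
    P = X @ W.T  # shape (n_samples, n_directions)
    intervals = []
    for j in range(W.shape[0]):
        p = np.sort(P[:, j])
        breaks = np.flatnonzero(np.diff(p) > eps)
        starts = np.concatenate(([p[0]], p[breaks + 1]))
        ends = np.concatenate((p[breaks], [p[-1]]))
        intervals.append(np.column_stack((starts, ends)))
    return intervals

def predict(X, W, intervals):
    # Assumed decision rule: accept a point only if its projection lies
    # inside some training interval along every direction.
    P = X @ W.T
    inside = np.ones(X.shape[0], dtype=bool)
    for j, iv in enumerate(intervals):
        pj = P[:, j][:, None]
        inside &= np.any((pj >= iv[:, 0]) & (pj <= iv[:, 1]), axis=1)
    return inside

rng = np.random.default_rng(0)
X_train = rng.standard_normal((200, 5))   # synthetic "normal" class
W = sample_unit_vectors(20, 5, rng)
intervals = fit_intervals(X_train, W, eps=0.5)
outlier = np.full((1, 5), 100.0)          # a point far from the training data
print(predict(X_train, W, intervals).all(), predict(outlier, W, intervals)[0])
```

Each direction is processed independently of the others, which is consistent with the abstract's remark that the method is easily parallelizable; the sketch itself runs the directions in a plain loop.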

Cite

Bhattacharya, A., Varambally, S., Bagchi, A., & Bedathur, S. (2021). Fast One-class Classification using Class Boundary-preserving Random Projections. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 66–74). Association for Computing Machinery. https://doi.org/10.1145/3447548.3467440
