Fast Support Vector Machine classification of very large datasets

Janis Fehr; Karina Zapién Arreola; Hans Burkhardt

Conference Proceedings

Studies in Classification, Data Analysis, and Knowledge Organization (2008) 11-18

DOI: 10.1007/978-3-540-78246-9_2

Abstract

In many classification applications, Support Vector Machines (SVMs) have proven to be high-performing, easy-to-handle classifiers with very good generalization abilities. However, one drawback of the SVM is its rather high classification complexity, which scales linearly with the number of Support Vectors (SVs): to classify one sample, the kernel function has to be evaluated for all SVs. To speed up classification, different approaches have been published, most of which try to reduce the number of SVs. In our work, which is especially suitable for very large datasets, we follow a different approach: as we showed in (Zapien et al. 2006), it is effectively possible to approximate large SVM problems by decomposing the original problem into linear subproblems, where each subproblem can be evaluated in Ω(1). This approach is especially successful when the assumption holds that a large classification problem can be split into mainly easy and only a few hard subproblems. On standard benchmark datasets, this approach achieved great speedups while suffering only slightly in terms of classification accuracy and generalization ability. In this contribution, we extend the methods introduced in (Zapien et al. 2006) by using not only linear but also non-linear subproblems for the decomposition of the original problem, which further increases the classification performance with only a little loss in terms of speed. An implementation of our method is available in (Ronneberger et al.). Due to page limitations, we had to move some of the theoretical details (e.g. proofs) and extensive experimental results to a technical report (Zapien et al. 2007).
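The cost argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the variable `dual_coefs` (assumed to hold the products of Lagrange multiplier and label for each SV), and the random data are all hypothetical. The sketch shows that a general kernel decision function needs one kernel evaluation per SV, whereas for a linear kernel the sum over SVs folds into a single weight vector, so classifying a sample no longer depends on the number of SVs.

```python
import numpy as np

def rbf_kernel(s, x, gamma=0.5):
    # Non-linear kernel: cannot be folded into a single weight vector.
    diff = s - x
    return np.exp(-gamma * np.dot(diff, diff))

def kernel_decision(x, support_vectors, dual_coefs, bias, kernel):
    # One kernel evaluation per support vector, so the cost grows
    # linearly with len(support_vectors).
    total = 0.0
    for sv, coef in zip(support_vectors, dual_coefs):
        total += coef * kernel(sv, x)
    return total + bias

def collapse_linear(support_vectors, dual_coefs):
    # For the linear kernel k(s, x) = s . x the sum over SVs becomes
    # w = sum_i coef_i * s_i, computed once after training.
    return dual_coefs @ support_vectors

def linear_decision(x, w, bias):
    # A single dot product, independent of the number of SVs.
    return float(np.dot(w, x) + bias)

# Hypothetical data standing in for a trained model.
rng = np.random.default_rng(0)
svs = rng.normal(size=(1000, 5))   # 1000 support vectors, 5 features
coefs = rng.normal(size=1000)      # assumed alpha_i * y_i values
bias = 0.1
x = rng.normal(size=5)

w = collapse_linear(svs, coefs)
slow = kernel_decision(x, svs, coefs, bias, kernel=np.dot)
fast = linear_decision(x, w, bias)
rbf_value = kernel_decision(x, svs, coefs, bias, kernel=rbf_kernel)
```

Here `slow` and `fast` agree up to floating-point rounding, while `rbf_value` can only be obtained by looping over all 1000 SVs. This is the motivation for replacing one large non-linear problem with several cheap linear subproblems wherever the data allow it.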

Fehr, J., Zapién Arreola, K., & Burkhardt, H. (2008). Fast Support Vector Machine classification of very large datasets. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 11–18). Kluwer Academic Publishers. https://doi.org/10.1007/978-3-540-78246-9_2
