A MapReduce-based distributed SVM for scalable data type classification

Chong Jiang; Ting Wu; Jian Xu; Ning Zheng; Ming Xu; Tao Yang

Book Chapter

Springer Verlag, (2017), 115-126

DOI: 10.1007/978-3-319-59288-6_11

Abstract

Data type classification is a significant problem in digital forensics and information security. In previous work, methods based on the support vector machine (SVM) have proven the most successful among the classification approaches tried. However, SVM training becomes notably computationally intensive as the number of training vectors grows. In this study, we propose a parallel distributed SVM (PDSVM) based on Hadoop MapReduce for scalable data type classification. First, the map phase determines the support vectors (SVs) in each split of the dataset by running sequential minimal optimization (SMO). Then the reduce phase merges the SVs and computes the degree of global convergence. Finally, PDSVM uses the globally converged SVs to obtain the SVM model. The experimental results demonstrate that PDSVM can not only process large-scale training datasets but also performs well in terms of classification accuracy.
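The map, reduce, and final-model steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes several assumptions: plain Python functions stand in for Hadoop MapReduce jobs, a simple dual coordinate descent solver stands in for SMO, the kernel is linear, and "the merged SV set no longer changes" is used as a hypothetical convergence test, since the abstract does not say how the degree of global convergence is computed. All function names (`train_local_svm`, `map_phase`, `reduce_phase`, `pdsvm_train`, `predict`) are invented for illustration.

```python
# Sketch only: plain Python stands in for Hadoop MapReduce; all names are hypothetical.

def train_local_svm(points, C=1.0, epochs=50):
    """Train a linear soft-margin SVM; return (weights, support vectors).

    Assumption: dual coordinate descent is a simple stand-in for the SMO
    solver named in the abstract. `points` is a list of
    (feature_tuple, label) pairs with label in {-1, +1}.
    """
    dim = len(points[0][0]) + 1              # last weight acts as the bias term
    w = [0.0] * dim
    alpha = [0.0] * len(points)
    for _ in range(epochs):
        for i, (x, y) in enumerate(points):
            xb = tuple(x) + (1.0,)           # append constant feature for the bias
            q = sum(v * v for v in xb)
            grad = y * sum(wj * v for wj, v in zip(w, xb)) - 1.0
            new_alpha = min(max(alpha[i] - grad / q, 0.0), C)  # clip to [0, C]
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                alpha[i] = new_alpha
                w = [wj + delta * y * v for wj, v in zip(w, xb)]
    svs = [p for p, a in zip(points, alpha) if a > 1e-9]
    return w, svs


def map_phase(split, global_svs):
    """Map: find the SVs of one split combined with the current global SVs."""
    training = list(dict.fromkeys(list(split) + list(global_svs)))  # dedupe, keep order
    _, local_svs = train_local_svm(training)
    return local_svs


def reduce_phase(sv_lists):
    """Reduce: merge per-split SVs into one duplicate-free list."""
    return list(dict.fromkeys(p for svs in sv_lists for p in svs))


def pdsvm_train(splits, max_rounds=10):
    """Repeat map and reduce until the merged SV set stops changing.

    Assumption: an unchanged SV set is a hypothetical convergence test.
    """
    global_svs = []
    for _ in range(max_rounds):
        merged = reduce_phase([map_phase(s, global_svs) for s in splits])
        if set(merged) == set(global_svs):
            break
        global_svs = merged
    w, _ = train_local_svm(global_svs)       # final model from the global SVs only
    return w, global_svs


def predict(w, x):
    """Classify one feature tuple with the learned weights (bias is w[-1])."""
    score = sum(wj * v for wj, v in zip(w, tuple(x) + (1.0,)))
    return 1 if score >= 0 else -1
```

Calling `pdsvm_train(splits)` with a list of splits, each a list of `(feature_tuple, label)` pairs, returns the weight vector and the merged support vectors. In a real Hadoop job the splits would be HDFS blocks and each call to `map_phase` would run on a separate node; the sequential list comprehension here only mimics that control flow.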

Cite

Jiang, C., Wu, T., Xu, J., Zheng, N., Xu, M., & Yang, T. (2017). A MapReduce-based distributed SVM for scalable data type classification. In Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST (Vol. 201, pp. 115–126). Springer Verlag. https://doi.org/10.1007/978-3-319-59288-6_11
