Boosting training for PDF malware classifier via active learning

Xinxin Wang; Yuanzhang Li; Quanxin Zhang; Xiaohui Kuang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11983 LNCS 101-110

DOI: 10.1007/978-3-030-37352-8_9

Abstract

Malicious code has long been a serious threat to network security. PDF (Portable Document Format) is a widely used file format and is often used as a vehicle for malicious behavior. In this paper, we use machine learning algorithms to detect malicious PDF documents and evaluate the approach on experimental data. The main work of this paper is to implement a malware detection method that combines static pre-processing with a machine learning algorithm for classification. Classification is based on the differences in structure and content between malicious and benign PDF files. In addition, we boost training of the PDF malware classifier via active learning based on mutual agreement analysis. The detector is retrained using the true labels of the uncertain samples, which not only reduces the detector's training time but also improves its detection performance.
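The abstract does not give the exact formulation, so the following is only a minimal sketch of how uncertain samples might be selected with mutual agreement analysis. It assumes an ensemble whose members each cast a binary vote (1 = malicious, 0 = benign), an agreement score of 2·|v − 0.5| where v is the fraction of malicious votes, and a hypothetical threshold of 0.5. The function names, the score, and the threshold are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, Sequence, Tuple


def mutual_agreement(votes: Sequence[int]) -> float:
    """Agreement in [0, 1]: 0 means an even split, 1 means a unanimous vote.

    Assumed formula: 2 * |v - 0.5|, where v is the fraction of malicious votes.
    """
    malicious_fraction = sum(votes) / len(votes)
    return abs(malicious_fraction - 0.5) * 2


def select_uncertain(ensemble_votes: Sequence[Sequence[int]],
                     threshold: float = 0.5) -> List[int]:
    """Return indices of samples whose agreement falls below the threshold."""
    return [i for i, votes in enumerate(ensemble_votes)
            if mutual_agreement(votes) < threshold]


def augment_training_set(train: List[Tuple[str, int]],
                         pool: Sequence[str],
                         ensemble_votes: Sequence[Sequence[int]],
                         oracle: Callable[[str], int],
                         threshold: float = 0.5) -> List[Tuple[str, int]]:
    """Add only the uncertain pool samples, labeled by the oracle, to the training set.

    `oracle` stands in for whatever supplies the true label (for example,
    manual analysis); it is a hypothetical placeholder.
    """
    queried = [(pool[i], oracle(pool[i]))
               for i in select_uncertain(ensemble_votes, threshold)]
    return train + queried


# Votes from a four-member ensemble on four unlabeled files (made-up data).
votes = [[1, 1, 1, 1], [1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 0, 0]]
pool = ["a.pdf", "b.pdf", "c.pdf", "d.pdf"]
new_train = augment_training_set([], pool, votes, oracle=lambda name: 1)
print(new_train)
```

Only the sample on which the ensemble is evenly split is sent for labeling here. Retraining on this smaller, more informative subset is the kind of mechanism that could reduce training cost, as the abstract describes.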

Cite

Wang, X., Li, Y., Zhang, Q., & Kuang, X. (2019). Boosting training for PDF malware classifier via active learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11983 LNCS, pp. 101–110). Springer. https://doi.org/10.1007/978-3-030-37352-8_9
