Malware Classification Using Probability Scoring and Machine Learning

43Citations
Citations of this article
87Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Malware classification plays an important role in tracing the attack sources of computer security. However, existing static analysis methods are fast in classification, but they are inefficient in some malware using packing and obfuscation techniques; the dynamic analysis methods have better universality for packing and obfuscation, but they will cause excessive classification cost. To overcome these shortcomings, in this paper, we propose a classification system Malscore based on the probability scoring and machine learning, which sets the probability threshold to concatenate static analysis (called Phase 1) and dynamic analysis (called Phase 2). The convolutional neural networks with spatial pyramid pooling were used to analyze the grayscale images (static features) in Phase 1, and the variable n-grams and machine learning were used to analyze the native API call sequences (dynamic features) in Phase 2. Malscore combined static analysis with dynamic analysis not only accelerated the static analysis process by taking advantage of the CNN in image recognition but also appeared to be more resilient to obfuscation by the dynamic analysis. Different from other static and dynamic analysis techniques, when malware is detected, due to the fact that malware will most likely be labeled only by static analysis, we could reduce the overheads by dynamically analyzing a few malware that has less obvious features or greater confusion in static analysis. We performed experiments on 174607 malware samples from 63 malware families. The result showed that Malscore achieved 98.82% accuracy for malware classification. Furthermore, Malscore was compared with the method of using static and dynamic analysis. The preprocessing and test time represented a reduction of 59.58% and 61.70%, respectively.

Cite

CITATION STYLE

APA

Xue, D., Li, J., Lv, T., Wu, W., & Wang, J. (2019). Malware Classification Using Probability Scoring and Machine Learning. IEEE Access, 7, 91641–91656. https://doi.org/10.1109/ACCESS.2019.2927552

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free