Malware classification based on extracted API sequences using static analysis

Abstract

In this paper, we propose a highly accurate, automatic malware-classification method, which extracts features by conducting static analysis of malware samples and the structure of malware source code. In the proposed extraction method, the presence and absence of particular pairs of consecutive Application Program Interface function calls (APIs) in the API-sequence graph are compared with those in the executable code for a sample within which malware features have been identified. To determine the degree of similarity between samples, Dice's coefficient is applied. To visualize the grouping of samples with similar features, we use hierarchical cluster analysis based on the extracted features. The results of the analysis are presented as a dendrogram with colored nodes for each family name. To evaluate the proposed method, we set up a malware-analysis system comprising a combination of disassembler, control-flow analyzer, API-sequence extractor, similarity calculator and hierarchical cluster analyzer. We acquired 4,684 malware samples, from 1,821 of which we successfully extracted API sequences to which we applied our proposed classification method. We found that the automatic hierarchical cluster analysis was processed rapidly, with significant clusters of variant groups obtained. Copyright 2012 ACM.
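The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sample has already been reduced to a flat list of API names (the paper works on an API-sequence graph recovered by control-flow analysis), and the sample names, API lists, and the helper names `api_pairs` and `dice` are hypothetical.

```python
from itertools import combinations


def api_pairs(sequence):
    """Return the set of consecutive API-call pairs found in a sequence."""
    return set(zip(sequence, sequence[1:]))


def dice(a, b):
    """Dice's coefficient 2|A∩B| / (|A| + |B|); taken as 0.0 if both sets are empty."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical API sequences, made up purely for illustration.
samples = {
    "sample_a": ["CreateFileA", "WriteFile", "CloseHandle", "RegOpenKeyExA"],
    "sample_b": ["CreateFileA", "WriteFile", "CloseHandle", "WinExec"],
    "sample_c": ["socket", "connect", "send", "closesocket"],
}

# Feature set per sample: which consecutive API pairs are present.
features = {name: api_pairs(seq) for name, seq in samples.items()}

# Pairwise similarity, keyed by (name_x, name_y) in sorted order.
similarity = {
    (x, y): dice(features[x], features[y])
    for x, y in combinations(sorted(features), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 3))
```

In this toy data, `sample_a` and `sample_b` share two of their three API pairs, while `sample_c` shares none with either. A distance such as `1 - dice(...)` could then be passed to a hierarchical clustering routine to draw a dendrogram; which linkage criterion the paper uses is not stated in the abstract.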

Cite

Iwamoto, K., & Wasaki, K. (2012). Malware classification based on extracted API sequences using static analysis. In Asian Internet Engineering Conference, AINTEC 2012 (pp. 31–38). https://doi.org/10.1145/2402599.2402604
