Several malware analysis techniques suppose that the disassembled code of a piece of malware is available, which is however not always possible. This paper proposes a flexible and automated approach to extract malware behaviour by observing all the system function calls performed in a virtualized execution environment. Similarities and distances between malware behaviours are computed which allows to classify malware behaviours. The main features of our approach reside in coupling a sequence alignment method to compute similarities and leverage the Hellinger distance to compute associated distances. We also show how the accuracy of the classification process can be improved using a phylogenetic tree. Such a tree shows common functionalities and evolution of malware. This is relevant when dealing with obfuscated malware variants that have often similar behaviour. The phylogenetic trees were assessed using known antivirus results and only a few malware behaviours were wrongly classified. © 2007 Springer-Verlag France.
Mendeley saves you time finding and organizing research
Choose a citation style from the tabs below