Towards efficient detection of malicious VBA macros with LSI

Abstract

Targeted email attacks are one of the main threats to organizations of all sizes and in every field. In these attacks, malicious VBA (Visual Basic for Applications) macros are often embedded in attachment files to exploit the target computers. These macros are obfuscated in several ways to evade detection, so pattern-based detection is limited in its ability to detect new malicious VBA macros. To detect them, several methods based on machine learning techniques have been proposed. One such method extracts words from the source code and constructs a language model that represents VBA macros for machine learning. This method, however, constructs the language model from all the extracted words, so the model may contain words that are unnecessary for classification. To construct an efficient language model, we focus on LSI (Latent Semantic Indexing). LSI is one of the foundational techniques in topic modeling and calculates the similarity of documents. Our method uses LSI to construct an efficient language model, which improves both accuracy and efficiency. To the best of our knowledge, our method is the first to detect new malicious VBA macros with LSI. Our method extracts words from the source code and converts them into feature vectors with Natural Language Processing techniques. It then trains a classifier with benign and malicious VBA macros and detects new malicious VBA macros. Several thousand samples for evaluation were obtained from VirusTotal. The experimental results show that our method can detect new malicious VBA macros more accurately and efficiently. The best F-measure is 0.95.
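The pipeline described in the abstract (extract words, build feature vectors, reduce them with LSI, train a classifier) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several parts are assumptions made for illustration: the regex tokenizer, raw term counts as the weighting, two latent dimensions, a nearest-centroid classifier with cosine similarity, and the four toy macros, which are hypothetical and do not come from the paper's dataset. LSI itself is implemented here as a truncated SVD of the term-document matrix using NumPy.

```python
import re
from collections import Counter

import numpy as np


def tokenize(source):
    """Split VBA source into lowercase word tokens (assumed tokenizer)."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source.lower())


def build_vocab(docs):
    """Map every distinct training token to a row index."""
    return {w: i for i, w in enumerate(sorted({w for d in docs for w in d}))}


def term_doc_matrix(docs, vocab):
    """Raw term-count matrix of shape (n_terms, n_docs); unknown words are skipped."""
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word, count in Counter(doc).items():
            if word in vocab:
                A[vocab[word], j] = count
    return A


def fit_lsi(A, k):
    """LSI step: keep the top-k left singular vectors of the term-document matrix."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k]


def project(A, U_k):
    """Map term-count columns into the k-dimensional latent space, one row per document."""
    return (U_k.T @ A).T


def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


def classify(source, vocab, U_k, centroids):
    """Assign the label whose class centroid is most similar in the latent space."""
    z = project(term_doc_matrix([tokenize(source)], vocab), U_k)[0]
    return max(centroids, key=lambda label: cosine(z, centroids[label]))


# Hypothetical toy training set, for illustration only.
train = [
    ('Sub FormatReport() Range("A1").Font.Bold = True '
     'Range("A2").Font.Bold = True End Sub', "benign"),
    ('Sub FormatTotals() Range("B1").Font.Bold = True '
     'Range("B1").Value = 10 End Sub', "benign"),
    ('Sub AutoOpen() Set sh = CreateObject("WScript.Shell") '
     'sh.Run Chr(99) & Chr(109) & Chr(100) End Sub', "malicious"),
    ('Sub AutoOpen() Set sh = CreateObject("WScript.Shell") '
     'sh.Run Chr(112) & Chr(115) End Sub', "malicious"),
]

docs = [tokenize(src) for src, _ in train]
labels = [label for _, label in train]
vocab = build_vocab(docs)
A = term_doc_matrix(docs, vocab)
U_k = fit_lsi(A, k=2)          # k=2 is an arbitrary choice for this toy example
Z = project(A, U_k)            # training documents in the latent space
centroids = {
    lab: Z[[i for i, l in enumerate(labels) if l == lab]].mean(axis=0)
    for lab in set(labels)
}

new_macro = ('Sub AutoOpen() Set sh = CreateObject("WScript.Shell") '
             'sh.Run Chr(101) & Chr(120) End Sub')
print(classify(new_macro, vocab, U_k, centroids))
```

The reduction from the full vocabulary to `k` latent dimensions is what makes the representation compact. On real data, `k`, the term weighting, and the classifier would all need to be chosen by evaluation; the abstract does not specify them.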

Mimura, M., & Ohminami, T. (2019). Towards efficient detection of malicious VBA macros with LSI. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11689 LNCS, pp. 168–185). Springer Verlag. https://doi.org/10.1007/978-3-030-26834-3_10
