Using Sparse Composite Document Vectors to Classify VBA Macros

Mamoru Mimura

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11928 LNCS 714-720

DOI: 10.1007/978-3-030-36938-5_46

Abstract

NLP-based methods have been proposed to detect new macro malware. These methods mainly use a Doc2vec model to represent the source code, which provides a vector space in which malicious and benign macros can be classified. More sophisticated models have recently surpassed Doc2vec in both performance and time complexity. However, no study has compared these language models for macro malware detection. In this paper, we focus on Sparse Composite Document Vectors (SCDV), a simple feature construction algorithm. To evaluate its performance in malware detection, we compare SCDV with other language models: Bag-of-Words, Latent Semantic Indexing (LSI), and Doc2vec. The experimental results with actual macro malware show which language model is most suitable for macro malware detection.
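The abstract does not spell out how SCDV builds its features, so the following is only a minimal sketch of the general SCDV idea as commonly described: each word vector is weighted by its soft cluster membership, the per-cluster copies are concatenated and scaled by IDF, the results are summed per document, and small entries are zeroed. Everything concrete here is an assumption for illustration, not the paper's setup: the toy tokens, the hard-coded word vectors (`WORD_VECS`), the hard-coded cluster probabilities (`CLUSTER_PROBS`, which would normally come from a fitted Gaussian mixture model), the smoothed IDF formula, and the `sparsity` parameter.

```python
import math
from collections import Counter

# Hypothetical toy inputs. In practice the word vectors would come from a
# trained embedding model and the cluster probabilities from a Gaussian
# mixture model fitted on those vectors. Both are hard-coded for illustration.
WORD_VECS = {
    "shell":        [0.9, 0.1],
    "createobject": [0.8, 0.2],
    "range":        [0.1, 0.9],
    "cells":        [0.2, 0.8],
}
CLUSTER_PROBS = {  # P(cluster k | word) for K = 2 clusters
    "shell":        [0.95, 0.05],
    "createobject": [0.90, 0.10],
    "range":        [0.05, 0.95],
    "cells":        [0.10, 0.90],
}


def idf_weights(docs):
    """Smoothed inverse document frequency for every token in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def word_topic_vector(word, idf):
    """Concatenate P(c_k | word) * word_vector over all clusters, scaled by IDF."""
    vec = []
    for prob in CLUSTER_PROBS[word]:
        vec.extend(idf * prob * x for x in WORD_VECS[word])
    return vec


def scdv(docs, sparsity=0.2):
    """Return one sparse composite vector (length K * d) per tokenized document."""
    idf = idf_weights(docs)
    n_clusters = len(next(iter(CLUSTER_PROBS.values())))
    emb_dim = len(next(iter(WORD_VECS.values())))
    dim = n_clusters * emb_dim

    doc_vecs = []
    for doc in docs:
        acc = [0.0] * dim
        for word in doc:
            if word not in WORD_VECS:  # skip out-of-vocabulary tokens
                continue
            for i, x in enumerate(word_topic_vector(word, idf[word])):
                acc[i] += x
        norm = math.sqrt(sum(x * x for x in acc)) or 1.0
        doc_vecs.append([x / norm for x in acc])

    # Sparsify: zero entries whose magnitude falls below a fraction of the
    # average of the mean per-document minimum and maximum values.
    avg_min = sum(min(v) for v in doc_vecs) / len(doc_vecs)
    avg_max = sum(max(v) for v in doc_vecs) / len(doc_vecs)
    threshold = sparsity * (abs(avg_min) + abs(avg_max)) / 2
    return [[x if abs(x) >= threshold else 0.0 for x in v] for v in doc_vecs]


# Two hypothetical tokenized macros: one process-spawning, one spreadsheet-only.
docs = [["createobject", "shell", "shell"], ["range", "cells", "range"]]
vectors = scdv(docs)
```

The resulting vectors could then be passed to any standard classifier. A Bag-of-Words baseline would instead use raw token counts per document, which is the kind of alternative representation the abstract compares against.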

Citation (APA)

Mimura, M. (2019). Using Sparse Composite Document Vectors to Classify VBA Macros. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11928 LNCS, pp. 714–720). Springer. https://doi.org/10.1007/978-3-030-36938-5_46
