Recently, multimodal processing for text classification tasks such as emotion recognition, sentiment analysis, and author profiling has gained traction, owing to its potential to improve performance by leveraging complementary sources of information such as text, images, and speech. In this line, we focus on multimodal malware text classification. Unlike traditional tasks such as emotion recognition and sentiment analysis, however, generating complementary domain information is difficult in cyber security, so multimodal approaches have received little attention in the context of malware classification. In this work, we address this gap by improving malware text classification with Quick Response (QR) codes generated from the text itself as complementary information. Exploiting the superior capacity of Convolutional Neural Networks (CNNs) to process images, we fuse CNN representations of the text and image data in multiple ways, and show that the complementary information from QR codes improves the performance of malware text classification, thereby achieving a new state of the art and creating the first multimodal benchmark for malware text classification.
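The fusion idea described above can be sketched minimally as follows. This is not the paper's implementation: the feature dimensions, the use of plain concatenation (one of several fusion strategies the abstract mentions), and the random stand-ins for the text encoder's and CNN's outputs are all illustrative assumptions.

```python
import numpy as np

# Hypothetical dimensions; the abstract does not specify them.
TEXT_DIM, IMG_DIM, NUM_CLASSES = 128, 256, 9

rng = np.random.default_rng(0)

# Stand-ins for learned representations: a text encoder's output and a
# CNN's output for the QR-code image rendered from the same text.
text_feat = rng.standard_normal(TEXT_DIM)
qr_feat = rng.standard_normal(IMG_DIM)

def fuse_concat(text_vec, image_vec):
    """Early fusion: concatenate the two modality representations."""
    return np.concatenate([text_vec, image_vec])

fused = fuse_concat(text_feat, qr_feat)          # shape (384,)

# A toy linear classifier over the fused representation.
W = rng.standard_normal((NUM_CLASSES, fused.size))
pred = int(np.argmax(W @ fused))
```

In practice the fused vector would feed a trained classification head; concatenation is only the simplest of the multiple fusion schemes the work compares.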
Ravikiran, M., & Madgula, K. (2019). Fusing deep quick response code representations improves malware text classification. In WCRML 2019 - Proceedings of the ACM Workshop on Crossmodal Learning and Application (pp. 11–18). Association for Computing Machinery, Inc. https://doi.org/10.1145/3326459.3329166