Binary Code Similarity Detection through LSTM and Siamese Neural Network

  • Luo Z
  • Hou T
  • Zhou X
  • et al.

Abstract

Because many software projects are closed-source, analyzing security-related vulnerabilities at the binary level is essential to protecting computer systems from malware attacks. Binary code similarity detection is a potential solution for detecting malware from the binaries generated by the processor. In this paper, we propose a malware detection mechanism that applies machine learning techniques to binaries. Using a Recurrent Neural Network (RNN), specifically a Long Short-Term Memory (LSTM) network, we generate a uniform feature embedding of each binary file, and then use a Siamese Neural Network to compute a similarity measure over the extracted features. The security risk of a software project can therefore be evaluated by measuring the similarity of its binaries to the malware samples used in training. Our experiments on real-world data show convincing performance in distinguishing outliers, and slightly better performance than existing state-of-the-art methods.
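
The pipeline described in the abstract (an LSTM that maps each binary to a fixed-size embedding, followed by a Siamese comparison of two embeddings) can be illustrated with a minimal sketch. The abstract does not specify the input encoding, the network dimensions, the similarity function, or the training procedure, so everything below is an assumption made for illustration: each byte is encoded as 8 bits, the hypothetical TinyLSTM class uses small random untrained weights, and cosine similarity stands in for the learned similarity measure. It is not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bytes_to_vectors(data):
    # Assumed featurisation: each byte becomes a vector of its 8 bits.
    return [[(b >> k) & 1 for k in range(8)] for b in data]


class TinyLSTM:
    """Single-layer LSTM encoder with random, untrained weights (illustrative)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, c = candidate cell state.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def _gate(self, g, z, act):
        return [act(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(self.W[g], self.b[g])]

    def encode(self, sequence):
        """Run over a sequence of vectors and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            z = list(x) + h  # concatenate current input and previous hidden state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def siamese_similarity(encoder, binary_a, binary_b):
    # Both inputs pass through the SAME encoder (shared weights),
    # which is the defining property of a Siamese network.
    emb_a = encoder.encode(bytes_to_vectors(binary_a))
    emb_b = encoder.encode(bytes_to_vectors(binary_b))
    return cosine_similarity(emb_a, emb_b)
```

Calling siamese_similarity(TinyLSTM(8, 4), sample_a, sample_b) on two byte strings returns a score in [-1, 1] regardless of their lengths, because the LSTM reduces each input to a vector of hidden_size values. With untrained weights the score carries no security meaning; a real system would train the shared encoder on labelled pairs (for example with a contrastive loss, which is also an assumption here) so that similar binaries score high and dissimilar ones score low.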

Cite

Luo, Z., Hou, T., Zhou, X., Zeng, H., & Lu, Z. (2021). Binary Code Similarity Detection through LSTM and Siamese Neural Network. ICST Transactions on Security and Safety, 8(29), 170956. https://doi.org/10.4108/eai.14-9-2021.170956
