HEBCS: A High-Efficiency Binary Code Search Method

Abstract

Binary code search finds code in a database that is similar to a given piece of code. It is widely used in scenarios such as vulnerability queries and code defect analysis. Many existing methods employ advanced machine learning models for similarity analysis, but their lack of interpretability and their low efficiency on large-scale function sets remain challenges. To address these issues, we propose a high-efficiency binary code search method called HEBCS. It extracts function-level features in an interpretable way and transforms each feature into a locality-sensitive hash representation. The hashes of these features are then combined to form the hash of the function. By leveraging the pigeonhole principle, HEBCS stores and retrieves functions efficiently, so execution remains fast even on large-scale data. We compare HEBCS with a classic method and a state-of-the-art method and show that it achieves significantly higher search efficiency while maintaining comparable accuracy, recall, and F1-score. In real-world vulnerability query applications, HEBCS demonstrated promising results. Its effectiveness in large-scale binary function search suggests significant potential for practical applications.
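The abstract does not spell out how the pigeonhole principle is applied, so the following is only a minimal sketch of one common way to use it for hash retrieval, not the authors' implementation. It assumes function hashes are fixed-width integers compared by Hamming distance: if two hashes differ in at most `max_dist` bits and each is split into `max_dist + 1` segments, at least one segment must match exactly, so an exact lookup per segment yields a small candidate set that is then verified. The class name `PigeonholeIndex`, the 64-bit width, and the distance threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def split_segments(h, bits, parts):
    """Split a `bits`-wide integer hash into `parts` equal-width segments."""
    width = bits // parts
    mask = (1 << width) - 1
    return [(h >> (i * width)) & mask for i in range(parts)]


class PigeonholeIndex:
    """Hypothetical index: one exact-match table per hash segment."""

    def __init__(self, bits=64, max_dist=3):
        self.bits = bits
        self.max_dist = max_dist
        # Pigeonhole: at most max_dist differing bits spread over
        # max_dist + 1 segments leave at least one segment untouched.
        self.parts = max_dist + 1
        if bits % self.parts != 0:
            raise ValueError("bits must be divisible by max_dist + 1")
        self.tables = [defaultdict(list) for _ in range(self.parts)]

    def add(self, name, h):
        """Store a function hash under each of its segments."""
        for i, seg in enumerate(split_segments(h, self.bits, self.parts)):
            self.tables[i][seg].append((name, h))

    def query(self, h):
        """Return names whose hash is within max_dist bits of `h`."""
        candidates = {}
        for i, seg in enumerate(split_segments(h, self.bits, self.parts)):
            for name, stored in self.tables[i].get(seg, []):
                candidates[name] = stored
        # Verify candidates with the full Hamming distance.
        return sorted(
            name
            for name, stored in candidates.items()
            if bin(stored ^ h).count("1") <= self.max_dist
        )


if __name__ == "__main__":
    base = 0x0123456789ABCDEF
    index = PigeonholeIndex(bits=64, max_dist=3)
    index.add("func_a", base)            # identical hash
    index.add("func_b", base ^ 0b111)    # 3 bits differ
    index.add("func_c", base ^ 0xFFFF)   # 16 bits differ
    print(index.query(base))
```

In this sketch the query touches only `max_dist + 1` hash tables rather than scanning every stored function, which is the kind of saving the abstract attributes to the pigeonhole principle. Candidates that share a segment but exceed the threshold, such as `func_c` above, are discarded in the verification step.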

Cite

Sun, X., Wei, Q., Du, J., & Wang, Y. (2023). HEBCS: A High-Efficiency Binary Code Search Method. Electronics (Switzerland), 12(16). https://doi.org/10.3390/electronics12163464
