Identifying multiple authors in a binary program

This article is free to access.

Abstract

Knowing the authors of a binary program has significant applications in forensics of malicious software (malware), software supply chain risk management, and software plagiarism detection. Existing techniques assume that a binary is written by a single author, an assumption that does not hold in the real world because most modern software, including malware, contains code from multiple authors. In this paper, we take the first step toward identifying multiple authors in a binary. We present new fine-grained techniques to address the tougher problem of determining the author of each basic block. The decision to attribute authorship at the basic block level is based on an empirical study of three large open source software projects, in which we find that a large fraction of basic blocks can be well attributed to a single author. We present new code features that capture programming style at the basic block level, our approach for identifying external template library code, and a new approach to capture correlations between the authors of basic blocks in a binary. Our experiments show strong evidence that programming styles can be recovered at the basic block level and that it is practical to identify multiple authors in a binary.

Meng, X., Miller, B. P., & Jun, K. S. (2017). Identifying multiple authors in a binary program. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10493 LNCS, pp. 286–304). Springer Verlag. https://doi.org/10.1007/978-3-319-66399-9_16
