On leveraging coding habits for effective binary authorship attribution

Abstract

We propose BinAuthor, the first compiler-agnostic method for identifying the authors of program binaries. BinAuthor first filters out unrelated functions (compiler and library functions) to isolate user-related functions, then converts those functions into a canonical form to eliminate compiler and compilation effects. It then leverages a set of features based on collections of choices that authors make during coding; these features capture an author's coding habits. Our evaluation demonstrates that BinAuthor outperforms existing methods in several respects. First, when tested on large datasets extracted from selected open-source C/C++ projects on GitHub, Google Code Jam events, and Planet Source Code contests, it successfully attributed a larger number of authors with significantly higher accuracy: around 90% when the number of authors is 1000. Second, when the code was subjected to refactoring techniques, code transformation, or processing with different compilers or compilation settings, there was no significant drop in accuracy, indicating that BinAuthor is more robust than previous methods.
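
To make the idea of a canonical form concrete, here is a minimal sketch in Python. It is not BinAuthor's implementation: the abstract does not specify the canonical form or the feature set, so the placeholder scheme (REG, MEM, IMM, SYM), the register subset, and the function names `canonicalize` and `habit_profile` are all assumptions made for illustration. The sketch only shows how replacing concrete operands with coarse placeholders can make two instructions compare equal even when a compiler picks different registers or constants.

```python
import re
from collections import Counter

# Assumption: a small subset of 32-bit x86 register names, for illustration only.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def canonicalize(instruction: str) -> str:
    """Replace concrete operands of an Intel-syntax instruction with placeholders."""
    mnemonic, _, operand_text = instruction.strip().lower().partition(" ")
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    normalized = []
    for op in operands:
        if op in REGISTERS:
            normalized.append("REG")   # any general-purpose register
        elif "[" in op:
            normalized.append("MEM")   # any memory reference
        elif re.fullmatch(r"-?(0x[0-9a-f]+|\d+)", op):
            normalized.append("IMM")   # any immediate constant
        else:
            normalized.append("SYM")   # anything else, e.g. a call target
    return " ".join([mnemonic] + normalized)


def habit_profile(instructions):
    """Count canonical instructions as a crude stand-in for a feature vector."""
    return Counter(canonicalize(i) for i in instructions)


profile = habit_profile(["mov eax, 0x10", "MOV ebx, 16", "add esi, [ebp+8]", "ret"])
print(profile)
```

In this sketch the two `mov` instructions collapse into the same canonical string despite using different registers and constant encodings. A real system would need a proper disassembler and a far richer feature set than instruction counts.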

Cite

Alrabaee, S., Shirani, P., Wang, L., Debbabi, M., & Hanna, A. (2018). On leveraging coding habits for effective binary authorship attribution. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11098 LNCS, pp. 26–47). Springer Verlag. https://doi.org/10.1007/978-3-319-99073-6_2
