Detecting Privacy-Sensitive Code Changes with Language Modeling

Abstract

At Meta, we work to incorporate privacy-by-design into all of our products and keep user information secure. We have created an ML model that detects code changes ('diffs') that have privacy-sensitive implications. At our scale of tens of thousands of engineers creating hundreds of thousands of diffs each month, we use automated tools for detecting such diffs. Inspired by recent studies on detecting defects [2], [3], [5] and security vulnerabilities [4], [6], [7], we use techniques from natural language processing to build a deep learning system for detecting privacy-sensitive code.
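The abstract describes treating a diff as text and classifying it with techniques from natural language processing. As a minimal sketch of that idea (not the authors' actual system, which uses deep learning at much larger scale), the snippet below tokenizes diff text and trains a tiny logistic-regression classifier over tokens; all training examples and token names are illustrative assumptions.

```python
import math
import re

def tokenize(diff_text):
    """Split a code diff into lowercase word/identifier tokens."""
    return re.findall(r"[a-zA-Z_]+", diff_text.lower())

def train(examples, epochs=200, lr=0.5):
    """Fit a toy logistic regression over diff tokens with SGD.

    examples: list of (diff_text, label), label 1 = privacy-sensitive.
    Returns (per-token weights, bias).
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            toks = tokenize(text)
            z = bias + sum(weights.get(t, 0.0) for t in toks)
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            grad = p - label                  # dLoss/dz for log-loss
            bias -= lr * grad
            for t in toks:
                weights[t] = weights.get(t, 0.0) - lr * grad
    return weights, bias

def score(weights, bias, diff_text):
    """Probability that a diff is privacy-sensitive under the toy model."""
    z = bias + sum(weights.get(t, 0.0) for t in tokenize(diff_text))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical labeled diffs (1 = touches user data, 0 = neutral refactor).
TRAIN = [
    ("+ log.info(user_email)", 1),
    ("+ payload['ssn'] = user.ssn", 1),
    ("+ def helper(): return 42", 0),
    ("- import unused_module", 0),
]

weights, bias = train(TRAIN)
sensitive = score(weights, bias, "+ logger.warn(user_email)")
neutral = score(weights, bias, "+ import another helper")
```

A production system would replace the bag-of-tokens model with a pretrained language model fine-tuned on labeled diffs, but the pipeline shape (tokenize the diff, score it, flag high-probability changes for review) is the same.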

Citation (APA)
Demirci, G., Murali, V., Ahmad, I., Rao, R., & Aye, G. A. (2022). Detecting Privacy-Sensitive Code Changes with Language Modeling. In Proceedings - 2022 Mining Software Repositories Conference, MSR 2022 (pp. 762–763). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3524842.3528518
