Bad Snakes: Understanding and Improving Python Package Index Malware Scanning

Duc Ly Vu; Zachary Newman; John Speed Meyers

Conference Proceedings, Open Access

Proceedings - International Conference on Software Engineering (2023) 499-511

DOI: 10.1109/ICSE48619.2023.00052

Abstract

Open-source, community-driven package repositories see thousands of malware packages each year, but do not currently run automated malware detection systems. In this work, we explore the security goals of the repository administrators and the requirements for deploying such malware scanners via a case study of the Python ecosystem and PyPI repository, including interviews with administrators and maintainers. Further, we evaluate existing malware detection techniques for deployment in this setting by creating a benchmark dataset and comparing several existing tools: the malware checks implemented in PyPI, Bandit4Mal, and OSSGadget's OSS Detect Backdoor. We find that repository administrators have exacting requirements for such malware detection tools. Specifically, they consider a false positive rate of even 0.1% to be unacceptably high, given the large number of package releases that might trigger false alerts. Measured tools have false positive rates between 15% and 97%; increasing thresholds for detection rules to reduce this rate renders the true positive rate useless. While automated tools are far from reaching these demands, we find that a socio-technical malware detection system has emerged to meet these needs: external security researchers perform repository malware scans, filter for useful results, and report the results to repository administrators. These parties face different incentives and constraints on their time and tooling. We conclude with recommendations for improving detection capabilities and strengthening the collaboration between security researchers and software repository administrators.
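The false-positive argument in the abstract is a matter of scale, and a little arithmetic makes it concrete. The sketch below is illustrative only: the weekly release volume is a hypothetical placeholder, not a figure from the paper, and the function name is invented for this example. Only the three rates (0.1%, 15%, 97%) come from the abstract.

```python
def expected_false_alerts(benign_releases: int, false_positive_rate: float) -> float:
    """Expected number of benign releases that a scanner would wrongly flag."""
    if benign_releases < 0:
        raise ValueError("benign_releases must be non-negative")
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    return benign_releases * false_positive_rate


# ASSUMPTION: a hypothetical release volume chosen only for illustration;
# the abstract does not state how many releases the repository receives.
HYPOTHETICAL_WEEKLY_RELEASES = 10_000

# Rates taken from the abstract: the 0.1% rate administrators already
# consider too high, and the 15% to 97% range measured for existing tools.
rates = [
    ("0.1% (already unacceptable)", 0.001),
    ("15% (lowest measured)", 0.15),
    ("97% (highest measured)", 0.97),
]

for label, rate in rates:
    alerts = expected_false_alerts(HYPOTHETICAL_WEEKLY_RELEASES, rate)
    print(f"{label}: about {alerts:.0f} false alerts per week")
```

Under this assumed volume, even the 0.1% rate yields a steady stream of alerts that someone must triage by hand, which is consistent with the administrators' position described in the abstract.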

Citation (APA)

Vu, D. L., Newman, Z., & Meyers, J. S. (2023). Bad Snakes: Understanding and Improving Python Package Index Malware Scanning. In Proceedings - International Conference on Software Engineering (pp. 499–511). IEEE Computer Society. https://doi.org/10.1109/ICSE48619.2023.00052
