Abstract
Industry reports and blogs have estimated the amount of malware based on known malicious files. This paper extends that analysis to unknown malware. The study is based on 26.7 million files referenced in telemetry reports from 50 million computers running commercial anti-malware (AM) products. To estimate the amount of undetected malware, a classifier predicts the underlying nature of the unknown files recorded in the telemetry reports. The telemetry classifier predicts that 69.6% (4.27 million) of the unknown files are malicious. Assuming that the unknown files the classifier predicts to be malicious are in fact malware, the classifier also allows us to estimate the efficacy of the AM system: signatures detected 82.8% (20.6 million) of the malicious files. We have validated our system by conducting a longitudinal study that measures the false positive and false negative rates over a period of thirteen months. © 2012 Springer-Verlag.
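The percentages in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not from the paper; it assumes that the 82.8% figure is the signature-detected count divided by the sum of signature-detected files and unknown files predicted to be malicious. The variable and function names are illustrative only.

```python
# Figures quoted in the abstract, in millions of files.
detected_malicious = 20.6          # malicious files caught by signatures
predicted_malicious = 4.27         # unknown files the classifier labels malicious
malicious_share_of_unknown = 0.696 # 69.6% of unknown files predicted malicious


def detection_rate(detected: float, undetected: float) -> float:
    """Share of all (assumed) malicious files that signatures detected.

    Assumption: every unknown file predicted malicious really is malware,
    as the abstract itself stipulates.
    """
    return detected / (detected + undetected)


def implied_unknown_total(predicted: float, share: float) -> float:
    """Total number of unknown files implied by the predicted count and share."""
    return predicted / share


rate = detection_rate(detected_malicious, predicted_malicious)
unknown_total = implied_unknown_total(predicted_malicious, malicious_share_of_unknown)

print(f"Signature detection rate: {rate:.1%}")               # 82.8%
print(f"Implied unknown files: {unknown_total:.2f} million")  # 6.14 million
```

Under this reading the quoted numbers are mutually consistent: the detection rate reproduces the 82.8% in the abstract, and the implied count of unknown files plus the 20.6 million detected files comes to roughly the 26.7 million files in the study.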
Citation
Stokes, J. W., Platt, J. C., Wang, H. J., Faulhaber, J., Keller, J., Marinescu, M., … Gheorghescu, M. (2012). Scalable telemetry classification for automated malware detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7459 LNCS, pp. 788–805). https://doi.org/10.1007/978-3-642-33167-1_45