Impact of dataset representation on smartphone malware detection performance

4Citations
Citations of this article
9Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Improving Smartphone anomaly-based malware detection techniques is widely studied in recent years. Previous studies explore three factors: dataset size, dataset type and normal profile model. These factors improve the performance, but increase computation complexity and the required memory space. In this paper we explore a new factor: the dataset representation. Dataset representation is the format adopted to organize and represent data. To investigate the impact of this factor, we examine four machine learning classifiers with three different dataset representations. Those dataset representations are: successive system calls, bag of system calls and patterns frequency system calls. The used dataset is a collection of system call traces of Smartphone executing Android 2.2. We analyse the performance of each classifier and deduce the influence of dataset representation on accuracy and false positive rates. The results show that the dataset representation has a potential impact on the performance of classifiers with low computational and memory cost.

Cite

CITATION STYLE

APA

Amamra, A., Talhi, C., & Robert, J. M. (2013). Impact of dataset representation on smartphone malware detection performance. In IFIP Advances in Information and Communication Technology (Vol. 401, pp. 166–176). Springer New York LLC. https://doi.org/10.1007/978-3-642-38323-6_12

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free