A scheduling optimization technique based on reuse in spark to defend against APT attack

15Citations
Citations of this article
18Readers
Mendeley users who have this article in their library.

Abstract

Advanced Persistent Threat (APT) attack, an attack option in recent years, poses serious threats to the security of governments and enterprises data due to its advanced and persistent attacking characteristics. To address this issue, a security policy of big data analysis has been proposed based on the analysis of log data of servers and terminals in Spark. However, in practical applications, Spark cannot suitably analyze very huge amounts of log data. To address this problem, we propose a scheduling optimization technique based on the reuse of datasets to improve Spark performance. In this technique, we define and formulate the reuse degree of Directed Acyclic Graphs (DAGs) in Spark based on Resilient Distributed Datasets (RDDs). Then, we define a global optimization function to obtain the optimal DAG sequence, that is, the sequence with the least execution time. To implement the global optimization function, we further propose a novel cost optimization algorithm based on the traditional Genetic Algorithm (GA). Our experiments demonstrate that this scheduling optimization technique in Spark can greatly decrease the time overhead of analyzing log data for detecting APT attacks.

Cite

CITATION STYLE

APA

Tang, J., Xu, M., Fu, S., & Huang, K. (2018). A scheduling optimization technique based on reuse in spark to defend against APT attack. Tsinghua Science and Technology, 23(5), 550–560. https://doi.org/10.26599/TST.2018.9010022

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free