The paper compares the performance of big data processing technologies applied to anomaly detection in the database of a billing system. Experiments were conducted to evaluate the big data processing performance of two technologies, Hadoop MapReduce and Apache Spark. The datasets and the configuration of the applications used in the experiments are described. The experimental results are presented as the relationship between computation time, data size, and imbalance in the datasets. Based on these results, a decision-making system is proposed for choosing a data processing technology according to the characteristics of the datasets.
Teryoshkin, S. E., & Yakovina, I. N. (2020). Research of efficiency of technologies for data processing in tasks of big data analysis. In Journal of Physics: Conference Series (Vol. 1441). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1441/1/012049