Analyzing Performance of Apache Spark MLlib with Multinode Clusters on Azure HDInsight: Spark-perf Case Study

Abstract

This paper presents an experimental analysis of the performance of machine learning workloads run with the Apache Spark MLlib library on Microsoft Azure HDInsight, using the Spark-perf benchmark test set. To support the experiments, software and an information support methodology were developed on top of the monitoring systems SparkMeasure and Ambari. Metrics are proposed for analyzing the performance of Apache Spark computations; they use statistical characteristics of the training and testing processes observed while the Spark-perf benchmark tests run. Formulas are proposed for determining Apache Spark parameter settings. Compared with the standard parameter values, these settings reduce the time needed to execute sets of machine learning test tasks on heterogeneous and homogeneous Apache Spark cluster configurations in Azure HDInsight. To assess the computing performance of the machine learning methods in Spark-perf, a metric is proposed that is calculated as the ratio of the average testing time to the average training time. The results of the computational experiments are presented. They confirm that, for the machine learning methods in the Spark-perf test set, the proposed Apache Spark settings are more effective than the standard values on both heterogeneous and homogeneous clusters deployed on Azure HDInsight.

Minukhin, S., Brynza, N., & Sitnikov, D. (2021). Analyzing performance of Apache Spark MLlib with multinode clusters on Azure HDInsight: Spark-perf case study. In Advances in Intelligent Systems and Computing (Vol. 1246 AISC, pp. 114–134). Springer. https://doi.org/10.1007/978-3-030-54215-3_8