Comparative analysis of machine learning techniques to Identify churn for telecom data


Abstract

Big data analytics has become the focus of large-scale data processing, and machine learning combined with big data holds great promise for prediction. Churn prediction is one subdomain of big data analytics. Its main benefit, especially in telecom, is preventing customer attrition. Churn is a day-to-day affair involving millions, so a solution that prevents customer attrition can yield substantial savings. This paper compares three machine learning techniques, the Decision Tree, Random Forest, and Gradient Boosted Tree algorithms, using Apache Spark. Apache Spark is a big data processing engine that provides in-memory processing, which increases processing speed. The analysis is performed by extracting features from the data set and training the models. Scala is a powerful programming language that combines object-oriented and functional programming. The analysis is implemented in Apache Spark, and modelling is done using Scala ML. The Decision Tree model achieved an accuracy of 86%, the Random Forest model 87%, and the Gradient Boosted Tree model 85%.
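The comparison described in the abstract can be sketched with Spark's `org.apache.spark.ml` API in Scala. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature column names, the `churn` label column (assumed to hold strings such as "yes"/"no"), and the `ChurnComparison` object are all hypothetical, since the abstract does not describe the data set's schema or the hyperparameters used.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.{DecisionTreeClassifier, GBTClassifier, RandomForestClassifier}
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.{StringIndexer, VectorAssembler}
import org.apache.spark.sql.DataFrame

object ChurnComparison {
  // Hypothetical schema: three numeric features and a string label column.
  val featureCols: Array[String] =
    Array("total_day_minutes", "total_intl_calls", "customer_service_calls")
  val labelCol: String = "churn"

  /** Trains each classifier on `train` and returns its accuracy on `test`. */
  def compare(train: DataFrame, test: DataFrame): Map[String, Double] = {
    // Map the string label to a numeric index (0.0 / 1.0 for two classes).
    val indexer = new StringIndexer().setInputCol(labelCol).setOutputCol("label")
    // Feature extraction step: pack the numeric columns into one vector column.
    val assembler = new VectorAssembler().setInputCols(featureCols).setOutputCol("features")

    // The three tree-based classifiers named in the abstract, default settings.
    val classifiers: Seq[(String, PipelineStage)] = Seq(
      "Decision tree" -> new DecisionTreeClassifier().setLabelCol("label").setFeaturesCol("features"),
      "Random forest" -> new RandomForestClassifier().setLabelCol("label").setFeaturesCol("features"),
      "Gradient-boosted trees" -> new GBTClassifier().setLabelCol("label").setFeaturesCol("features")
    )

    val evaluator = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")

    classifiers.map { case (name, classifier) =>
      val pipeline = new Pipeline().setStages(Array[PipelineStage](indexer, assembler, classifier))
      val model = pipeline.fit(train)
      name -> evaluator.evaluate(model.transform(test))
    }.toMap
  }
}
```

Given a churn `DataFrame` named `data`, one plausible usage is `val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 42L)` followed by `ChurnComparison.compare(train, test)`. The 80/20 split and the seed are assumptions; the abstract does not state how the reported accuracies were evaluated.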

Cite

Malleswari, M., Maniraj, R., Kumar, P., & Murugan. (2018). Comparative analysis of machine learning techniques to Identify churn for telecom data. International Journal of Engineering and Technology (UAE), 7(3.34 Special Issue 34), 291–295. https://doi.org/10.14419/ijet.v7i3.2.14422
