MAP-REDUCE BASED DISTANCE WEIGHTED K-NEAREST NEIGHBOR MACHINE LEARNING ALGORITHM FOR BIG DATA APPLICATIONS

E. Gothai; V. Muthukumaran; K. Valarmathi; V. E. Sathishkumar; N. Thillaiarasu; P. Karthikeyan

Journal Article, Open Access

Scalable Computing (2022) 23(4) 129-145

DOI: 10.12694/scpe.v23i4.1987

12 Citations

28 Readers

Abstract

With the evolution of Internet standards and advances in Internet and mobile technologies, especially since web 4.0, more and more web and mobile applications have emerged, such as e-commerce, social networks, online gaming and Internet of Things based applications. Because these applications are deployed and accessed concurrently on the Internet and on mobile devices, the volume and variety of generated data increase exponentially, and the era of Big Data has come into existence. Presently available data structures and data analysis algorithms cannot handle such Big Data. Hence, there is a need for scalable, flexible, parallel and intelligent data analysis algorithms to handle and analyze complex, massive data. In this article, we propose a novel distributed supervised machine learning algorithm, called MR-DWkNN, based on the MapReduce programming model and the Distance Weighted k-Nearest Neighbor algorithm, to process and analyze Big Data in a Hadoop cluster environment. The proposed distributed algorithm is based on supervised learning and performs both regression and classification tasks on large volumes of data in Big Data applications. Three performance metrics are used to evaluate the proposed MR-DWkNN algorithm: Root Mean Squared Error (RMSE) and the coefficient of determination (R²) for regression tasks, and accuracy for classification tasks. Extensive experimental results show an average increase of 3% to 4.5% in prediction and classification performance compared with the standard distributed k-NN algorithm, together with a considerable decrease in RMSE and good parallelism characteristics in terms of scalability and speedup, which demonstrates the effectiveness of the algorithm in Big Data prediction and classification applications.
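The abstract does not spell out the internals of MR-DWkNN, so the following is only a minimal single-process sketch of how a distance-weighted k-NN could be split into a map step and a reduce step. It assumes Euclidean distance, inverse-distance weights, and one map call per data split; the function names (`map_partition`, `reduce_neighbors`, `predict_class`, `predict_value`) are hypothetical and are not taken from the article or from the Hadoop API.

```python
import heapq
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def map_partition(partition, query, k):
    """Map step (assumed): k nearest (distance, target) pairs within one data split."""
    candidates = ((euclidean(features, query), target) for features, target in partition)
    return heapq.nsmallest(k, candidates, key=lambda pair: pair[0])


def reduce_neighbors(local_results, k):
    """Reduce step (assumed): merge per-split candidates into the global k nearest."""
    merged = (pair for result in local_results for pair in result)
    return heapq.nsmallest(k, merged, key=lambda pair: pair[0])


def inverse_distance_weight(distance, eps=1e-9):
    """Assumed weighting scheme: closer neighbors count more; eps avoids division by zero."""
    return 1.0 / (distance + eps)


def predict_class(neighbors):
    """Classification: the label with the largest total weight wins."""
    votes = defaultdict(float)
    for distance, label in neighbors:
        votes[label] += inverse_distance_weight(distance)
    return max(votes, key=votes.get)


def predict_value(neighbors):
    """Regression: weighted average of the neighbors' target values."""
    weights = [inverse_distance_weight(d) for d, _ in neighbors]
    return sum(w * y for w, (_, y) in zip(weights, neighbors)) / sum(weights)


# Two toy splits standing in for blocks of a distributed data set.
splits = [
    [((0.0, 0.0), "a"), ((5.0, 5.0), "b")],
    [((0.2, 0.1), "a"), ((1.0, 1.0), "b"), ((6.0, 5.0), "b")],
]
query = (0.9, 0.9)
local = [map_partition(split, query, k=3) for split in splits]
neighbors = reduce_neighbors(local, k=3)
print(predict_class(neighbors))
```

In this toy run the three global nearest neighbors carry the labels "b", "a" and "a", so an unweighted majority vote would choose "a"; the inverse-distance weights let the single very close "b" point dominate, and the sketch prints "b". In a real Hadoop job the map and reduce functions would run on separate nodes and exchange key-value pairs, which this sketch does not model.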

Citation (APA)

Gothai, E., Muthukumaran, V., Valarmathi, K., Sathishkumar, V. E., Thillaiarasu, N., & Karthikeyan, P. (2022). MAP-REDUCE BASED DISTANCE WEIGHTED K-NEAREST NEIGHBOR MACHINE LEARNING ALGORITHM FOR BIG DATA APPLICATIONS. Scalable Computing, 23(4), 129–145. https://doi.org/10.12694/scpe.v23i4.1987

Readers' Seniority

PhD / Post grad / Masters / Doc: 4 (50%)
Professor / Associate Prof.: 2 (25%)
Lecturer / Post doc: 1 (13%)
Researcher: 1 (13%)

Readers' Discipline

Computer Science: 4 (50%)
Business, Management and Accounting: 2 (25%)
Engineering: 2 (25%)

Article Metrics

Social Media

Shares, Likes & Comments: 1
