Implementing Scalable Machine Learning Algorithms for Mining Big Data: A State-of-the-Art Survey

Abstract

The growing trend of Big Data drives demand for novel solutions and specifically designed algorithms that filter and process Big Data efficiently, increasingly in real time. Scaling Machine Learning algorithms up to larger datasets and more complex methods therefore calls for distributed parallelism. This book chapter presents a thorough literature review of the distributed, parallel, data-intensive Machine Learning algorithms that have been applied to Big Data so far. The selected algorithms fall into several Machine Learning categories: (i) unsupervised learning, (ii) supervised learning, (iii) semi-supervised learning, and (iv) deep learning. The most popular programming frameworks well suited to parallelizing Machine Learning algorithms, such as MapReduce, PLANET, DryadLINQ, the IBM Parallel Machine Learning Toolbox (PML), and the Compute Unified Device Architecture (CUDA), are cited throughout the review. However, the review focuses mainly on the performance and implementation traits of scalable Machine Learning algorithms rather than on the wide range of framework choices and their trade-offs.

Cite

Skënduli, M. P., Biba, M., & Ceci, M. (2018). Implementing Scalable Machine Learning Algorithms for Mining Big Data: A State-of-the-Art Survey. In Studies in Big Data (Vol. 44, pp. 65–81). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-10-8476-8_4
