Review on analysis of the application areas and algorithms used in data wrangling in big data

Chiranjivi Bashya; Malka N. Halgamuge; Azeem Mohammad

Book Chapter

Springer Science and Business Media Deutschland GmbH, (2018), 337-353

DOI: 10.1007/978-3-319-70688-7_14

0 Citations

8 Readers

Abstract

This study performed a content analysis of data retrieved from 30 peer-reviewed scientific publications (1996–2016) that describe the algorithm models applied to data wrangling in Big Data. The analysis explores and evaluates the algorithm models applied across data applications in the area of data wrangling. Data wrangling unifies messy and complex data through a planned procedure that involves clustering and grouping untidy and intricate data sets for easy access, so that trending themes useful for business or company planning can be identified. Data wrangling is not only for business use: it also serves individuals, business users who consume data directly in reports, and schemes that further process data by streaming it into targets such as data warehouses or data lakes. The method sets up easy access to, and analysis of, all untidy data. Data streaming procedures are exceptionally useful for planning in small and big businesses all around the world that use data constantly to identify emerging trends, structures and schemes, which inadvertently make a difference when sustaining and customising a business, simply by streaming data into warehouses, or in other words, data storage pools. This study found that commonly used statistical figures and algorithms are used by the major data applications; however, the information technology area certainly faces security challenges. Data wrangling algorithms used in different data applications, such as medical data, textual data, financial data, topological data, governmental data, educational science and galaxy data, could use clustering methods, as these are more effective than others. This study found significant comparisons and contrasts between algorithms and data applications, and evaluated them to identify certain methods that are superior to others.
Moreover, the study shows that there is significant use of medical data in the big data research area. Our results show that data wrangling with clustering algorithms can solve medical data storage issues. Similarly, clustering algorithms are frequently used to cluster data sets in order to analyze information from raw data. Fifty percent of the literature found clustering algorithms beneficial for data wrangling across different data applications, where they are used to thoroughly analyze and evaluate the data. After the analysis of clustering algorithms, suggestions are made for medical data applications that use data wrangling. Graphic Abstract: A pictorial representation of the abstract of this research is shown in Fig. 1.

Cite

Bashya, C., Halgamuge, M. N., & Mohammad, A. (2018). Review on analysis of the application areas and algorithms used in data wrangling in big data. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 14, pp. 337–353). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-70688-7_14
