Big data refers to datasets so large that they cannot be stored, managed, and analyzed using commonly used software systems. The emergence of smartphones, social networks, and online applications has led to the generation of massive amounts of structured, unstructured, and semi-structured data. Big data analytics has received considerable attention because it offers an opportunity to uncover potential value in large volumes of data. Data preprocessing techniques, when applied prior to analytics, can substantially improve the overall quality of the patterns mined and/or reduce the time required for the actual mining. This paper therefore presents an efficient method for preprocessing data and partitioning a big dataset based on sensitivity parameters. Each partition can then be uploaded to a public or private cloud according to the importance of the data it contains, so the approach supports hybrid cloud storage and processing of big data. The experimental results show that the proposed method preprocesses and partitions data with high accuracy and reduced processing time.
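The abstract does not spell out the preprocessing steps or the sensitivity parameters, so the following is only a minimal sketch of the general idea, under stated assumptions: records are plain dictionaries, "preprocessing" is reduced to dropping incomplete and duplicate records, and a record counts as sensitive if it contains any field from a hypothetical `SENSITIVE_FIELDS` set. The function names and field names are illustrative and are not taken from the paper.

```python
# Hypothetical sensitivity parameters: field names treated as sensitive.
# These names are assumptions for illustration, not from the paper.
SENSITIVE_FIELDS = frozenset({"ssn", "diagnosis"})


def preprocess(records):
    """Drop records with missing values and exact duplicates.

    This stands in for the paper's preprocessing step, whose details
    the abstract does not give.
    """
    seen = set()
    cleaned = []
    for rec in records:
        # Skip records that have an empty or missing value.
        if any(v is None or v == "" for v in rec.values()):
            continue
        # Use the sorted (key, value) pairs as a hashable fingerprint.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def partition_by_sensitivity(records, sensitive_fields=SENSITIVE_FIELDS):
    """Split records into (private, public) partitions.

    A record goes to the private partition if it contains any
    sensitive field; otherwise it goes to the public partition.
    """
    private, public = [], []
    for rec in records:
        if any(field in rec for field in sensitive_fields):
            private.append(rec)
        else:
            public.append(rec)
    return private, public


if __name__ == "__main__":
    raw = [
        {"id": 1, "name": "a", "ssn": "123"},
        {"id": 2, "name": "b"},
        {"id": 2, "name": "b"},   # exact duplicate
        {"id": 3, "name": ""},    # missing value
    ]
    private, public = partition_by_sensitivity(preprocess(raw))
    print("private cloud:", private)
    print("public cloud:", public)
```

In a hybrid deployment of this kind, the private partition would be sent to private cloud storage and the public partition to a public provider; the upload step is omitted here because the abstract names no specific storage service.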
CITATION STYLE
Reena, M. J., & Shajin Nargun, A. (2019). Preprocessing big data for efficient storage and research. International Journal of Recent Technology and Engineering, 8(2 Special issue 3), 11–16. https://doi.org/10.35940/ijrte.B1003.0782S319