Data is growing at an astonishing pace. The increasing use of digital technology by individuals and organizations generates data on a scale that constitutes big data. Big data environments generally use the MapReduce framework, which takes care of job execution in Hadoop. Apache Spark, which can run on top of Hadoop, is becoming a popular framework that raises execution speed through its in-memory runtime environment. This paper proposes a novel CCCa framework that combines classification, clustering, and cache techniques. The quality of the input data is first improved by a data cleansing step. A similarity-based clustering technique then partitions the job data into clusters. The classification phase predicts the behavior of the data, using an artificial neural network (ANN) trained with back propagation to classify the big data. A cache substitution technique is recommended to avoid repeated processing of the same job. The proposed framework consumes less memory and computational time and achieves higher accuracy in predicting the behavior of the dataset.
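The cache idea mentioned above (avoiding repeated processing of the same job) can be illustrated with a minimal sketch. The abstract does not specify the substitution policy or any API, so everything here is an assumption made for illustration: the least-recently-used (LRU) eviction rule, the `JobResultCache` class, the `job_key` helper, and the toy `word_count` job are all hypothetical names, not the authors' implementation.

```python
from collections import OrderedDict
import hashlib
import json


class JobResultCache:
    """Hypothetical job-result cache with LRU substitution (assumed policy)."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._store = OrderedDict()  # key -> cached result, oldest first
        self.hits = 0
        self.misses = 0

    @staticmethod
    def job_key(job_name, params):
        # Identical job name and parameters map to the same key.
        payload = json.dumps({"job": job_name, "params": params}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, job_name, params, compute):
        key = self.job_key(job_name, params)
        if key in self._store:
            # Cache hit: reuse the stored result and mark it as recently used.
            self._store.move_to_end(key)
            self.hits += 1
            return self._store[key]
        # Cache miss: execute the job and store its result.
        self.misses += 1
        result = compute(params)
        self._store[key] = result
        if len(self._store) > self.capacity:
            # Substitute (evict) the least recently used entry.
            self._store.popitem(last=False)
        return result


def word_count(params):
    # Toy stand-in for an expensive big data job.
    return len(params["text"].split())


cache = JobResultCache(capacity=2)
first = cache.run("word_count", {"text": "big data big cache"}, word_count)
second = cache.run("word_count", {"text": "big data big cache"}, word_count)
```

In this sketch the second call is served from the cache without re-running `word_count`. A different substitution rule (for example, frequency-based) could be swapped in by changing only the eviction step.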
CITATION STYLE
Subramanian, S. M., Vijayalakshmi, S., Venkataraman, B., Venkumar, P., & Rathikaa Sre, R. M. (2018). CCCa framework - Classification system in big data environment with clustering and cache concepts. In Advances in Intelligent Systems and Computing (Vol. 614, pp. 44–53). Springer Verlag. https://doi.org/10.1007/978-3-319-60618-7_5