Big data refers to data sets so large that new technologies are required to process them quickly. Processing such volumes of data with traditional technologies is ineffective. Big data gives businesses an added advantage and enables better service delivery, and it is changing the decision-making process of many business organizations. It also poses many challenges related to the 5Vs: Volume, Velocity, Variety, Veracity, and Value. Hadoop is a big data tool used to process large amounts of data. It has many subcomponents that work together to achieve faster processing. Apache Hive and Apache Pig are tools in the Hadoop ecosystem that access data in different ways: Apache Hive relies on SQL-like queries, while Apache Pig uses scripts. Both tools use either the MapReduce or the Apache Tez framework to access data. In this paper, we analyze how these two frameworks use the Hadoop Distributed File System (HDFS) by comparing them both theoretically and empirically.
CITATION
Singh, R., & Kaur, P. J. (2016). Theoretical and empirical analysis of usage of MapReduce and Apache Tez in big data. In Smart Innovation, Systems and Technologies (Vol. 51, pp. 529–536). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-30927-9_52