Hive is a data warehouse tool for processing structured data in Hadoop. It sits on top of Hadoop and helps to analyze, query, and review Big Data. The execution time of queries has been drastically reduced by using Hadoop MapReduce. This paper presents a detailed comparison of data-model optimization techniques, such as partitioning and bucketing, for improving the processing time of Hive queries. The implementation uses data from the New York Police Portal, with AWS services for storage and Hive, within the Hadoop ecosystem, for querying. Partitioning yields a remarkable improvement in execution time.
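As a rough illustration of why partitioning can reduce query time, the sketch below simulates the two data models in plain Python rather than in Hive itself. Everything in it is an assumption made for illustration: the sample rows, the `borough` partition column, the function names, and the use of CRC32 as a stand-in for Hive's own bucketing hash. None of it is taken from the paper's dataset or implementation.

```python
from collections import defaultdict
import zlib

# Hypothetical rows: (complaint_id, borough, offense).
# Illustrative only; not the paper's New York Police Portal data.
ROWS = [
    (1, "BRONX", "ROBBERY"),
    (2, "QUEENS", "BURGLARY"),
    (3, "BRONX", "ASSAULT"),
    (4, "BROOKLYN", "ROBBERY"),
    (5, "QUEENS", "ROBBERY"),
]


def partition_rows(rows, key_index):
    """Group rows by the value of one column, one group per distinct value."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key_index]].append(row)
    return dict(parts)


def bucket_rows(rows, key_index, num_buckets):
    """Spread rows over a fixed number of buckets by hashing one column.

    CRC32 is an assumed stand-in hash, chosen because it is deterministic.
    """
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        h = zlib.crc32(str(row[key_index]).encode("utf-8"))
        buckets[h % num_buckets].append(row)
    return buckets


def count_full_scan(rows, borough):
    """Return (matching rows, rows scanned) when every row must be read."""
    matches = sum(1 for r in rows if r[1] == borough)
    return matches, len(rows)


def count_partition_scan(parts, borough):
    """Return (matching rows, rows scanned) when only one partition is read."""
    part = parts.get(borough, [])
    return len(part), len(part)


partitions = partition_rows(ROWS, key_index=1)
buckets = bucket_rows(ROWS, key_index=0, num_buckets=2)
```

In this toy model, a filter on the partition column reads only the rows of the matching partition, whereas the unpartitioned layout reads every row to answer the same question. Bucketing instead fixes the number of groups in advance and assigns each row by hash, which keeps group sizes bounded regardless of how many distinct key values exist.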
Sharma, M., & Kaur, J. (2020). Performance analysis of queries with hive optimized data models. In Lecture Notes in Electrical Engineering (Vol. 597, pp. 687–698). Springer. https://doi.org/10.1007/978-3-030-29407-6_49