Performance analysis of queries with Hive optimized data models


Abstract

Hive, a data warehouse tool, handles the processing of structured data in Hadoop. It sits on top of Hadoop and helps to analyze, query, and review Big Data. Using Hadoop MapReduce has drastically reduced the execution time of queries. This paper presents a detailed comparison of data model optimization techniques, such as partitioning and bucketing, for improving the processing time of Hive queries. The implementation uses data from the New York Police Portal, with AWS services for storage and the Hive tool in the Hadoop ecosystem for querying. Partitioning showed a remarkable improvement in execution time.
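
As a rough illustration of why partitioning can cut query time, the following minimal Python sketch mimics the two ideas on a handful of in-memory rows. It is not the paper's implementation and does not call Hive: the records, the column names (`borough`, `offense`, `id`), and the helper functions are all hypothetical, and `zlib.crc32` is only a stand-in for Hive's own bucketing hash.

```python
from collections import defaultdict
import zlib

# Hypothetical complaint records; column names are illustrative assumptions,
# not the schema used in the paper.
ROWS = [
    {"id": 1, "borough": "BRONX", "offense": "ROBBERY"},
    {"id": 2, "borough": "QUEENS", "offense": "BURGLARY"},
    {"id": 3, "borough": "BRONX", "offense": "ASSAULT"},
    {"id": 4, "borough": "BROOKLYN", "offense": "ROBBERY"},
    {"id": 5, "borough": "QUEENS", "offense": "ROBBERY"},
    {"id": 6, "borough": "BRONX", "offense": "BURGLARY"},
]


def partition_by(rows, column):
    """Group rows by the value of one column, like one directory per partition value."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[column]].append(row)
    return dict(parts)


def bucket_by(rows, column, num_buckets):
    """Spread rows over a fixed number of buckets by hashing one column.

    crc32 is a stand-in hash chosen for determinism; Hive uses its own function.
    """
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        digest = zlib.crc32(str(row[column]).encode("utf-8"))
        buckets[digest % num_buckets].append(row)
    return buckets


def full_scan(rows, borough):
    """Unpartitioned lookup: every row is read to evaluate the filter."""
    hits = [row for row in rows if row["borough"] == borough]
    return hits, len(rows)


def pruned_scan(partitions, borough):
    """Partitioned lookup: only the matching partition is read."""
    part = partitions.get(borough, [])
    return list(part), len(part)


partitions = partition_by(ROWS, "borough")
buckets = bucket_by(ROWS, "id", 2)

hits_full, scanned_full = full_scan(ROWS, "BRONX")
hits_pruned, scanned_pruned = pruned_scan(partitions, "BRONX")
print("rows scanned without partitioning:", scanned_full)
print("rows scanned with partitioning:", scanned_pruned)
```

Both lookups return the same rows, but the partitioned one reads only the rows stored under the requested value. Bucketing does not skip data for this kind of filter; it fixes the number of files per table or partition, which is typically useful for sampling and joins on the bucketed column.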

Sharma, M., & Kaur, J. (2020). Performance analysis of queries with hive optimized data models. In Lecture Notes in Electrical Engineering (Vol. 597, pp. 687–698). Springer. https://doi.org/10.1007/978-3-030-29407-6_49
