Automatic SQL to HQL-NoSQL Querying using PostgreSQL and Integrated Hive-HBase


Abstract

The amount of digital data is growing constantly in almost all fields. This data falls into two categories: structured and unstructured. Non-relational databases, known as NoSQL, have become one of the main areas of big data. Many companies still use relational databases such as PostgreSQL and MySQL, but with the rapid evolution and diversity of stored data, they find themselves obliged to adopt big data tools such as HBase or Hive. Big data is characterized by its capacity, speed, and ability to store diverse types of data. Data analysis and high storage capacity are the main reasons companies look for new database systems. Migrating data to a new system requires modifying existing data and applications, and the process is costly because new specialists must be brought in to handle the transition. Furthermore, because old systems draw on varied data sources, e.g., real-time applications that continuously collect new data, companies cannot abandon relational databases entirely. For this reason, we present a system, termed Automatic Query Language (AQL), for migrating data from PostgreSQL to integrated HBase/Hive databases. In addition, we provide a platform that lets any user automatically query PostgreSQL, Hive, and HBase databases using SQL alone. Each query is directed according to which big data tool performs better for it. Once the platform was complete, we were able to insert and select data from both the relational database and the big data components. Join operations were not a problem because complex analytical queries were executed in Hive, which was integrated with HBase. Tests of the AQL system showed that HBase inserts data more efficiently than PostgreSQL and Hive, and that select queries in Hive outperform PostgreSQL for large data sizes, whereas PostgreSQL performs better for small data sizes.
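
The routing behaviour the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Backend` and `route_query`, the row-count cutoff `LARGE_TABLE_ROWS`, and the choice to send every insert to HBase and every non-SELECT, non-INSERT statement to PostgreSQL are all assumptions made for illustration. Only the preferences themselves (HBase for inserts, Hive for joins and large selects, PostgreSQL for small selects) come from the abstract.

```python
import re
from enum import Enum


class Backend(Enum):
    POSTGRESQL = "postgresql"
    HIVE = "hive"
    HBASE = "hbase"


# Hypothetical cutoff between "small" and "big" data; the abstract gives no number.
LARGE_TABLE_ROWS = 1_000_000


def route_query(sql: str, estimated_rows: int,
                large_table_rows: int = LARGE_TABLE_ROWS) -> Backend:
    """Pick a backend for one SQL statement, following the abstract's findings."""
    statement = sql.strip().upper()

    # Abstract: HBase inserts data more efficiently than PostgreSQL and Hive.
    if statement.startswith("INSERT"):
        return Backend.HBASE

    if statement.startswith("SELECT"):
        # Abstract: joins and complex analytical queries were executed in Hive.
        if re.search(r"\bJOIN\b", statement):
            return Backend.HIVE
        # Abstract: Hive selects win on large data, PostgreSQL on small data.
        if estimated_rows >= large_table_rows:
            return Backend.HIVE
        return Backend.POSTGRESQL

    # Assumption: any other statement type stays on the relational database.
    return Backend.POSTGRESQL
```

For example, `route_query("SELECT * FROM orders", 100)` selects PostgreSQL under these assumptions, while the same statement with an estimate of five million rows selects Hive. A real system would also have to translate the SQL text into HQL or HBase operations, which this sketch leaves out.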

Cite

Saada, O., & Daba, J. (2023). Automatic SQL to HQL-NoSQL Querying using PostgreSQL and Integrated Hive-HBase. WSEAS Transactions on Information Science and Applications, 20, 16–27. https://doi.org/10.37394/23209.2023.20.3
