Optimizing database load and extract for big data era

K. T. Sridhar; M. A. Sakkeer

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8422 LNCS(PART 2) 503-512

DOI: 10.1007/978-3-319-05813-9_34

9 Citations

19 Readers

Abstract

With growing and pervasive interest in Big Data, SQL relational databases need to compete with data management by Hadoop, NoSQL and NoDB. Database research has mainly focused on result generation by query processing, but SQL databases require data in place before queries may be processed. The process of DB loading has been a bottleneck, leading to external ETL/ELT techniques for loading large data sets. This paper focuses on DB engine level techniques for optimizing both data loads and extracts in an MPP, shared-nothing SQL database, dbX, available on in-house commodity hardware and cloud systems. The agile data loading of dbX exploits parallelism at multiple levels to achieve TBs of data load per hour, making it suitable for cloud and continuous actionable knowledge applications. Implementation techniques at DB engine level, extensions to load/extract syntax and performance results are presented. Load optimization techniques also help to speed up data extract to flat files and CTAS type SQL queries. We show linear scale up with cluster scale out for load/extract in public cloud and commodity hardware systems without recourse to database tuning or use of expensive database appliances. © 2014 Springer International Publishing Switzerland.

Cite

Sridhar, K. T., & Sakkeer, M. A. (2014). Optimizing database load and extract for big data era. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8422 LNCS, pp. 503–512). Springer Verlag. https://doi.org/10.1007/978-3-319-05813-9_34
