One of the purposes of Big Data systems is to support analysis of data gathered from heterogeneous data sources. Since data warehouses have been used to achieve the same goal, they could also be leveraged to provide analysis of Big Data. The problem of adapting data warehouse data and schemata to changes in user requirements and data sources has been studied by many researchers worldwide. However, innovative methods must also be developed to support evolution in Big Data systems. In this paper, we analyze architectures for Big Data processing and analysis described in the literature, with the aim of identifying the most appropriate solution to the evolution problem. We concentrate on four architecture types: data lakes, virtual integration, polystores, and the λ-architecture. In addition, we consider solutions that apply data warehouse/OLAP methods to Big Data processing and analysis. Finally, we describe our proposal for an architecture that supports different kinds of analytical tasks on Big Data retrieved from multiple heterogeneous data sources with different latencies, and that is capable of processing changes in data sources as well as evolving analysis requirements.
Solodovnikova, D., & Niedrite, L. (2020). Handling evolution in big data architectures. Baltic Journal of Modern Computing, 8(1), 21–47. https://doi.org/10.22364/BJMC.2020.8.1.02