The 'Big Data' of yesterday is the 'data' of today. As technology progresses, new challenges arise and new solutions are developed. With the emergence of Internet of Things applications over the last decade, the field of Data Mining has faced the challenge of processing and analysing data streams in real time and under high data throughput conditions. This is often referred to as the Velocity aspect of Big Data. Whereas there are numerous reviews of Data Stream Mining techniques and applications, very little work surveys data stream processing paradigms and associated technologies, from data collection through to pre-processing and feature processing, from the perspective of the user rather than that of the service provider. In this article, we evaluate a particular type of solution, one that focuses on streaming data and on processing pipelines that permit online analysis of data streams that cannot be stored as-is on the computing platform. We review foundational computational concepts such as distributed computation, fault-tolerant computing, and computational paradigms and architectures. We then review the available technological solutions and applications that pertain to data stream mining, treating them as case studies of these theoretical concepts. We conclude with a discussion of the field of data stream processing and analytics, its future directions, and its research challenges.
Dubuc, T., Stahl, F., & Roesch, E. B. (2021). Mapping the Big Data Landscape: Technologies, Platforms and Paradigms for Real-Time Analytics of Data Streams. IEEE Access, 9, 15351–15374. https://doi.org/10.1109/ACCESS.2020.3046132