Stream processing is used in many fields. In big data, stream aggregation is a popular processing technique, but it suffers serious setbacks when the order in which events (e.g., stream elements) occur differs from the order in which they arrive at the system. Such data streams are called 'non-FIFO streams'. This phenomenon usually occurs in distributed environments because of factors such as network disruptions and delays. Many analysis scenarios require efficient processing of such non-FIFO streams to meet various data processing requirements. This paper proposes CPiX, an efficient, scalable, checkpoint-based bidirectional indexing approach for faster real-time analysis over non-FIFO streams. CPiX maintains partial aggregation results per checkpoint in an on-demand manner, and it needs less time and space than the state-of-the-art approach. Extensive experiments confirm that CPiX handles out-of-order streams very efficiently and is, on average, about 3.8 times faster than the state-of-the-art approach while consuming less memory.
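To make the idea of per-checkpoint partial aggregation concrete, the following is a minimal sketch and not the CPiX data structure itself: the abstract does not describe the bidirectional index, so it is not reproduced here. The class name `CheckpointSum`, the parameter `checkpoint_len`, the choice of a sum as the aggregate, and the restriction to checkpoint-aligned windows are all assumptions made for illustration. The sketch shows only why a late event is cheap to absorb: it updates the single partial result of the checkpoint its event time falls into.

```python
from collections import defaultdict


class CheckpointSum:
    """Windowed sum over out-of-order events using per-checkpoint partial sums.

    Illustrative assumption only; this is not the CPiX index from the paper.
    """

    def __init__(self, checkpoint_len: int) -> None:
        self.checkpoint_len = checkpoint_len  # width of one checkpoint in time units
        self.partials = defaultdict(int)      # checkpoint id -> partial sum

    def insert(self, event_time: int, value: int) -> None:
        # A late event only touches the partial of the checkpoint it belongs to.
        self.partials[event_time // self.checkpoint_len] += value

    def query(self, start: int, end: int) -> int:
        # In this sketch the window [start, end) must align with checkpoint boundaries.
        if start % self.checkpoint_len or end % self.checkpoint_len:
            raise ValueError("window must align with checkpoint boundaries")
        first = start // self.checkpoint_len
        last = end // self.checkpoint_len
        # Combine the partial results of every checkpoint covered by the window.
        return sum(self.partials.get(cp, 0) for cp in range(first, last))


agg = CheckpointSum(checkpoint_len=10)
# Events are (event_time, value); the last one arrives out of order.
for event_time, value in [(1, 5), (12, 7), (25, 1), (3, 2)]:
    agg.insert(event_time, value)
window_total = agg.query(0, 20)
```

Here the late event `(3, 2)` is folded into the first checkpoint without recomputing anything else, and `agg.query(0, 20)` combines two partial sums instead of rescanning the raw events.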
CITATION STYLE
Bou, S., Kitagawa, H., & Amagasa, T. (2022). CPiX: Real-Time Analytics Over Out-of-Order Data Streams by Incremental Sliding-Window Aggregation. IEEE Transactions on Knowledge and Data Engineering, 34(11), 5239–5250. https://doi.org/10.1109/TKDE.2021.3054898