Abstract
Microblogs, e.g., tweets, reviews, or comments on news websites and social media, have become so popular among web users that many applications exploit them for different types of analysis. The distinguishing characteristics of microblogs have motivated a lot of research on managing such data. However, the technology developed for microblogs still consists of scattered efforts, which leaves several data management gaps that limit end-to-end support for microblog-centric applications. Our research aims to provide a holistic system approach to managing microblogs data, so that whoever builds new functionality on microblogs can seamlessly exploit a single data management system to power their applications. In this paper, we present a full proposal for Kite, the first holistic system that provides end-to-end management for microblogs data. Kite aims to fill the gap in existing systems to support scalable queries with selective search criteria on data that arrives at high velocity and adds up to large volumes (billions of records). To this end, the system will exploit and extend the infrastructure of the Apache Spark system. Throughout the paper, we present a roadmap of the accomplished contributions, ongoing contributions towards the first-cut realization of Kite, and future contributions to iteratively improve the system's maturity and capabilities.
Cite
Magdy, A. (2016). Scalable microblogs data management. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Vol. 26-June-01-July-2016, pp. 32–36). Association for Computing Machinery. https://doi.org/10.1145/2926693.2929898