As an important distributed real-time computation system, Storm has been widely used in applications such as online machine learning, continuous computation, and distributed RPC. Storm is designed to process massive data streams in real time. However, few studies have evaluated the performance characteristics of Storm clusters. In this paper, we analyze the performance of a Storm cluster mainly from two aspects: hardware configuration and parallelism setting. We identify key factors that affect the throughput and latency of the Storm cluster and evaluate the performance of Storm’s fault-tolerance mechanism, which helps users use the computation system more efficiently.
Yan, H., Sun, D., Gao, S., & Zhou, Z. (2018). Performance Analysis of Storm in a Real-World Big Data Stream Computing Environment. In Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST (Vol. 252, pp. 624–634). Springer Verlag. https://doi.org/10.1007/978-3-030-00916-8_57