A Hadoop-based packet trace processing tool

Yeonhee Lee; Wonchul Kang; Youngseok Lee

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6613 LNCS, pp. 51–63 (2011)

DOI: 10.1007/978-3-642-20305-3_5

Abstract

Internet traffic measurement and analysis have become significantly challenging because large packet trace files captured on fast links cannot be easily handled on a single server with limited computing and memory resources. Hadoop is a popular open-source cloud computing platform that provides a software programming framework, MapReduce, and a distributed filesystem, HDFS, both of which are useful for analyzing large data sets. In this paper, we therefore present a Hadoop-based packet processing tool that scales to large data sets by harnessing MapReduce and HDFS. To process large packet trace files in Hadoop efficiently, we devised a new binary input format, called PcapInputFormat, which hides the complexity of processing binary-formatted packet data and parsing each packet record. We also designed efficient traffic analysis MapReduce job models consisting of map and reduce functions. To evaluate our tool, we compared its computation time with that of a well-known packet-processing tool, CoralReef, and showed that our approach is more affordable for processing a large set of packet data. © 2011 Springer-Verlag Berlin Heidelberg.
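The abstract names two pieces: a reader that turns binary pcap data into per-packet records, and map and reduce functions that compute traffic statistics over those records. The following is a minimal single-process sketch of that idea in Python. It is not the authors' PcapInputFormat (which targets Hadoop) and the abstract does not specify their job models. The names `read_packets`, `map_packet`, `reduce_bytes`, and `run_job` are hypothetical, and the sketch assumes a classic little-endian pcap file, Ethernet framing without VLAN tags, and IPv4 packets. The statistic chosen here (bytes per destination address) is likewise only an illustrative assumption.

```python
import struct
from collections import defaultdict

# Classic pcap layout (assumed little-endian, microsecond timestamps).
PCAP_GLOBAL_HEADER = struct.Struct("<IHHiIII")  # 24-byte file header
PCAP_RECORD_HEADER = struct.Struct("<IIII")     # ts_sec, ts_usec, incl_len, orig_len
PCAP_MAGIC_LE = 0xA1B2C3D4

ETHERTYPE_IPV4 = 0x0800
ETH_HEADER_LEN = 14
MIN_IPV4_FRAME = ETH_HEADER_LEN + 20  # Ethernet header + minimal IPv4 header


def read_packets(data):
    """Yield (orig_len, frame) per record; stands in for a record reader."""
    if len(data) < PCAP_GLOBAL_HEADER.size:
        raise ValueError("buffer too short for a pcap file header")
    magic = PCAP_GLOBAL_HEADER.unpack_from(data, 0)[0]
    if magic != PCAP_MAGIC_LE:
        raise ValueError("this sketch handles little-endian classic pcap only")
    offset = PCAP_GLOBAL_HEADER.size
    while offset + PCAP_RECORD_HEADER.size <= len(data):
        _sec, _usec, incl_len, orig_len = PCAP_RECORD_HEADER.unpack_from(data, offset)
        offset += PCAP_RECORD_HEADER.size
        frame = data[offset:offset + incl_len]
        if len(frame) < incl_len:
            break  # truncated final record: stop rather than guess
        offset += incl_len
        yield orig_len, frame


def map_packet(orig_len, frame):
    """Map step: emit (destination IPv4 address, on-wire frame length)."""
    if len(frame) < MIN_IPV4_FRAME:
        return
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    if ethertype != ETHERTYPE_IPV4:
        return  # skip non-IPv4 frames (ARP, IPv6, ...)
    # Destination address sits at byte 16 of the IPv4 header.
    dst_bytes = frame[ETH_HEADER_LEN + 16:ETH_HEADER_LEN + 20]
    yield ".".join(str(b) for b in dst_bytes), orig_len


def reduce_bytes(key, values):
    """Reduce step: total bytes observed for one destination address."""
    return key, sum(values)


def run_job(data):
    """Run map, an in-memory shuffle, and reduce over one pcap buffer."""
    groups = defaultdict(list)
    for orig_len, frame in read_packets(data):
        for key, value in map_packet(orig_len, frame):
            groups[key].append(value)
    return dict(reduce_bytes(k, v) for k, v in groups.items())
```

Calling `run_job(open("trace.pcap", "rb").read())` on a small trace (the filename is a placeholder) returns a dictionary from destination address to total bytes. A single-process sketch sidesteps the main difficulty a distributed binary input format has to solve: when a large file is split into blocks, a split can begin in the middle of a packet record, so the reader must locate the next valid record boundary before parsing.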

Citation (APA)

Lee, Y., Kang, W., & Lee, Y. (2011). A Hadoop-based packet trace processing tool. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6613 LNCS, pp. 51–63). https://doi.org/10.1007/978-3-642-20305-3_5
