Hadoop Distributed File System (HDFS) is the primary storage system of Hadoop. Many applications use HDFS as their underlying file system because of its portability and fault tolerance. The most popular benchmark for measuring the I/O performance of HDFS is TestDFSIO, which depends on the MapReduce framework. However, users lack a standardized benchmark suite that can help them evaluate the performance of standalone HDFS and compare it across different networks and cluster configurations. In this paper, we design and develop a micro-benchmark suite that can be used to evaluate the performance of HDFS operations. We also illustrate how this suite can be used to evaluate the performance of HDFS installations over different networks/protocols and parameter configurations on modern clusters. © 2014 Springer-Verlag Berlin Heidelberg.
Islam, N. S., Lu, X., Wasi-Ur-Rahman, M., Jose, J., & Panda, D. K. (2014). A micro-benchmark suite for evaluating HDFS operations on modern clusters. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8163 LNCS, pp. 129–147). Springer Verlag. https://doi.org/10.1007/978-3-642-53974-9_12