This paper describes the processing flow of the distributed file system HDFS and the computation model MapReduce, which together form the core of the Hadoop distributed computing platform, and introduces the data warehouse tool Hive and the distributed database HBase. Spark is a distributed programming framework for big data that not only implements the map and reduce functions and the computation model of MapReduce, but also provides a richer set of operators. The paper describes the ecosystem of the Hadoop platform based on HDFS, MapReduce, and the data warehouse tool Hive.
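To make the map and reduce functions mentioned above concrete, the following is a minimal single-process sketch of the MapReduce computation model applied to word counting. It is not taken from the paper and does not use the Hadoop or Spark APIs; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions chosen for illustration only.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is simulated in memory.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: fold all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Hypothetical sample input, not data from the paper.
    print(word_count(["hive hbase hive", "spark hive"]))
```

In a distributed setting the map step would run in parallel on blocks of a file stored in HDFS and the grouped pairs would be moved between machines; the sketch keeps everything in one process so that only the shape of the computation is visible.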
CITATION STYLE
Xu, H., Chen, X., & Fan, G. (2020). Ecosystem Description of Hadoop Platform Based on HDFS, MapReduce and Data Warehouse Tool Hive. In Advances in Intelligent Systems and Computing (Vol. 928, pp. 1127–1133). Springer Verlag. https://doi.org/10.1007/978-3-030-15235-2_149