The research of web parallel information extraction based on Hadoop

Abstract

Big data, driven by three major trends (cloud computing, social computing, and mobile computing), is reshaping business processes, IT infrastructure, and the way enterprise, customer, and Internet information is captured and used. To extract big data from the Internet, an enterprise needs a scalable, flexible, and manageable data infrastructure. This paper therefore analyzes and designs a large-scale information extraction system based on the Hadoop framework. Measurements show that cluster-based extraction of large volumes of data performs considerably better than single-machine extraction, with high reliability and scalability. Moreover, this research provides better technical solutions for extracting Web information and sensitive information.

Cite

Ma, S., Shi, Q., & Xu, L. (2014). The research of web parallel information extraction based on Hadoop. Advances in Intelligent Systems and Computing, 255, 341–348. https://doi.org/10.1007/978-81-322-1759-6_41
