A Study of Web Information Extraction Technology Based on Beautiful Soup

  • Zheng C
  • et al.

Abstract

Against the background of a comparative analysis of common web information retrieval technologies, this article discusses the principles and applications of Beautiful Soup, a vertical information search technology based on the DOM tree structure. Drawing on examples from a working system and focusing on its architecture and core technology, the article describes how to use Beautiful Soup to perform deep information retrieval on semi-structured webpage data, extract targeted information, reorganize it, and send it to users via text message. The test results show that the web crawler achieved over 95% accuracy, meeting the needs of commercial application.
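The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the HTML fragment, the `notice` class name, the helper names `extract_notices` and `to_message`, and the 70-character message limit are all assumptions made for illustration. Only the Beautiful Soup calls (`BeautifulSoup`, `find_all`, `find`, `get_text`) are real library API, and actual delivery by text message is left out.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment; the pages crawled in the paper are not shown here.
SAMPLE_HTML = """
<html><body>
  <div class="notice">
    <h2>Course change</h2>
    <span class="date">2015-03-02</span>
    <p>Lecture moved to Room 204.</p>
  </div>
  <div class="notice">
    <h2>Library hours</h2>
    <span class="date">2015-03-05</span>
    <p>Open until 22:00 this week.</p>
  </div>
  <div class="ad">Unrelated banner</div>
</body></html>
"""


def extract_notices(html):
    """Walk the parsed DOM tree and keep only the targeted blocks."""
    # "html.parser" is the parser bundled with Python, so no extra install is needed.
    soup = BeautifulSoup(html, "html.parser")
    notices = []
    for block in soup.find_all("div", class_="notice"):
        title = block.find("h2")
        date = block.find("span", class_="date")
        body = block.find("p")
        # Semi-structured pages may omit fields; skip blocks missing the essentials.
        if title is None or body is None:
            continue
        notices.append({
            "title": title.get_text(strip=True),
            "date": date.get_text(strip=True) if date else "",
            "body": body.get_text(strip=True),
        })
    return notices


def to_message(notice, limit=70):
    """Reorganize one record into a short text; the limit is an assumed value."""
    text = f"{notice['date']} {notice['title']}: {notice['body']}".strip()
    return text[:limit]


messages = [to_message(n) for n in extract_notices(SAMPLE_HTML)]
for message in messages:
    print(message)
```

Selecting by tag and class keeps the unrelated `ad` block out of the result, which is the sense in which the retrieval is targeted rather than a full-page dump.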

Cite

Zheng, C., He, G., & Peng, Z. (2015). A Study of Web Information Extraction Technology Based on Beautiful Soup. Journal of Computers, 10(6), 381–387. https://doi.org/10.17706/jcp.10.6.381-387
