A Study of Web Information Extraction Technology Based on Beautiful Soup

Chunmei Zheng; Guomei He; Zuojie Peng

Journal Article, Open Access

Journal of Computers (2015) 10(6) 381-387

DOI: 10.17706/jcp.10.6.381-387

Abstract

Following a comparative analysis of common web information retrieval technologies, this article discusses the principles and applications of Beautiful Soup, a vertical information search technology based on the DOM tree structure. Using examples from a working system and focusing on its architecture and core technology, the article describes how to use Beautiful Soup to perform deep information retrieval on partially structured webpage data: obtaining targeted information, reorganizing it, and sending it to users via text message. The test results show that the web crawler achieved over 95% accuracy, satisfying the needs of commercial application.
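
The extraction step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the sample HTML, the tag and class names, and the helper functions `extract_items` and `format_message` are all hypothetical, and the text-message delivery step is reduced to building a short string. It assumes the `bs4` package is installed and uses Python's built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Hypothetical, partially structured page fragment (not taken from the paper).
SAMPLE_HTML = """
<html><body>
  <div id="listings">
    <div class="item"><h2>Widget A</h2><span class="price">12.50</span></div>
    <div class="item"><h2>Widget B</h2><span class="price">8.00</span></div>
    <div class="item"><h2>Broken entry</h2></div>
  </div>
</body></html>
"""


def extract_items(html):
    """Walk the parsed DOM tree and collect title/price records."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for node in soup.find_all("div", class_="item"):
        title = node.find("h2")
        price = node.find("span", class_="price")
        # Skip entries that lack an expected field instead of failing.
        if title is None or price is None:
            continue
        records.append({
            "title": title.get_text(strip=True),
            "price": float(price.get_text(strip=True)),
        })
    return records


def format_message(records):
    """Reorganize the records into one short, message-sized string."""
    return "; ".join(f"{r['title']}: {r['price']:.2f}" for r in records)


records = extract_items(SAMPLE_HTML)
message = format_message(records)
print(message)
```

Skipping incomplete entries rather than raising an error is one simple way to cope with pages that are only partly structured; a real crawler would also need page fetching, error handling, and a delivery channel, none of which are shown here.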

Citation (APA)

Zheng, C., He, G., & Peng, Z. (2015). A Study of Web Information Extraction Technology Based on Beautiful Soup. Journal of Computers, 10(6), 381–387. https://doi.org/10.17706/jcp.10.6.381-387
