Abstract
Against a comparative analysis of common web information retrieval technologies, this article discusses the principles and applications of Beautiful Soup, a vertical information search technology based on the DOM tree structure. Using examples from a working system and focusing on system architecture and core technology, it describes how to use Beautiful Soup to perform deep information retrieval on semi-structured webpage data, extract targeted information, reorganize it, and send it to users via text message. Test results show that the web crawler achieved over 95% accuracy, which meets the requirements of commercial application.
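The pipeline the abstract outlines (parse a page into a DOM tree, pull out targeted fields, reorganize them into a short text message) might look roughly like the sketch below. It is an illustration only, not the authors' system: the sample markup, the class names, and the helpers `extract_listings` and `format_message` are all hypothetical, and the sending step is left out. It assumes the `bs4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical semi-structured page: markup, class names, and values
# are invented for illustration and do not come from the paper.
SAMPLE_HTML = """
<html><body>
  <div class="listing">
    <h2 class="title">Train G101</h2>
    <span class="depart">08:00</span>
    <span class="price">553.0</span>
  </div>
  <div class="listing">
    <h2 class="title">Train G103</h2>
    <span class="depart">09:10</span>
    <span class="price">553.0</span>
  </div>
  <div class="ad">Sponsored content</div>
</body></html>
"""


def extract_listings(html):
    """Walk the DOM tree and collect one record per 'listing' node."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for node in soup.find_all("div", class_="listing"):
        title = node.find("h2", class_="title")
        depart = node.find("span", class_="depart")
        price = node.find("span", class_="price")
        # Skip nodes that lack any expected field instead of failing.
        if not (title and depart and price):
            continue
        records.append({
            "title": title.get_text(strip=True),
            "depart": depart.get_text(strip=True),
            "price": price.get_text(strip=True),
        })
    return records


def format_message(records, limit=160):
    """Reorganize records into a compact text body.

    The 160-character default is an assumption (a single GSM-7 SMS
    segment); the paper's actual message format is not given here.
    """
    lines = [f"{r['title']} {r['depart']} {r['price']}" for r in records]
    return "\n".join(lines)[:limit]


if __name__ == "__main__":
    print(format_message(extract_listings(SAMPLE_HTML)))
```

Selecting nodes by tag and class keeps unrelated blocks, such as the advertisement `div`, out of the result, which is the main appeal of working on the DOM tree rather than on raw text.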
Cite
Zheng, C., He, G., & Peng, Z. (2015). A Study of Web Information Extraction Technology Based on Beautiful Soup. Journal of Computers, 10(6), 381–387. https://doi.org/10.17706/jcp.10.6.381-387