Abstract
Collecting and preserving social media information is an active topic in web archiving. This paper compares the different methods for collecting social media information, analyzes the key techniques in the three main stages of the collection process (collection, evaluation, and preservation), and proposes solutions to the problems that arise in each. The analysis identifies the collection method best suited to social media information. To work around the limits that social websites place on API call frequency, the paper proposes a multiplexing mechanism; it also proposes a naive Bayesian algorithm for spam filtering and MongoDB-based distributed storage for the massive volume of collected data.
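To make the spam-filtering step more concrete, the following is a minimal sketch of a naive Bayes text classifier with Laplace smoothing. It is not the paper's implementation: the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the `spam`/`ham` labels, and the smoothing parameter `alpha` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Hypothetical multinomial naive Bayes filter, for illustration only."""

    LABELS = ("spam", "ham")

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, text, label):
        # Count one document and its words under the given label.
        self.doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def log_score(self, text, label):
        # log P(label) + sum of log P(word | label) over known words.
        # Requires at least one training document for this label.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for word in tokenize(text):
            if word in self.vocab:  # words never seen in training are skipped
                count = self.word_counts[label][word]
                score += math.log((count + self.alpha) / denom)
        return score

    def classify(self, text):
        # Pick the label with the higher log-posterior score.
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


# Toy training data, invented for this sketch.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free prize now", "spam")
spam_filter.train("free followers click here", "spam")
spam_filter.train("meeting notes from the archive workshop", "ham")
spam_filter.train("new paper on web archive preservation", "ham")
```

With this toy data, `spam_filter.classify(...)` labels a post by comparing the two log scores. A real collection pipeline would need a much larger labeled corpus and a tokenizer suited to the language of the posts.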
Cite
Huang, X. (2021). Research on the Methods and Key Techniques of Web Archive Oriented Social Media Information Collection. Journal of Web Engineering, 20(8), 2473–2490. https://doi.org/10.13052/jwe1540-9589.20812