An Earthquake Emergency Web Data Cleaning and Classification Method Based on Word Frequency and Position Weighting

Shuai Liu; Meng Huang; Chenxi Li; Wenchao Lv; Zhonghao Wang

Journal Article, Open Access

Computational Intelligence and Neuroscience (2022) 2022

DOI: 10.1155/2022/6555392

Abstract

The speed with which earthquake emergency web document data can be cleaned is one of the key factors affecting emergency rescue decision-making. Data classification is the core process of data cleaning, and its efficiency determines the speed of data cleaning. Drawing on earthquake emergency web document data and HTML structural features, this article combines the TF-IDF algorithm with an information calculation model, improves the word frequency factor and position factor parameters, and proposes P-TF-IDF, a weighted frequency algorithm for earthquake emergency web documents. Filtering out low-frequency words and optimizing the N-gram feature word vectors of the FastText model effectively improves the efficiency of web document data classification. For the classified text data, missing-data recognition rules, data classification rules, and data repair rules are used to design an artificial-intelligence-based data cleaning framework for earthquake emergency network information. The framework detects invalid values in data sets, performs data comparison and redundancy judgment, resolves data conflicts and data errors, and generates a complete data set without duplicates. The data cleaning framework not only completes the fusion of earthquake emergency network information but also provides a data foundation for the visualization of earthquake emergency data.
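
The abstract does not give the P-TF-IDF formula, so the following is only a minimal sketch of the general idea of weighting term frequency by HTML position before applying TF-IDF. It is not the authors' algorithm. The region names, the values in `POSITION_WEIGHTS`, the smoothed IDF form, and the `min_df` cutoff for dropping rare words are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical position weights per HTML region (assumed, not from the paper).
POSITION_WEIGHTS = {"title": 3.0, "h1": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weighted_term_counts(doc):
    """Return position-weighted term counts for one document.

    `doc` maps an HTML region name (e.g. "title", "body") to its text.
    Each occurrence of a term adds the weight of the region it appears in.
    """
    counts = Counter()
    for region, text in doc.items():
        weight = POSITION_WEIGHTS.get(region, 1.0)
        for token in tokenize(text):
            counts[token] += weight
    return counts


def position_weighted_tfidf(docs, min_df=1):
    """Score terms per document with position-weighted TF times smoothed IDF.

    Terms that occur in fewer than `min_df` documents are dropped, as a
    simple stand-in for filtering out low-frequency words.
    """
    per_doc = [weighted_term_counts(doc) for doc in docs]

    # Document frequency: number of documents containing each term.
    df = Counter()
    for counts in per_doc:
        df.update(counts.keys())

    n_docs = len(docs)
    scores = []
    for counts in per_doc:
        total = sum(counts.values())
        doc_scores = {}
        for term, count in counts.items():
            if df[term] < min_df:
                continue
            # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            doc_scores[term] = (count / total) * idf
        scores.append(doc_scores)
    return scores
```

With two toy documents such as `{"title": "Earthquake rescue", "body": "rescue teams arrive"}` and `{"title": "Weather report", "body": "rain teams delayed"}`, a term that appears in the title of the first document scores higher than a term that appears once in its body, which is the effect position weighting is meant to produce.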

Cite

Liu, S., Huang, M., Li, C., Lv, W., & Wang, Z. (2022). An Earthquake Emergency Web Data Cleaning and Classification Method Based on Word Frequency and Position Weighting. Computational Intelligence and Neuroscience, 2022. https://doi.org/10.1155/2022/6555392
