In early 2020, the novel coronavirus disease, referred to as COVID-19, broke out. The Chinese people took the most comprehensive and rigorous control measures to fight COVID-19. Local health control departments reported infection data in a timely manner, which helped the public understand the development of the epidemic and take protective measures in advance. However, no study has yet analyzed the transmission characteristics of COVID-19 using structured data from large-scale patient cases together with artificial intelligence. The detailed case data of patients in various regions are primarily recorded in text form, and the report formats differ across provinces and cities, which makes such data difficult to process. To enable analysis of large-scale anonymized patient case data, we propose a method based on natural language processing technology to structure the case data. With the help of a pretrained model and a small number of labeled samples, the proposed method extracts key information from the cases accurately and effectively. By mining the structured patient case data, we analyze in detail the gender and age distribution, the main causes of infection, the characteristics of the incubation period, and epidemic trends. Using big data on travel, we also develop a method to estimate the number of infected individuals in Wuhan before the restrictions were put into effect. This method helps people understand the real epidemic situation and take early protective measures. It also helps government departments make evidence-based decisions, dispatch medical staff, and allocate medical resources as quickly as possible.
Huang, Z., Wang, Z., Jiang, L., Zhang, R., Lei, C., Liu, X., & Xie, X. (2020). Analysis of COVID-19 spread characteristics and infection numbers based on large-scale structured case data. Scientia Sinica Informationis, 50(12), 1882–1902. https://doi.org/10.1360/SSI-2020-0029