A study on the quality of novel coronavirus (COVID-19) official datasets

Afshin Ashofteh; Jorge M. Bravo

Journal Article, Open Access

Statistical Journal of the IAOS (2020) 36(2) 291-301

DOI: 10.3233/SJI-200674

Abstract

Policy makers depend on complex epidemiological models that must be robust, realistic, defensible and consistent with all relevant available data disclosed by official authorities, which are deemed to meet the highest quality standards. This paper analyses and compares the quality of official datasets available for COVID-19. We used comparative statistical analysis to evaluate the accuracy of data collection by one national organisation (Chinese Center for Disease Control and Prevention) and two international organisations (World Health Organization; European Centre for Disease Prevention and Control), based on the value of systematic measurement errors. We combined Excel files, text mining techniques and manual data entries to extract the COVID-19 data from official reports and to generate an accurate profile for comparisons. The findings show noticeable and increasing measurement errors in the three datasets as the pandemic expanded and more countries contributed data to the official repositories, raising data comparability concerns and pointing to the need for better coordination and harmonized statistical methods. The study offers a combined COVID-19 dataset and dashboard with minimal systematic measurement errors, and valuable insights into the potential problems of using databanks without carefully examining the metadata and additional documentation that describe the overall context of the data.

Cite

Ashofteh, A., & Bravo, J. M. (2020). A study on the quality of novel coronavirus (COVID-19) official datasets. Statistical Journal of the IAOS, 36(2), 291–301. https://doi.org/10.3233/SJI-200674
