COVIDSeer: Extending the CORD-19 Dataset

Abstract

We develop an enhanced version of the CORD-19 dataset released by the Allen Institute for AI. Tools from the SeerSuite project are used to extract information from the original articles that is not directly provided in CORD-19. We add 728 new abstracts, 70,102 figures, and 31,446 tables with captions that are not provided in the current data release. We also build a vertical search engine, COVIDSeer, on top of the new dataset. COVIDSeer has a relatively simple architecture with features such as keyword filtering and similar-paper recommendation. The goal is to provide a system and dataset that help scientists better navigate the literature concerning COVID-19. The enriched dataset can serve as a supplement to the existing dataset. The search engine, which offers keyphrase-enhanced search, is intended to help biomedical and life science researchers, medical students, and the general public explore coronavirus-related literature more effectively. The entire dataset and the system will be made open source.
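To make the two search features named in the abstract more concrete, here is a minimal sketch of keyphrase filtering and similar-paper recommendation over a toy in-memory collection. It is not the COVIDSeer implementation, and the abstract does not say how either feature is built. The record fields (`id`, `title`, `keyphrases`), the sample papers, the function names, and the choice of TF-IDF with cosine similarity are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy records; the real dataset schema is not described in the abstract.
PAPERS = [
    {"id": "p1", "title": "Coronavirus spike protein structure",
     "keyphrases": ["spike protein", "structure"]},
    {"id": "p2", "title": "Spike protein binding to ACE2 receptor",
     "keyphrases": ["spike protein", "ace2"]},
    {"id": "p3", "title": "Hospital ventilation and airborne transmission",
     "keyphrases": ["transmission", "ventilation"]},
]


def filter_by_keyphrase(papers, keyphrase):
    """Keep only papers tagged with the given keyphrase (case-insensitive)."""
    wanted = keyphrase.lower()
    return [p for p in papers if wanted in (k.lower() for k in p["keyphrases"])]


def tfidf_vectors(papers):
    """Build one sparse TF-IDF vector (dict of term -> weight) per paper title."""
    docs = [p["title"].lower().split() for p in papers]
    n = len(docs)
    # Document frequency: in how many titles each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF so that no weight is ever zero or undefined.
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
                        for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_papers(papers, paper_id, top_k=2):
    """Rank the other papers by title similarity to the paper with `paper_id`."""
    vectors = tfidf_vectors(papers)
    index = next(i for i, p in enumerate(papers) if p["id"] == paper_id)
    scored = [(p["id"], cosine(vectors[index], vectors[i]))
              for i, p in enumerate(papers) if i != index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    print([p["id"] for p in filter_by_keyphrase(PAPERS, "spike protein")])
    print(similar_papers(PAPERS, "p1"))
```

In this toy collection, filtering on "spike protein" keeps the two papers tagged with that keyphrase, and the paper that shares title terms with `p1` ranks above the one that shares none. A production system would more plausibly delegate both steps to a search index than compute them in memory.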

Cite

Rohatgi, S., Karishma, Z., Chhay, J., Keesara, S. R. R., Wu, J., Caragea, C., & Giles, C. L. (2020). COVIDSeer: Extending the CORD-19 Dataset. In Proceedings of the ACM Symposium on Document Engineering, DocEng 2020. Association for Computing Machinery, Inc. https://doi.org/10.1145/3395027.3419597
