Purpose: To determine whether natural language processing (NLP) algorithm assessment of thoracic CT imaging reports correlates with the incidence of official COVID-19 cases in the United States. Materials and Methods: Using de-identified, HIPAA-compliant patient data from a common imaging platform interconnected with over 2100 facilities covering all 50 states, three NLP algorithms were developed to track positive CT imaging features of respiratory illness typical of SARS-CoV-2 viral infection. Findings were compared against the number of official COVID-19 cases on a daily, weekly, and state-wide basis. Results: The NLP algorithms were applied to 450,114 comprehensive chest CT reports gathered from January 1 to October 3, 2020. The best-performing NLP model exhibited strong correlation with daily official COVID-19 cases (r² = 0.82, P < .005). The NLP models demonstrated an early rise in cases that was followed by the increase in official cases, suggesting the possibility of an early predictive marker, with strong correlation to official cases on a weekly basis (r² = 0.91, P < .005). There was also substantial correlation between the NLP findings and official COVID-19 incidence by state (r² = 0.92, P < .005). Conclusion: With the use of big data, a machine learning–based NLP algorithm was developed that can track imaging findings of respiratory illness detected on chest CT imaging reports, with strong correlation with the progression of the COVID-19 pandemic in the United States.
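As an illustration of the kind of comparison reported above, the following is a minimal sketch of computing a squared Pearson correlation (r²) between a series of NLP-flagged report counts and a series of official case counts. It is not the authors' method or code: the function names `pearson_r` and `r_squared` are hypothetical, and the numbers in the usage example are synthetic placeholders, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Squared Pearson correlation between the two series."""
    return pearson_r(xs, ys) ** 2


if __name__ == "__main__":
    # Synthetic placeholder values for illustration only (not study data).
    nlp_positive_reports = [12, 15, 21, 30, 44, 58, 63]
    official_cases = [100, 130, 190, 260, 400, 520, 610]
    print(round(r_squared(nlp_positive_reports, official_cases), 3))
```

The same function could be applied to daily, weekly, or per-state aggregates, provided both series are aligned to the same time bins or the same set of states.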
CITATION STYLE
Cury, R. C., Megyeri, I., Lindsey, T., Macedo, R., Batlle, J., Kim, S., … Clark, R. H. (2021). Natural language processing and machine learning for detection of respiratory illness by chest CT imaging and tracking of COVID-19 pandemic in the United States. Radiology: Cardiothoracic Imaging, 3(1). https://doi.org/10.1148/ryct.2021200596