Performance of a deep learning algorithm compared with radiologic interpretation for lung cancer detection on chest radiographs in a health screening population

Abstract

Background: The performance of a deep learning algorithm for lung cancer detection on chest radiographs in a health screening population is unknown.
Purpose: To validate a commercially available deep learning algorithm for lung cancer detection on chest radiographs in a health screening population.
Materials and Methods: Out-of-sample testing of a deep learning algorithm was retrospectively performed using chest radiographs from individuals undergoing a comprehensive medical check-up between July 2008 and December 2008 (validation test). To evaluate the algorithm performance for visible lung cancer detection, the area under the receiver operating characteristic curve (AUC) and diagnostic measures, including sensitivity and false-positive rate (FPR), were calculated. The algorithm performance was compared with that of radiologists using the McNemar test and the Moskowitz method. Additionally, the deep learning algorithm was applied to a screening cohort undergoing chest radiography between January 2008 and December 2012, and its performance was calculated.
Results: In a validation test comprising 10 285 radiographs from 10 202 individuals (mean age, 54 years ± 11 [standard deviation]; 5857 men) with 10 radiographs of visible lung cancers, the algorithm's AUC was 0.99 (95% confidence interval: 0.97, 1), and it showed comparable sensitivity (90% [nine of 10 radiographs]) to that of the radiologists (60% [six of 10 radiographs]; P = .25) with a higher FPR (3.1% [319 of 10 275 radiographs] vs 0.3% [26 of 10 275 radiographs]; P < .001). In the screening cohort of 100 525 chest radiographs from 50 070 individuals (mean age, 53 years ± 11; 28 090 men) with 47 radiographs of visible lung cancers, the algorithm's AUC was 0.97 (95% confidence interval: 0.95, 0.99), and its sensitivity and FPR were 83% (39 of 47 radiographs) and 3% (2999 of 100 478 radiographs), respectively.
Conclusion: A deep learning algorithm detected lung cancers on chest radiographs with a performance comparable to that of radiologists, which will be helpful for radiologists in healthy populations with a low prevalence of lung cancer.
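
The diagnostic measures quoted in the Results can be recomputed from the reported counts. The following is a minimal sketch, not the authors' analysis code: the function and variable names are illustrative, and the paired discordant counts passed to the McNemar function (three cancers found only by the algorithm, none found only by the radiologists) are an assumption, since the abstract does not report the paired table. That assumption happens to be consistent with the reported sensitivities and with P = .25, but it is not confirmed by the source.

```python
from math import comb

# Counts copied from the abstract (radiograph level).
validation = {
    "algorithm": {"tp": 9, "cancers": 10, "fp": 319, "non_cancers": 10_275},
    "radiologists": {"tp": 6, "cancers": 10, "fp": 26, "non_cancers": 10_275},
}
screening = {"tp": 39, "cancers": 47, "fp": 2_999, "non_cancers": 100_478}


def sensitivity(counts: dict) -> float:
    """Fraction of cancer radiographs that were flagged."""
    return counts["tp"] / counts["cancers"]


def false_positive_rate(counts: dict) -> float:
    """Fraction of non-cancer radiographs that were flagged."""
    return counts["fp"] / counts["non_cancers"]


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the two discordant counts.

    b: cases positive for reader A only; c: cases positive for reader B only.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


for reader, counts in validation.items():
    print(
        f"validation, {reader}: "
        f"sensitivity {sensitivity(counts):.0%}, "
        f"FPR {false_positive_rate(counts):.1%}"
    )

print(
    f"screening, algorithm: "
    f"sensitivity {sensitivity(screening):.0%}, "
    f"FPR {false_positive_rate(screening):.1%}"
)

# ASSUMPTION: discordant counts are hypothetical (not given in the abstract).
p_sensitivity = mcnemar_exact_p(b=3, c=0)
print(f"exact McNemar P for sensitivity (assumed b=3, c=0): {p_sensitivity:.2f}")
```

With only 10 visible cancers in the validation test, a difference of three detections cannot reach statistical significance under an exact paired test, which is why the 90% and 60% sensitivities are described as comparable.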

Cite

Lee, J. H., Sun, H. Y., Park, S., Kim, H., Hwang, E. J., Goo, J. M., & Park, C. M. (2020). Performance of a deep learning algorithm compared with radiologic interpretation for lung cancer detection on chest radiographs in a health screening population. Radiology, 297(3), 687–696. https://doi.org/10.1148/radiol.2020201240
