LIDP: A Lung Image Dataset with Pathological Information for Lung Cancer Screening

Abstract

Lung cancer is one of the most lethal cancers worldwide. Computed Tomography (CT) makes it possible to diagnose lung cancer at an early stage, which can significantly reduce its mortality. In recent years, deep neural networks (DNNs) have been widely used to improve the accuracy of classifying pulmonary nodules as benign or malignant. A limitation of the DNN approach, however, is that a model's performance and generalization depend heavily on the size and quality of the training data. To the best of our knowledge, almost all existing public lung nodule datasets, e.g., LIDC-IDRI, obtain the crucial benign and malignant labels by radiographic analysis rather than pathological examination. In this paper, we argue that, because these labels lack pathological confirmation and are therefore of uncertain authenticity, machine-learning (ML) models based on LIDC-IDRI generalize poorly. To test this hypothesis, we introduce a new lung CT image dataset with pathological information (LIDP) for lung cancer screening. LIDP contains 990 samples, including 783 malignant samples and 207 benign samples. More critically, the labels of all samples have been confirmed by pathological biopsy. We evaluate various existing LIDC-based state-of-the-art (SOTA) models on LIDP. Our experimental results show that existing SOTA models trained on the LIDC-IDRI dataset generalize extremely poorly. Our scientific conclusion is striking: the distributions of these datasets are significantly different. We claim that the LIDP dataset is a very valuable addition to existing datasets such as LIDC-IDRI. LIDP can be used for independent testing or for training new ML models for early detection of lung cancer.
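The class counts quoted in the abstract imply a strongly imbalanced dataset, which matters when reading any accuracy figure reported on LIDP. The short Python sketch below works through that arithmetic. Only the two counts (783 malignant, 207 benign) come from the abstract; the variable names and the "always predict the majority class" baseline are illustrative assumptions, not part of the authors' evaluation protocol.

```python
# Class counts as stated in the abstract.
counts = {"malignant": 783, "benign": 207}
total = sum(counts.values())

# Share of each class in the dataset.
shares = {label: n / total for label, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the majority class.
# This baseline is an illustrative assumption, not a result from the paper.
majority_label = max(counts, key=counts.get)
majority_baseline = counts[majority_label] / total

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"always-'{majority_label}' baseline accuracy: {majority_baseline:.1%}")
```

Under these counts, roughly four in five samples are malignant, so a model evaluated on LIDP would need to be judged against that baseline, or with class-aware metrics, rather than by raw accuracy alone.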

Cite

Shao, Y., Wang, M., Mai, J., Fu, X., Li, M., Zheng, J., … Ji, H. (2022). LIDP: A Lung Image Dataset with Pathological Information for Lung Cancer Screening. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13433 LNCS, pp. 770–779). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-16437-8_74
