CT image classification of liver tumors based on multi-scale and deep feature extraction

Abstract

Objective: Liver tumors are among the most aggressive malignancies in the human body. Determining the lesion type and stage from computed tomography (CT) images guides the diagnosis and treatment strategy, but classifying them requires professional knowledge and rich expert experience. Fatigue sets in easily under a heavy workload, and even experienced senior experts find misdiagnosis difficult to avoid. Deep learning avoids a drawback of traditional machine learning, which spends considerable time manually extracting image features and performing dimensionality reduction, and it can extract high-dimensional image features, so using deep learning to assist doctors in diagnosis is important. Existing medical image classification still faces the challenges of low tumor classification accuracy, weak feature extraction capability, and coarse datasets. To address these problems, this study presents a classification network with multi-scale and deep feature extraction.

Method: First, we extract the region of interest (ROI) according to the liver tumor contours labeled by experienced radiologists, along with ROIs of healthy livers. Each ROI captures the features of the lesion area and the surrounding tissue, so the size of the extracted ROI varies with the size of the lesion. Then the pixel values are converted and data augmentation is performed. CT intensities are expressed in Hounsfield units (HU): CT values range over (-1024, 3071), whereas raw digital imaging and communications in medicine (DICOM) pixel values range over (0, 4096). The DICOM pixel values therefore have to be converted to CT values: we read rescale_intercept and rescale_slope from the DICOM header file and apply the linear transform CT = pixel × rescale_slope + rescale_intercept.
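As a concrete illustration, the standard DICOM rescale and the subsequent clipping to the liver window of [-100, 400] HU can be sketched as below; the function names and the normalization to [0, 1] are illustrative choices, not taken from the paper:

```python
import numpy as np

def dicom_to_hu(pixel_array, rescale_slope, rescale_intercept):
    """Convert raw DICOM pixel values to Hounsfield units (HU)
    using the linear rescale stored in the DICOM header:
        HU = pixel * RescaleSlope + RescaleIntercept
    """
    return pixel_array.astype(np.float32) * rescale_slope + rescale_intercept

def apply_liver_window(hu, lo=-100.0, hi=400.0):
    """Clip HU values to the liver window [-100, 400] and scale the
    result to [0, 1], suppressing unrelated organs and background."""
    hu = np.clip(hu, lo, hi)
    return (hu - lo) / (hi - lo)
```

With the common CT header values rescale_slope = 1 and rescale_intercept = -1024, raw pixels in (0, 4096) map onto CT values in (-1024, 3071), matching the ranges quoted above.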
Thereafter, we limit the CT values of the liver datasets to [-100, 400] HU to avoid the influence of background noise from unrelated organs or tissues. We apply several data augmentation methods, such as flipping, rotation, and other transformations, to expand the diversity of the datasets. These images are then fed into MD_SENet for classification. MD_SENet is an SE_ResNet-like convolutional neural network that achieves end-to-end classification: SE_ResNet automatically learns the importance of each channel to strengthen useful features and suppress useless ones, and MD_SENet is much deeper than SE_ResNet. Our contributions are the following: 1) Hierarchical residual-like connections improve multi-scale expression and increase the receptive field of each network layer. The feature maps produced by the 1×1 convolution layers are divided into four groups, and each group passes through 3×3 residual-like convolution groups, which improves the network's multi-scale feature extraction and enhances the acquisition of lesion-area features. 2) Channel attention and spatial attention further focus on the effective information in medical images. The feature maps first pass through the channel attention module; its output is multiplied with its input and sent through the spatial attention module, whose output is again multiplied with its input. This directs attention to the features of the lesion area and reduces the influence of background noise. 3) Atrous convolutions are connected in parallel, following atrous spatial pyramid pooling, and 1×1 convolution layers strengthen the features; the outputs are then concatenated and softmax is used for classification. This expands the receptive field without reducing feature-map resolution, which improves feature expression and effectively prevents information loss.
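The channel-then-spatial attention described in contribution 2 resembles the convolutional block attention module (CBAM) pattern. A minimal PyTorch sketch of that ordering, with each attention map multiplied into its input, could look like this; the module names and reduction ratio are illustrative assumptions, not the paper's exact layers:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Per-channel weights from global average- and max-pooled
    descriptors passed through a shared bottleneck MLP."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)          # (N, C, 1, 1)

class SpatialAttention(nn.Module):
    """Per-location weights from channel-wise average and max maps."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class ChannelSpatialAttention(nn.Module):
    """Channel attention first, then spatial attention, each applied
    multiplicatively to its input, as the abstract describes."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        x = x * self.ca(x)   # reweight channels
        x = x * self.sa(x)   # reweight spatial locations
        return x
```

Because both attention maps are sigmoid-valued, the block can only suppress or preserve responses, which matches its stated role of emphasizing the lesion area while damping background noise.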
4) Ordinary convolutions are replaced by octave convolutions to reduce the number of parameters and improve classification performance. We compared the results of DenseNet, ResNet, MnasNet, MobileNet, ShuffleNet, SK_ResNet, and SE_ResNet with those of our MD_SENet, all trained on the liver dataset. Owing to the limitation of graphics processing unit (GPU) memory, we used a batch size of 16 with the Adam optimizer and a learning rate of 0.002 for 150 epochs. All experiments ran in the PyTorch framework on Ubuntu 16.04 with an NVIDIA GeForce GTX 1060 Ti GPU to verify the effectiveness of the proposed method.

Result: For the liver dataset, the training set consists of 4096 images and the test set of 1021 images. The classification accuracy of the proposed method is 87.74%, which is 9.92% higher than the baseline (SE_ResNet101). Our model achieves the best results compared with state-of-the-art networks, with 86.04% recall, 87% precision, and an 86.42% F1-score across the evaluation indicators. Ablation experiments verify the effectiveness of the method.

Conclusion: This study proposes a method to classify liver tumors accurately. We integrated the method into professional medical software to provide a foundation that physicians can use in early diagnosis and treatment.
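The octave convolution mentioned in contribution 4 factorizes feature maps into a full-resolution high-frequency part and a half-resolution low-frequency part, with convolutions exchanging information between the two. A minimal PyTorch sketch, assuming an even high/low channel split (alpha = 0.5) rather than the paper's exact configuration, could look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctaveConv(nn.Module):
    """Octave convolution over a (high-frequency, low-frequency) pair.

    The low-frequency maps live at half spatial resolution, so the
    low-path convolutions operate on a quarter of the pixels, cutting
    computation and memory relative to an ordinary convolution.
    """
    def __init__(self, in_channels, out_channels, kernel_size=3, alpha=0.5):
        super().__init__()
        in_l = int(alpha * in_channels)
        in_h = in_channels - in_l
        out_l = int(alpha * out_channels)
        out_h = out_channels - out_l
        p = kernel_size // 2
        # four paths: high->high, high->low, low->high, low->low
        self.conv_hh = nn.Conv2d(in_h, out_h, kernel_size, padding=p, bias=False)
        self.conv_hl = nn.Conv2d(in_h, out_l, kernel_size, padding=p, bias=False)
        self.conv_lh = nn.Conv2d(in_l, out_h, kernel_size, padding=p, bias=False)
        self.conv_ll = nn.Conv2d(in_l, out_l, kernel_size, padding=p, bias=False)

    def forward(self, x_h, x_l):
        # high output: intra-frequency path + upsampled low->high path
        y_h = self.conv_hh(x_h) + F.interpolate(
            self.conv_lh(x_l), scale_factor=2, mode="nearest")
        # low output: intra-frequency path + downsampled high->low path
        y_l = self.conv_ll(x_l) + self.conv_hl(F.avg_pool2d(x_h, 2))
        return y_h, y_l
```

Dropping a block like this in place of an ordinary convolution keeps the total channel budget while processing the low-frequency half of the channels at a quarter of the spatial cost.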

Citation (APA)

Mao, J., Song, Y., & Liu, Z. (2021). CT image classification of liver tumors based on multi-scale and deep feature extraction. Journal of Image and Graphics, 26(7), 1704–1715. https://doi.org/10.11834/jig.200500
