A large number of bone stick cultural relics have been unearthed at the Weiyang Palace site of Han-period Chang'an City in Xi'an, Shaanxi Province, China. Deep learning-based image classification methods can improve the efficiency of categorizing bone stick fracture locations and colors. However, factors such as variations in surface texture features and differing degrees of wear on the bone sticks lower the model's classification accuracy. To address this issue, this paper proposes a bone stick image classification method based on a YOLOv5s-ViT cascade model. The method incorporates the C3CA attention module to enhance the model's recognition of fracture areas, reduce interference from the image background, and improve the effectiveness of the Vision Transformer's self-attention mechanism. Furthermore, the learning rate is increased, based on a comparison of training results across experiments, to improve the model's training efficiency. Lastly, a Batch Normalization layer is introduced to normalize the encoder output, suppress divergence during training, and enhance generalization ability. Experimental results show that the proposed method achieves an average recognition accuracy of 97.6% and an average recall of 93.3% on bone stick fracture region features, which are 2.1% and 6.0% higher than the YOLOv5s model, respectively, and an average classification accuracy of 88.7%, which is 7.9% higher than the Vision Transformer model, effectively improving the classification accuracy of bone stick images.
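The sketch below illustrates, under stated assumptions, the detect-then-classify cascade described in the abstract: a YOLOv5s-style detector localizes the fracture region, the crop is passed to a Vision Transformer classifier, and a Batch Normalization layer normalizes the encoder output before the classification head. This is not the authors' implementation; the detector interface, class count, and exact BatchNorm placement are illustrative assumptions, and the C3CA attention module inside the detector backbone is not reproduced here.

```python
# Minimal sketch (assumptions, not the paper's code) of a YOLOv5s -> ViT cascade
# with a BatchNorm layer applied to the ViT encoder output.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16
from torchvision.transforms.functional import resized_crop


class ViTWithBNHead(nn.Module):
    """ViT classifier whose encoder output is batch-normalized before the head."""

    def __init__(self, num_classes: int):
        super().__init__()
        vit = vit_b_16(weights=None)      # pretrained weights could be loaded here
        vit.heads = nn.Identity()         # keep only the 768-d encoder (CLS) output
        self.encoder = vit
        self.bn = nn.BatchNorm1d(768)     # normalize the encoder output (assumption)
        self.head = nn.Linear(768, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.encoder(x)           # (B, 768) CLS features
        return self.head(self.bn(feats))


def classify_fracture_crop(image: torch.Tensor,
                           box: tuple[int, int, int, int],
                           classifier: ViTWithBNHead) -> int:
    """Crop a detected fracture region and return the predicted class index.

    `box` is (top, left, height, width); in a real pipeline this would come from
    the YOLOv5s detection stage rather than being supplied by hand.
    """
    crop = resized_crop(image, *box, size=[224, 224])   # ViT expects 224x224 input
    with torch.no_grad():
        logits = classifier(crop.unsqueeze(0))
    return int(logits.argmax(dim=1).item())


if __name__ == "__main__":
    model = ViTWithBNHead(num_classes=4).eval()          # hypothetical class count
    dummy = torch.rand(3, 640, 640)                       # stand-in bone stick image
    print(classify_fracture_crop(dummy, (100, 120, 300, 200), model))
```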
Liang, H., Wang, H., Mao, L., Liu, R., Wang, Z., & Wang, K. (2023). Bone Stick Image Classification Study Based on C3CA Attention Mechanism Enhanced Deep Cascade Network. IEEE Access, 11, 94057–94068. https://doi.org/10.1109/ACCESS.2023.3310472