Abstract
Sustainable agriculture relies on timely and efficient detection of leaf diseases to reduce the risk of crop contamination and the dependence on chemical treatments. Convolutional Neural Networks (CNNs) have contributed significantly to improving image-based disease detection. However, traditional CNNs often struggle with complex patterns, require large datasets, and incur high computational and memory costs. Vision Transformers (ViTs), in turn, have been established as a powerful tool because of their ability to capture long-range dependencies and complex patterns. However, they do not capture the local and multiscale features of images, which is one of the primary requirements of image classification. To address these issues, this work proposes a novel approach called ViTCon that combines the advantages of CNNs and ViTs for the classification of leaf disease. Evaluated on three publicly available datasets of corn, rice, and wheat, the ViTCon approach outperforms other approaches. For corn, wheat, and rice plants respectively, the proposed approach achieves accuracies of 99.19%, 99.46%, and 99.24% for binary classification and 99.20%, 99.46%, and 99.28% for crop-wise multiclass classification, with an overall average accuracy of 99.56% for multiclass classification. The strong performance of the ViTCon model indicates its potential for use in agricultural environments.
Verma, P. K., Gupta, N., Sharma, A. K., Rakesh, N., & Gulhane, M. (2025). ViTCon: a hybrid CNN-ViT model for improved plant leaf disease detection. Cogent Food and Agriculture, 11(1). https://doi.org/10.1080/23311932.2025.2562165