Crop health and productivity are a major concern for farmers worldwide. Paddy, or rice, is one of the most essential staple crops, feeding billions of people globally. However, paddy is susceptible to various diseases and pests that can significantly reduce yield. The primary goal of this project, which focuses on classifying paddy diseases, is to use deep learning models to categorize paddy leaf images accurately.
The dataset contains 10407 labeled paddy leaf images across ten classes (nine diseases and normal leaf), along with additional metadata for each image, such as the paddy variety and age. The test set contains 3469 paddy leaf images, randomly shuffled, for prediction.
The first bar plot shows that most samples in the dataset come from the rice variety ADT45, while the varieties Surya and RR have the least representation. The second plot shows the number of images in each of the nine disease categories, along with the category for normal leaves.
![image description](https://blogs.gwu.edu/aparna-shankar/files/2024/03/plot-e74bd8142fd1fd29-1024x931.png)
The following image shows a sample from each of the ten categories: the nine diseases (hispa, tungro, bacterial leaf blight, downy mildew, blast, bacterial leaf streak, brown spot, dead heart, and bacterial panicle blight) and normal paddy.
![image preview](https://blogs.gwu.edu/aparna-shankar/files/2024/03/image-f2852a04c6d9505e-1024x370.png)
Next, a preprocessing function normalizes the pixel values, which helps stabilize training and improve convergence. The function casts the image data to float and divides each pixel value by 255, scaling the values to the range [0, 1] and making optimization more efficient.
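The scaling step can be sketched with NumPy as below. This is a minimal illustration, not the project's actual function: the name `preprocess` and the tiny sample image are hypothetical, and `float32` is an assumed choice of float type.

```python
import numpy as np


def preprocess(image: np.ndarray) -> np.ndarray:
    """Cast an 8-bit image to float32 and scale its pixels into [0, 1]."""
    return image.astype(np.float32) / 255.0


# Hypothetical 2x2 RGB image with 8-bit pixel values, for illustration only.
sample = np.array(
    [[[0, 128, 255], [64, 64, 64]],
     [[255, 255, 255], [0, 0, 0]]],
    dtype=np.uint8,
)
scaled = preprocess(sample)
```

The shape of the image is unchanged; only the data type and value range differ after the call.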
The model uses the DenseNet121 architecture as its base, with weights pre-trained on the ImageNet dataset. DenseNet121 was chosen because it is widely used for computer vision tasks, including image classification, object detection, and segmentation. Leveraging transfer learning, the model uses DenseNet121 for feature extraction and adds Dense layers on top for fine-tuning and classification.
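A transfer-learning model of this kind might look like the sketch below, assuming TensorFlow/Keras (the post does not name the framework). The function name `build_model`, the 224x224 input size, the frozen base, the 256-unit hidden layer, and the optimizer and loss are all assumptions made for illustration; only the DenseNet121 base, the ImageNet weights, and the ten output classes come from the text.

```python
import tensorflow as tf


def build_model(num_classes=10, input_shape=(224, 224, 3), weights="imagenet"):
    """DenseNet121 feature extractor with a small Dense classification head."""
    # Load DenseNet121 without its ImageNet classifier; global average pooling
    # reduces the final feature maps to one feature vector per image.
    base = tf.keras.applications.DenseNet121(
        include_top=False,
        weights=weights,
        input_shape=input_shape,
        pooling="avg",
    )
    base.trainable = False  # assumption: the pre-trained base is kept frozen

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # hypothetical layer size
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",  # assumed optimizer
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random initialization, which is useful for a quick shape check.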
To reduce overfitting and improve generalization, early stopping is used as a callback during training to halt the process when a monitored metric stops improving. Here the validation loss is monitored, and the weights that produced the best performance on the validation set are restored. The compiled model is then fit with 80% of the images (8326) for training and the remaining 20% (2081) for validation.
The training and validation accuracies are plotted against epochs below.
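The logic behind early stopping can be sketched in plain Python, independent of any framework. The function `early_stopping`, the `patience` value, and the loss sequence are hypothetical; the post does not state the patience actually used.

```python
def early_stopping(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs. `best_epoch` is the
    epoch whose weights would be restored.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs before patience was exhausted.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses, one per epoch (0-indexed).
losses = [0.90, 0.60, 0.45, 0.47, 0.46, 0.48, 0.30]
stop_epoch, best_epoch = early_stopping(losses, patience=3)
```

In this made-up run, training halts at epoch 5 and the weights from epoch 2 are restored, so the later improvement at epoch 6 is never reached. That is the trade-off a small patience value makes.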
![accuracy plots](https://blogs.gwu.edu/aparna-shankar/files/2024/03/image-ccfc51bffde9813b.png)
This plot shows that, as the model trains over successive epochs, the training and validation accuracies converge. Notably, the model achieves a validation accuracy of 93% with a loss of 0.2462.
Lastly, the trained model is used to predict labels for the paddy test images. To illustrate the model's outputs, the image below shows the first 10 test images with their predicted disease categories.
![prediction labels](https://blogs.gwu.edu/aparna-shankar/files/2024/03/image-d9e4c88c0307f9fd-1024x418.png)
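Turning the model's output probabilities into disease names can be sketched as follows. The helper `to_labels` is hypothetical, and the alphabetical ordering of `CLASS_NAMES` is an assumption; the real index-to-class mapping depends on how the labels were encoded during training.

```python
import numpy as np

# The ten categories from the dataset; the alphabetical order is an assumption.
CLASS_NAMES = [
    "bacterial_leaf_blight", "bacterial_leaf_streak", "bacterial_panicle_blight",
    "blast", "brown_spot", "dead_heart", "downy_mildew", "hispa", "normal",
    "tungro",
]


def to_labels(probs: np.ndarray, class_names=CLASS_NAMES) -> list:
    """Map each row of class probabilities to its most likely class name."""
    indices = np.argmax(probs, axis=1)
    return [class_names[i] for i in indices]


# Two made-up probability rows that put all mass on classes 3 and 8.
fake_probs = np.eye(len(CLASS_NAMES))[[3, 8]]
predicted = to_labels(fake_probs)
```

In practice `probs` would be the array returned by the model's prediction call on the preprocessed test images.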
Conclusion
This project aims to develop a deep learning-based solution for the automatic classification of diseases in paddy plants using computer vision techniques. Leveraging transfer learning with the DenseNet121 architecture pretrained on the ImageNet dataset, the model is trained to accurately identify various disease categories affecting paddy crops. By automating disease detection in paddy, this project seeks to empower farmers with timely insights to mitigate crop losses and ensure food security.