FarmTalk: Leveraging LLMs for Real-Time Plant Disease Detection and Management addresses the critical challenges of plant disease detection in agriculture. Traditional methods are slow, error-prone, and lack scalability, threatening food security and crop yields. FarmTalk combines YOLOv8 for precise disease detection with a Retrieval-Augmented Generation (RAG) framework that uses NVIDIAEmbeddings and FAISS for semantic search. Integrated with LLMs such as GPT-3.5 Turbo, it delivers real-time, actionable remediation recommendations.
Agriculture serves as the cornerstone of global food security, directly sustaining billions of lives and contributing significantly to the global economy. However, plant diseases remain one of the most persistent and devastating threats to agricultural productivity. These diseases reduce crop yields, degrade produce quality, and result in significant economic losses: research estimates that plant diseases account for 20-30% of global crop yield losses.

Timely detection and management of plant diseases are crucial for minimizing their impact and ensuring sustainable farming practices. Early identification allows farmers to take precise actions to contain infections, reduce crop losses, and maintain yield quality. Traditional methods of plant disease detection, however, are labor-intensive, slow, and prone to errors. They typically rely on manual inspection by agricultural experts, which becomes impractical in large-scale farming operations and inaccessible for farmers in remote or resource-limited areas.
FarmTalk: Leveraging LLMs for Real-Time Plant Disease Detection and Management bridges these technological advancements into a unified platform. Combining YOLOv8’s disease detection capabilities with the contextual power of LLMs like GPT-3.5 Turbo, FarmTalk uses a Retrieval-Augmented Generation (RAG) framework for delivering accurate and actionable remediation recommendations. With a user-friendly interface that supports image uploads and live camera feeds, FarmTalk empowers farmers, researchers, and policymakers to manage plant diseases effectively and sustainably.
The following outlines the overall approach and key components of the FarmTalk project:
The dataset used to fine-tune the YOLOv8n (nano) model was the New Plant Disease Dataset from Kaggle, which itself was derived from Mohanty et al., 2016. The dataset contains 87K RGB images of healthy and diseased crop leaves categorized into 38 classes. For this project, the model was trained on 25 of these classes.
Since the dataset is not in the YOLO-specific structure, we needed to preprocess the Kaggle dataset into a YOLO-compatible format. This included creating a respective annotation text file for each image containing the ground truth label and the bounding box. Due to the nature of the images in the Kaggle dataset, we defined the bounding box as the entire image, as it is essentially already segmented. The associated code can be modified to include or exclude specific classes from the dataset.
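The project's preprocessing script is not reproduced here, but the step can be sketched as follows (the helper name and directory layout are illustrative). Because every image is a single, effectively pre-segmented leaf, each annotation file holds one full-image box in YOLO's normalized `class x_center y_center width height` format:

```python
import os

def write_yolo_labels(image_dir: str, label_dir: str, class_id: int) -> None:
    """For each image, write a YOLO annotation file whose single bounding
    box covers the entire image (coordinates are normalized to [0, 1])."""
    os.makedirs(label_dir, exist_ok=True)
    for name in os.listdir(image_dir):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in {".jpg", ".jpeg", ".png"}:
            continue  # skip non-image files
        with open(os.path.join(label_dir, stem + ".txt"), "w") as f:
            # Full-image box: center (0.5, 0.5), width 1.0, height 1.0
            f.write(f"{class_id} 0.5 0.5 1.0 1.0\n")
```

Selecting which of the 38 classes to include then reduces to filtering the per-class image directories before this step runs.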
After preprocessing, we trained the YOLOv8n model using the hyperparameters mentioned in Table 1.
Hyperparameter | Value
---|---
No. of Epochs | 10
Image Size | 256
Batch Size | 64
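These hyperparameters map directly onto a YOLOv8 training call. The sketch below is illustrative only: `plant_disease.yaml` is a hypothetical dataset config path, and the actual `ultralytics` call is left commented out because it requires the installed package and the prepared dataset:

```python
# How the Table 1 hyperparameters feed a YOLOv8n training run.
train_args = {
    "data": "plant_disease.yaml",  # hypothetical dataset config path
    "epochs": 10,   # Table 1: No. of Epochs
    "imgsz": 256,   # Table 1: Image Size
    "batch": 64,    # Table 1: Batch Size
}

# Requires the ultralytics package and the preprocessed dataset:
# from ultralytics import YOLO
# model = YOLO("yolov8n.pt")          # nano checkpoint, as in this report
# results = model.train(**train_args)
print(train_args)
```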
The RAG framework generates remediation strategies grounded in retrieved context: NVIDIAEmbeddings and FAISS provide the vector representations and semantic search over research documents, and the retrieved passages are passed to GPT-3.5 Turbo to produce the final recommendations.
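The retrieval step can be illustrated with a deliberately simplified stand-in: a toy bag-of-words "embedding" replaces NVIDIAEmbeddings and brute-force cosine similarity replaces the FAISS index, but the flow (embed the query, rank documents, return top-k context) is the same:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding', standing in for NVIDIAEmbeddings."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Brute-force nearest-neighbour search, standing in for FAISS."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Early blight in tomato is managed with copper-based fungicides.",
    "Apple scab overwinters in fallen leaves; rake and destroy them.",
]
context = retrieve("how to treat early blight on tomato plants", docs)
print(context[0])  # the tomato document ranks first
```

In the real pipeline the ranked passages become the context handed to GPT-3.5 Turbo.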
Optimized prompts were crafted to align user queries with the semantic retrieval pipeline, ensuring accurate and actionable LLM outputs.
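The exact prompts are not reproduced in this report; the hypothetical template below shows the general shape, constraining the model to the retrieved context and giving it an explicit "I don't know" fallback:

```python
# Hypothetical prompt template; the project's actual prompts may differ.
PROMPT_TEMPLATE = """You are an agricultural assistant. Answer the question
using ONLY the context below. If the context does not contain the answer,
reply exactly: I don't know.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context: str, question: str) -> str:
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt(
    "Crop rotation disrupts Alternaria pathogen cycles in soil.",
    "How does crop rotation help with early blight?",
)
```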
A Streamlit-based user interface was designed to support image uploads, live camera feeds, and natural language queries. A snapshot of the interface is shown in Figure 1.
The precision, recall, and mAP50 after 10 epochs were 0.993, 0.994, and 0.994, respectively, demonstrating the model's ability to accurately detect plant diseases in real time. The following graphs and confusion matrix illustrate the model's performance.
Sample outputs from the trained model are shown below:
The system excelled in relevance and fluency, showcasing its ability to generate contextually appropriate and grammatically correct responses. It also maintained high factual accuracy, critical for practical applications in agriculture.
Query: "What are the benefits of crop rotation for reducing early blight in potatoes?"
Model Response: "Crop rotation helps reduce early blight by disrupting pathogen cycles in the soil. Rotating potatoes with non-host crops decreases Alternaria spp., the causative agent."
We utilized a diverse set of queries relevant to plant diseases, along with irrelevant or out-of-context queries, to test the robustness of the system. A total of 50 queries spanning crops like tomato, corn, apple, grape, and potato were evaluated. Responses were rated on a scale from 1 (poor) to 5 (excellent) for each of the metrics. When faced with irrelevant queries, such as "What is the weather in Minneapolis?" or "Who is the president of India?", the model appropriately responded with "I don’t know," highlighting its ability to avoid hallucination.
Responses were scored on four metrics: relevance, fluency, factuality, and overall satisfaction. The average scores achieved during the evaluation were:
Metric | Score
---|---
Relevance | 4.50
Fluency | 4.52
Factuality | 4.44
Satisfaction | 4.52
Poor Lighting Conditions: YOLOv8 struggles to detect diseases in images with inadequate lighting or high contrast. For instance, plant leaf discoloration under low light can be mistaken for disease symptoms, leading to false positives or missed detections.
Potential Solution: Employ image preprocessing techniques such as histogram equalization to normalize lighting conditions or augment the dataset with diverse lighting scenarios.
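Histogram equalization is typically applied with an image library such as OpenCV (`cv2.equalizeHist`); the pure-Python sketch below shows the underlying normalization on a flat list of grayscale values, for illustration only:

```python
def equalize_histogram(pixels, levels=256):
    """Histogram equalization for a flat list of grayscale values:
    map each value through the normalized cumulative distribution so
    the intensities spread over the full range."""
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution function
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        return pixels[:]  # flat image: nothing to spread
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))
            for p in pixels]
```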
Overlapping Leaves in Dense Foliage: YOLOv8 can misclassify or fail to detect diseases when multiple leaves overlap, obscuring visible disease markers.
Potential Solution: Enhance training data with more examples of overlapping foliage and incorporate higher-resolution images to improve object separation.
Unsatisfactory Responses to Ambiguous Queries: While the framework performs well for direct questions, ambiguous or poorly phrased queries can lead to unsatisfactory responses. For example:
Query: "Are there any resistant potato varieties?"
Model Response: "I don’t know."
This response reflects the model’s inability to retrieve relevant information, even when it exists in the knowledge base.
Potential Solution: Implement query rephrasing mechanisms or use chain-of-thought prompting to guide the model in clarifying ambiguous inputs.
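One possible rephrasing mechanism (entirely illustrative) is to fold the most recent YOLO detection label, in the dataset's `Crop___Disease` naming style, into an ambiguous query before retrieval, so the retriever has concrete crop and disease terms to match:

```python
def rephrase_query(query, detected_label=None):
    """Hypothetical rephrasing step: append the detection label as explicit
    context so semantic retrieval can anchor on crop/disease terms."""
    if not detected_label:
        return query
    crop, _, disease = detected_label.partition("___")
    crop = crop.replace("_", " ").lower()
    disease = disease.replace("_", " ").lower()
    return f"{query} (context: {disease} detected on {crop})"

print(rephrase_query("Are there any resistant potato varieties?",
                     "Potato___Early_blight"))
```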
Limitations in Knowledge Base Grounding: The RAG framework sometimes retrieves information that partially answers a query but lacks specificity or depth. This is particularly evident in queries requiring precise scientific details or local context.
Potential Solution: Expand and regularly update the knowledge base with verified, domain-specific literature. Additionally, enhance the embedding model’s semantic understanding with domain-tuned embeddings.
FarmTalk successfully demonstrates the potential of integrating vision models and LLMs for precision agriculture. YOLOv8 provided reliable plant disease detection, while the RAG framework enhanced context-aware remediation. The platform’s user-friendly interface bridges the gap between AI capabilities and real-world agricultural needs, empowering stakeholders to make informed decisions. This project serves as a proof-of-concept for the broader application of AI in agriculture.
The project timeline is as follows:
Week 1: Data Collection & Setup
Gather datasets, particularly crop disease PDFs.
Preprocess data and set up the environment with YOLOv8, GPT-3.5, and RAG.
Clean and organize data for model compatibility.
Week 2: Initial Model Integration
Train YOLOv8 on crop disease data for detection accuracy.
Set up RAG framework with GPT-3.5 for remediation responses.
Test basic integration between YOLOv8 and RAG, ensuring accurate label-response flow.
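The label-response flow above can be sketched as a small mapping from a YOLOv8 class name (in the dataset's `Crop___Disease` style) to the remediation question sent to the RAG pipeline; the helper name is hypothetical:

```python
def label_to_question(label: str) -> str:
    """Turn a detection class name like 'Tomato___Early_blight' into the
    remediation question forwarded to the RAG pipeline."""
    crop, _, disease = label.partition("___")
    crop = crop.replace("_", " ").lower()
    disease = disease.replace("_", " ").lower()
    if disease == "healthy":
        return f"How do I keep {crop} plants healthy?"
    return f"How do I treat {disease} in {crop}?"

print(label_to_question("Tomato___Early_blight"))
```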
Week 3: Full System Debug & Real-Time Demo Integration
Integrate camera for real-time image input.
Run tests and troubleshoot for smooth, real-time operation.
Week 4: Finalize System & Prepare for Presentation