NLP Class Project | Fall 2024 CSCI 5541

Abstract

FarmTalk: Leveraging LLMs for Real-Time Plant Disease Detection and Management addresses the critical challenges of plant disease detection in agriculture. Traditional methods are slow, error-prone, and hard to scale, threatening food security and crop yields. FarmTalk combines YOLOv8 for precise disease detection with a Retrieval-Augmented Generation (RAG) framework that uses NVIDIAEmbeddings and FAISS for semantic search. Integrated with LLMs like GPT-3.5 Turbo, it delivers real-time, actionable remediation recommendations.

Methodology

Introduction

Agriculture serves as the cornerstone of global food security, directly sustaining billions of lives and contributing significantly to the global economy. However, plant diseases remain one of the most persistent and devastating threats to agricultural productivity. These diseases reduce crop yields, degrade produce quality, and result in significant economic losses. According to research, plant diseases account for a staggering 20-30

Timely detection and management of plant diseases are crucial for minimizing their impact and ensuring sustainable farming practices. Early identification allows farmers to take precise actions to contain infections, reduce crop losses, and maintain yield quality. Traditional methods of plant disease detection, however, are labor-intensive, slow, and prone to errors. These methods typically rely on manual inspection by agricultural experts, which becomes impractical in large-scale farming operations and inaccessible for farmers in remote or resource-limited areas.

FarmTalk: Leveraging LLMs for Real-Time Plant Disease Detection and Management brings computer vision and large language models together in a unified platform. It combines YOLOv8’s disease detection capabilities with the contextual power of LLMs like GPT-3.5 Turbo, using a Retrieval-Augmented Generation (RAG) framework to deliver accurate and actionable remediation recommendations. With a user-friendly interface that supports image uploads and live camera feeds, FarmTalk empowers farmers, researchers, and policymakers to manage plant diseases effectively and sustainably.

Approach

The following outlines the overall approach and key components of the FarmTalk project:

YOLOv8 for Disease Detection: Identifies diseases in plant leaves with bounding box annotations.
RAG Framework: Combines FAISS for semantic search and NVIDIAEmbeddings for vector representation of research documents.
LLM Integration: Utilizes GPT-3.5 Turbo to generate remediation strategies based on retrieved context.

Detailed Methodology

YOLOv8

The YOLOv8n (nano) model was fine-tuned on the New Plant Disease Dataset from Kaggle, which is itself derived from Mohanty et al., 2016. This dataset contains 87K RGB images of healthy and diseased crop leaves categorized into 38 different classes. For this project, the model was trained on 25 of these classes.

Since the dataset is not in the YOLO-specific structure, we needed to preprocess the Kaggle dataset into a YOLO-compatible format. This included creating a respective annotation text file for each image containing the ground truth label and the bounding box. Due to the nature of the images in the Kaggle dataset, we defined the bounding box as the entire image, as it is essentially already segmented. The associated code can be modified to include or exclude specific classes from the dataset.
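
The annotation step described above can be sketched as follows. This is a minimal illustration, not the project's preprocessing script: the function name, the folder layout (one subfolder per class), and the accepted file extensions are assumptions. Each label file holds one line in the YOLO format (class id, then normalized x-center, y-center, width, and height), so a box covering the entire image is `0.5 0.5 1.0 1.0`.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the actual dataset files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_full_image_labels(images_root, labels_root, class_names):
    """Write one YOLO label file per image, with a box covering the whole image.

    Assumes a hypothetical layout images_root/<class_name>/<image file>.
    Class folders not listed in class_names are skipped, which is one way
    to include or exclude specific classes.
    """
    images_root, labels_root = Path(images_root), Path(labels_root)
    class_ids = {name: idx for idx, name in enumerate(class_names)}
    written = 0
    for class_dir in sorted(p for p in images_root.iterdir() if p.is_dir()):
        if class_dir.name not in class_ids:
            continue
        out_dir = labels_root / class_dir.name
        out_dir.mkdir(parents=True, exist_ok=True)
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            # class_id x_center y_center width height, all normalized to [0, 1].
            line = f"{class_ids[class_dir.name]} 0.5 0.5 1.0 1.0\n"
            (out_dir / f"{image_path.stem}.txt").write_text(line)
            written += 1
    return written
```

The function returns the number of label files written, which gives a quick sanity check against the expected image count for the selected classes.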

After preprocessing, we trained the YOLOv8n model using the hyperparameters mentioned in Table 1.

Table 1: YOLO Hyperparameters
Hyperparameter	Value
Number of Epochs	10
Image Size	256
Batch Size	64
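
For reference, a training call with the Table 1 values might look like the sketch below, assuming the Ultralytics Python package was used. The function name and the `data_yaml` path are hypothetical; `yolov8n.pt` is the pretrained nano checkpoint name used by Ultralytics.

```python
# Hyperparameters taken from Table 1.
TRAIN_ARGS = {"epochs": 10, "imgsz": 256, "batch": 64}


def train_yolov8n(data_yaml, weights="yolov8n.pt"):
    """Fine-tune YOLOv8n; data_yaml is a hypothetical path to the dataset config."""
    # Imported here so the sketch can be loaded without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_ARGS)
```

Calling `train_yolov8n("plant_disease.yaml")` would start training, where the YAML file (a hypothetical name) lists the image folders and class names.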

RAG Framework

The RAG framework generates remediation strategies grounded in retrieved context. Research documents are embedded as vectors with NVIDIAEmbeddings and indexed in FAISS for semantic search. At query time, the most relevant passages are retrieved and passed to GPT-3.5 Turbo, which generates the remediation strategy.
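
The retrieval step can be illustrated with a toy, self-contained sketch. It is not the project's pipeline: a bag-of-words vector stands in for NVIDIAEmbeddings, a brute-force cosine-similarity ranking stands in for the FAISS index, and the document chunks are hypothetical examples.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute-force search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical document chunks, for illustration only.
chunks = [
    "Early blight of potato is caused by Alternaria and survives on crop debris.",
    "Crop rotation with non-host crops reduces early blight pressure in potato fields.",
    "Grape black rot spreads in warm, humid weather.",
]
top = retrieve("How does crop rotation affect early blight in potato?", chunks, k=2)
```

In a full system, the retrieved chunks would be placed in the LLM prompt as context. A learned embedding model matters because it can match passages that share meaning with the query but few exact words, which this word-count stand-in cannot do.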

Prompt Engineering

Optimized prompts were crafted to align user queries with the semantic retrieval pipeline, ensuring accurate and actionable LLM outputs.
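
As an illustration of what such a prompt could look like, the sketch below assembles a grounded prompt from a question, retrieved passages, and an optional detected disease label. The wording, the function name, and the `detected_label` parameter are assumptions for illustration, not the prompts used in the project.

```python
def build_prompt(question, context_chunks, detected_label=None):
    """Assemble a grounded prompt; the wording is illustrative only."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    # Optionally pass along the label produced by the detector.
    label_line = f"Detected disease: {detected_label}\n" if detected_label else ""
    return (
        "Answer the question using only the context below.\n"
        'If the context does not contain the answer, reply "I don\'t know."\n\n'
        f"{label_line}"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "How should this disease be managed?",
    ["Rotating potatoes with non-host crops disrupts pathogen cycles in the soil."],
    detected_label="Potato early blight",
)
```

Instructing the model to fall back to "I don't know" when the context is insufficient is one common way to discourage answers that are not grounded in the retrieved documents.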

Interface Development

A Streamlit-based user interface was designed to support image uploads, live camera feeds, and natural language queries. A snapshot of the interface is shown in Figure 1.
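
A minimal Streamlit layout covering these three inputs might look like the sketch below. It is an assumption-laden outline rather than the project's interface: the labels and the two hook comments are hypothetical, and `st.camera_input` captures a single snapshot, so a continuous live feed would need a different component.

```python
def main():
    # Imported inside the function so the sketch loads without Streamlit installed.
    import streamlit as st

    st.title("FarmTalk")
    uploaded = st.file_uploader("Upload a leaf image", type=["jpg", "jpeg", "png"])
    snapshot = st.camera_input("Or take a photo")
    question = st.text_input("Ask about the detected disease")

    # Prefer the uploaded file; fall back to the camera snapshot.
    image = uploaded if uploaded is not None else snapshot
    if image is not None:
        st.image(image, caption="Input image")
        # Hypothetical hook: run the detector on `image` and show its label here.
    if question:
        # Hypothetical hook: send `question` to the RAG pipeline and show the answer.
        st.write(f"Question received: {question}")
```

To try it, save the function in a file such as `app.py` (a hypothetical name), add a call to `main()` at the end of the file, and start it with `streamlit run app.py`.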

Results

YOLO Evaluation

After 10 epochs, the model reached a precision of 0.993, a recall of 0.994, and an mAP50 of 0.994, demonstrating its ability to detect plant diseases accurately. The following graphs and confusion matrix illustrate the model's performance.

YOLO Graphs

Normalized Confusion Matrix

The output after the model was trained on the dataset is shown below:

YOLO Inference

RAG Response

The system excelled in relevance and fluency, showcasing its ability to generate contextually appropriate and grammatically correct responses. It also maintained high factual accuracy, critical for practical applications in agriculture.

Query: "What are the benefits of crop rotation for reducing early blight in potatoes?"
Model Response: "Crop rotation helps reduce early blight by disrupting pathogen cycles in the soil. Rotating potatoes with non-host crops decreases Alternaria spp., the causative agent."

Evaluation

We utilized a diverse set of queries relevant to plant diseases, along with irrelevant or out-of-context queries, to test the robustness of the system. A total of 50 queries spanning crops like tomato, corn, apple, grape, and potato were evaluated. Responses were rated on a scale from 1 (poor) to 5 (excellent) for each of the metrics. When faced with irrelevant queries, such as "What is the weather in Minneapolis?" or "Who is the president of India?", the model appropriately responded with "I don’t know," highlighting its ability to avoid hallucination.

The following metrics were used for the detailed human evaluation:

Relevance: Measures how well the model’s responses align with the query’s intent.
Fluency: Evaluates the grammatical correctness and readability of the responses.
Factuality: Assesses the accuracy of the information provided in the response.
Satisfaction: Reflects overall user satisfaction with the response’s quality.

The average scores achieved during the evaluation were:

Metric	Score
Relevance	4.50
Fluency	4.52
Factuality	4.44
Satisfaction	4.52

These results indicate the model’s strong performance in delivering precise, coherent, and factually accurate responses.
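
For clarity, per-metric averages of this kind can be computed as in the sketch below. The two ratings shown are toy values for illustration only, not the project's evaluation data, and the function and field names are assumptions.

```python
from statistics import mean

METRICS = ("relevance", "fluency", "factuality", "satisfaction")


def average_scores(ratings):
    """Average 1-5 ratings per metric across all rated responses."""
    for rating in ratings:
        for metric in METRICS:
            if not 1 <= rating[metric] <= 5:
                raise ValueError(f"{metric} rating out of range: {rating[metric]}")
    return {m: round(mean(r[m] for r in ratings), 2) for m in METRICS}


# Toy ratings for two responses (hypothetical values).
toy_ratings = [
    {"relevance": 5, "fluency": 5, "factuality": 4, "satisfaction": 5},
    {"relevance": 4, "fluency": 5, "factuality": 4, "satisfaction": 4},
]
averages = average_scores(toy_ratings)
```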

Limitations

YOLO Limitations

Poor Lighting Conditions: YOLOv8 struggles to detect diseases in images with inadequate lighting or high contrast. For instance, plant leaf discoloration under low light can be mistaken for disease symptoms, leading to false positives or missed detections.
Potential Solution: Employ image preprocessing techniques such as histogram equalization to normalize lighting conditions or augment the dataset with diverse lighting scenarios.

Overlapping Leaves in Dense Foliage: YOLOv8 can misclassify or fail to detect diseases when multiple leaves overlap, obscuring visible disease markers.
Potential Solution: Enhance training data with more examples of overlapping foliage and incorporate higher-resolution images to improve object separation.
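
The histogram equalization suggested above for poor lighting can be sketched in plain Python for a grayscale image. This is a minimal illustration that assumes the image is a list of rows of integer intensities; the function name is hypothetical. In practice a library routine such as OpenCV's `cv2.equalizeHist` would normally be used instead.

```python
def equalize_histogram(gray, levels=256):
    """Histogram-equalize a grayscale image given as rows of ints in [0, levels)."""
    # Count how many pixels take each intensity value.
    hist = [0] * levels
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    if total == 0:
        return [row[:] for row in gray]  # empty image: nothing to do

    # Cumulative distribution of pixel intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in gray]  # constant image: leave unchanged

    # Map each intensity so the output spans the full range [0, levels - 1].
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[value] for value in row] for row in gray]


# A dark, low-contrast patch is stretched across the full intensity range.
dark_patch = [[50, 51], [52, 53]]
equalized = equalize_histogram(dark_patch)
```

Applying the same preprocessing at both training and inference time keeps the model's inputs consistent.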

RAG Limitations

Unsatisfactory Responses to Ambiguous Queries: While the framework performs well for direct questions, ambiguous or poorly phrased queries can lead to unsatisfactory responses. For example:
Query: "Are there any resistant potato varieties?"
Model Response: "I don’t know."
This response reflects the model’s inability to retrieve relevant information, even when it exists in the knowledge base.
Potential Solution: Implement query rephrasing mechanisms or use chain-of-thought prompting to guide the model in clarifying ambiguous inputs.

Limitations in Knowledge Base Grounding: The RAG framework sometimes retrieves information that partially answers a query but lacks specificity or depth. This is particularly evident in queries requiring precise scientific details or local context.
Potential Solution: Expand and regularly update the knowledge base with verified, domain-specific literature. Additionally, enhance the embedding model’s semantic understanding with domain-tuned embeddings.

Conclusion

FarmTalk successfully demonstrates the potential of integrating vision models and LLMs for precision agriculture. YOLOv8 provided reliable plant disease detection, while the RAG framework enhanced context-aware remediation. The platform’s user-friendly interface bridges the gap between AI capabilities and real-world agricultural needs, empowering stakeholders to make informed decisions. This project serves as a proof-of-concept for the broader application of AI in agriculture.

Future Work

Expand the dataset to include more plant species and disease categories.
Integrate multi-modal data (e.g., hyperspectral and thermal imaging) for robust detection.
Transition the system to edge devices for scalability in remote agricultural areas.

Timeline

The project timeline is as follows:
Week 1: Data Collection & Setup. Gather datasets, particularly crop disease PDFs. Preprocess data and set up the environment with YOLOv8, GPT-3.5, and RAG. Clean and organize data for model compatibility.
Week 2: Initial Model Integration. Train YOLOv8 on crop disease data for detection accuracy. Set up RAG framework with GPT-3.5 for remediation responses. Test basic integration between YOLOv8 and RAG, ensuring accurate label-response flow.
Week 3: Full System Debug & Real-Time Demo Integration. Integrate camera for real-time image input. Run tests and troubleshoot for smooth, real-time operation.
Week 4: Finalize System & Prepare for Presentation

FarmTalk: A Real Time Platform for Plant Disease Identification and Management

Fall 2024 CSCI 5541 NLP: Class Project - University of Minnesota

NLP Nexus