Hybrid YOLOv9-DETR model for strawberry disease detection: A non-end-to-end object detection approach
Authors
Ghorbani, Amirmasoud
Degree
Master of Applied Technologies (Computing)
Grantor
Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology
Date
2024
Supervisors
Ardekani, Iman
Bell, Jamie
Type
Masters Thesis
Keyword
strawberry plants
plant disease
image processing
modelling
neural networks
pattern recognition systems in agriculture
Citation
Ghorbani, A. (2024). Hybrid YOLOv9-DETR model for strawberry disease detection: A non-end-to-end object detection approach (Unpublished document submitted in partial fulfilment of the requirements for the degree of Master of Applied Technologies (Computing)). Unitec, Te Pūkenga - New Zealand Institute of Skills and Technology
https://hdl.handle.net/10652/6743
Abstract
Detection of plant diseases is critical for both agricultural productivity and food security. In this thesis, we propose a hybrid object detection system that combines YOLOv9 and DETR for the first time to identify disease symptoms in strawberry plants. YOLOv9, a state-of-the-art detection model known for its speed and efficient feature extraction, was fine-tuned on a custom dataset for disease symptom detection. DETR, a transformer-based architecture, was trained separately to refine predictions using attention mechanisms and global context understanding.
This study explored four sequential experiments to determine the most effective way to integrate YOLOv9 and DETR. Initially, the goal was to develop a fully end-to-end model, where both architectures would be optimized jointly. To achieve this, a feature bridging mechanism was introduced to align YOLOv9’s outputs with DETR’s input format. However, architectural incompatibilities and computational constraints revealed that end-to-end training was impractical, as YOLOv9’s convolutional feature maps did not naturally align with DETR’s transformer-based processing, leading to unstable training dynamics.
Recognizing these challenges, the research pivoted to a non-end-to-end hybrid inference approach in which the two models were trained separately and their outputs were merged at inference time. In place of the feature bridging module, a novel bounding-box selection strategy was implemented to unify the results of both models. IoU values were calculated between potentially overlapping YOLOv9 and DETR detections; where a pair of boxes exceeded a 0.5 IoU threshold, only the more confident prediction was retained. Additionally, when one model missed an object entirely, the bounding box was taken from the other model to ensure more comprehensive coverage. This method effectively reduced redundancy and leveraged the complementary strengths of YOLOv9’s fast detection and DETR’s refined bounding box alignment.
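The selection strategy described above can be illustrated with a minimal sketch. It is not the thesis implementation: the `Detection` type, the `merge_detections` function, the greedy one-to-one matching, the class-agnostic comparison, and the example class names are all assumptions made for illustration. Only the 0.5 IoU threshold, the "keep the more confident box" rule, and the fallback to the other model's box come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one predicted box."""
    box: Box
    score: float   # model confidence in [0, 1]
    label: str     # class name (illustrative values only)
    source: str    # "yolo" or "detr"


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(
    yolo: List[Detection],
    detr: List[Detection],
    iou_threshold: float = 0.5,
) -> List[Detection]:
    """Merge two detection lists at inference time.

    Assumption: each YOLO box is greedily paired with the unused DETR
    box of highest IoU, and labels are not compared.
    """
    merged: List[Detection] = []
    used = set()  # indices of DETR boxes already paired
    for y in yolo:
        best_j, best_iou = None, iou_threshold
        for j, d in enumerate(detr):
            if j in used:
                continue
            overlap = iou(y.box, d.box)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j is None:
            # No DETR box overlaps enough: keep the YOLO box as is.
            merged.append(y)
        else:
            # Overlapping pair: keep only the more confident prediction.
            used.add(best_j)
            d = detr[best_j]
            merged.append(y if y.score >= d.score else d)
    # DETR boxes that YOLO missed are carried over unchanged.
    merged.extend(d for j, d in enumerate(detr) if j not in used)
    return merged


if __name__ == "__main__":
    yolo_out = [
        Detection((0, 0, 10, 10), 0.90, "leaf_spot", "yolo"),
        Detection((50, 50, 60, 60), 0.80, "powdery_mildew", "yolo"),
    ]
    detr_out = [
        Detection((1, 1, 10, 10), 0.95, "leaf_spot", "detr"),
        Detection((100, 100, 120, 120), 0.70, "leaf_spot", "detr"),
    ]
    for det in merge_detections(yolo_out, detr_out):
        print(det.source, det.box, det.score)
```

In the usage example, the first YOLOv9 and DETR boxes overlap with an IoU of 0.81, so only the higher-scoring DETR box is kept, while the two non-overlapping boxes are each carried over from the model that found them. Whether the original work also required matching class labels before treating two boxes as duplicates is not stated in the abstract.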
The final hybrid inference strategy achieved a mAP@0.5 of 0.96, demonstrating high detection sensitivity. However, dataset imbalance and computational resource limitations impacted overall generalization and accuracy. Despite these constraints, this research lays a foundation for future hybrid architectures, emphasizing the importance of dataset balancing, feature alignment strategies, and robust model integration.
This study suggests several potential enhancements, including extended training cycles, dataset expansion, and real-time deployment for agricultural applications. The proposed YOLOv9-DETR hybrid system marks a key milestone toward automated plant disease monitoring, contributing to sustainable and intelligent agricultural practices.
Copyright holder
Author
Copyright notice
All rights reserved
