Hybrid YOLOv9-DETR model for strawberry disease detection: A non-end-to-end object detection approach

Authors

Ghorbani, Amirmasoud

Degree

Master of Applied Technologies (Computing)

Grantor

Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology

Date

2024

Supervisors

Ardekani, Iman
Bell, Jamie

Type

Masters Thesis

Keyword

strawberry plants
plant disease
image processing
modelling
neural networks
pattern recognition systems in agriculture

Citation

Ghorbani, A. (2024). Hybrid YOLOv9-DETR model for strawberry disease detection: A non-end-to-end object detection approach (Unpublished document submitted in partial fulfilment of the requirements for the degree of Master of Applied Technologies (Computing)). Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology. https://hdl.handle.net/10652/6743

Abstract

Detection of plant diseases is critical for both agricultural productivity and food security. In this thesis, we propose a hybrid object detection system that combines YOLOv9 and DETR for the first time to identify disease symptoms in strawberry plants. YOLOv9, a state-of-the-art detection model known for its speed and efficient feature extraction, was fine-tuned on a custom dataset for disease symptom detection. DETR, a transformer-based architecture, was trained separately to refine predictions using attention mechanisms and global context understanding.
This study explored four sequential experiments to determine the most effective way to integrate YOLOv9 and DETR. Initially, the goal was to develop a fully end-to-end model in which both architectures would be optimized jointly. To achieve this, a feature bridging mechanism was introduced to align YOLOv9’s outputs with DETR’s input format. However, end-to-end training proved impractical because of architectural incompatibilities and computational constraints: YOLOv9’s convolutional feature maps did not naturally align with DETR’s transformer-based processing, which led to unstable training dynamics.
Recognizing these challenges, the research pivoted toward a non-end-to-end hybrid inference approach in which both models were trained separately and their outputs were merged at inference time. Instead of a feature bridging module, a novel bounding-box selection strategy was implemented to unify the results of both models. IoU values were calculated between potentially overlapping YOLOv9 and DETR detections, and where the IoU exceeded a threshold of 0.5, only the most confident prediction was retained. Additionally, when one model missed an object entirely, the bounding box from the other model was kept to ensure more comprehensive coverage. This method effectively reduced redundancy and leveraged the complementary strengths of YOLOv9’s fast detection and DETR’s refined bounding box alignment.
The final hybrid inference strategy achieved a mAP@0.5 of 0.96, demonstrating high detection sensitivity. However, dataset imbalance and computational resource limitations impacted overall generalization and accuracy. Despite these constraints, this research lays a foundation for future hybrid architectures, emphasizing the importance of dataset balancing, feature alignment strategies, and robust model integration. This study suggests several potential enhancements, including extended training cycles, dataset expansion, and real-time deployment for agricultural applications. The proposed YOLOv9-DETR hybrid system marks a key milestone toward automated plant disease monitoring, contributing to sustainable and intelligent agricultural practices.
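The bounding-box selection strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Detection` type, the `merge_detections` function, the greedy one-to-one matching, and the choice to match on IoU alone without comparing class labels are all assumptions made for illustration. Only the 0.5 IoU threshold, the rule of keeping the more confident of two overlapping boxes, and the rule of keeping boxes found by only one model come from the abstract.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical detection record: corner coordinates, confidence, class label."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    label: str


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(
    yolo: List[Detection],
    detr: List[Detection],
    iou_threshold: float = 0.5,
) -> List[Detection]:
    """Merge two detection lists at inference time.

    Assumption: each YOLOv9 box is greedily paired with the remaining
    DETR box that overlaps it most. Class labels are not compared.
    """
    merged: List[Detection] = []
    unmatched_detr = list(detr)
    for y in yolo:
        # Find the remaining DETR box with the highest overlap, if any.
        best = max(unmatched_detr, key=lambda d: iou(y, d), default=None)
        if best is not None and iou(y, best) > iou_threshold:
            # Overlap exceeds the threshold: keep only the more confident box.
            unmatched_detr.remove(best)
            merged.append(y if y.score >= best.score else best)
        else:
            # No DETR counterpart: keep the YOLOv9 box as is.
            merged.append(y)
    # DETR boxes that YOLOv9 missed are kept for coverage.
    merged.extend(unmatched_detr)
    return merged
```

Calling `merge_detections(yolo_boxes, detr_boxes)` on the per-image outputs of the two separately trained models returns a single list. A production version would probably also need a rule for overlapping boxes with different class labels, which the abstract does not specify.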

Copyright holder

Author

Copyright notice

All rights reserved