YOLOv9, CNN, and LSTM-based solution for automated guinea pig behavior recognition


Supplementary material

Authors

Cui, Xiupeng

Degree

Master of Applied Technologies (Computing)

Grantor

Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology

Date

2025

Supervisors

Song, Lei
Ardekani, Iman

Type

Master's Thesis

Keyword

guinea pigs (Cavia porcellus)
animal behaviour and welfare
deep learning
computer vision
real-time data processing
machine learning

Citation

Cui, X. (2025). YOLOv9, CNN, and LSTM-based solution for automated guinea pig behavior recognition (Unpublished document submitted in partial fulfilment of the requirements for the degree of Master of Applied Technologies (Computing)). Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology. https://hdl.handle.net/10652/6953

Abstract

RESEARCH QUESTIONS

• How effective and reliable is the object detection component of the proposed behavior recognition pipeline in detecting guinea pigs across various video scenarios?
• How well does the behavior classification model perform in identifying key guinea pig behaviors based on temporal frame sequences?
• How accurately and efficiently does the proposed pipeline perform end-to-end guinea pig behavior recognition from raw video input?

ABSTRACT

Understanding and monitoring guinea pig (Cavia porcellus) behavior is essential for research and animal welfare assessment. However, current studies on guinea pig behavior recognition rely heavily on manual observation, which is time-consuming and does not scale. To address this limitation, this study proposes and implements an automated end-to-end guinea pig behavior recognition system combining YOLOv9 for object detection, BoT-SORT for identity tracking, and a CNN+LSTM architecture for behavior classification. The system first applies YOLOv9 and BoT-SORT to detect and track individual guinea pigs in video, then constructs fixed-length frame sequences for each guinea pig based on its assigned tracking ID. A CNN+LSTM model with a ResNet50 backbone classifies behaviors by extracting spatial and temporal features from these sequences. A custom behavior dataset of 472 video clips covering 7 behavior categories was developed for training, and a test set of 12 fully annotated original videos containing 1,130 behavior instances was created for evaluation. The proposed system achieves an overall accuracy of 93% on the test videos, demonstrating the feasibility and effectiveness of deep learning methods for automated guinea pig behavior recognition. This system provides a valuable reference for future research in animal behavior recognition, welfare assessment, and related fields.
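The sequence-construction step described in the abstract — grouping per-frame detections by tracker ID into fixed-length windows for the CNN+LSTM classifier — can be sketched in plain Python. This is an illustrative sketch only: the function and variable names below are assumptions, not the thesis's actual implementation, and the tracker output format is simplified to (frame index, track ID, crop) tuples.

```python
from collections import defaultdict

def build_sequences(tracked_frames, seq_len=16):
    """Group per-frame crops by tracker ID into fixed-length sequences.

    tracked_frames: iterable of (frame_index, track_id, crop) tuples, as a
    detection+tracking stage (e.g. YOLOv9 + BoT-SORT) might produce.
    Returns a list of (track_id, [crop, ...]) with exactly seq_len crops each;
    any trailing partial window per ID is discarded.
    """
    buffers = defaultdict(list)   # per-ID rolling buffer of crops
    sequences = []
    for frame_idx, track_id, crop in tracked_frames:
        buffers[track_id].append(crop)
        if len(buffers[track_id]) == seq_len:
            # Window complete: emit it and start a fresh one for this ID.
            sequences.append((track_id, buffers[track_id]))
            buffers[track_id] = []
    return sequences

# Usage: two tracked guinea pigs (IDs 1 and 2) with seq_len=2.
detections = [(0, 1, "a0"), (0, 2, "b0"),
              (1, 1, "a1"), (1, 2, "b1"),
              (2, 1, "a2"), (3, 1, "a3")]
seqs = build_sequences(detections, seq_len=2)
# → [(1, ["a0", "a1"]), (2, ["b0", "b1"]), (1, ["a2", "a3"])]
```

Each emitted window would then be passed through the CNN backbone frame by frame and the resulting feature vectors fed to the LSTM for behavior classification; dropping partial trailing windows is one simple policy, padding is another.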

Copyright holder

Author

Copyright notice

All rights reserved
