A generative AI approach for blind spot awareness: Real-Time image compression and reconstruction using DDPM model
Authors
Peiris, Thanthrihewage Praveen Maleesha
Degree
Master of Applied Technologies (Computing)
Grantor
Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology
Date
2025
Supervisors
Liu, William
Song, Lei
Type
Masters Thesis
Keyword
traffic safety
traffic accidents
real-time data processing
computer vision
neural networks
Citation
Peiris, T.P.M. (2025). A generative AI approach for blind spot awareness: Real-Time image compression and reconstruction using DDPM model (Unpublished document submitted in partial fulfilment of the requirements for the degree of Master of Applied Technologies (Computing)). Unitec, Te Pūkenga - New Zealand Institute of Skills and Technology
https://hdl.handle.net/10652/6952
Abstract
RESEARCH QUESTIONS
• How can a generative AI model be used to compress visual data from CCTV footage to reduce its size while maintaining crucial image quality for blind spot detection?
• How does the use of AI-based image compression affect network latency when transmitting CCTV images for real-time monitoring?
• In what ways does real-time, low-latency transmission of compressed CCTV images improve the ability of drivers to detect potential hazards in blind spots?
• What privacy and security considerations must be addressed when transmitting compressed visual data from CCTV systems for blind spot detection?
ABSTRACT
With the growing number of vehicles and pedestrians on the road, blind spots have become a significant factor in road traffic accidents, because they reduce visibility and so increase the chance of collision. This thesis explores the application of generative AI to detect and mitigate blind spots, enhancing road safety for drivers and pedestrians. The primary objective is to develop a generative AI-based model that compresses images, such as individual frames extracted from CCTV video streams, so that they can be sent over networks with lower latency, and that reconstructs these images at the driver's end in real time. The aim is a model capable of identifying hidden hazards and predicting potential risks in real time.
Using advanced deep learning techniques such as Denoising Diffusion Probabilistic Models (DDPM) with a U-Net architecture, the system is trained on diverse datasets, namely CIFAR-10 and an Unmanned Aerial Vehicle (UAV) dataset, encompassing various vehicle types, environments, and traffic conditions.
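As a rough illustration of the diffusion mechanics that a DDPM relies on, the sketch below adds Gaussian noise to a flattened 32x32 RGB image in closed form and then inverts that step given a noise estimate. It is a minimal sketch and not the thesis implementation: the linear noise schedule, its values, the number of steps, and all function names are assumptions, and the true noise is passed in where a trained U-Net would supply its prediction. Plain Python lists are used only to keep the example self-contained.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The schedule shape and values are assumptions, not taken from the thesis.
    """
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * noise."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * p + math.sqrt(1.0 - a) * e for p, e in zip(x0, noise)]
    return xt, noise


def estimate_x0(xt, t, predicted_noise, alpha_bars):
    """Invert the forward step given a noise estimate (from a U-Net in practice)."""
    a = alpha_bars[t]
    return [
        (v - math.sqrt(1.0 - a) * e) / math.sqrt(a)
        for v, e in zip(xt, predicted_noise)
    ]


rng = random.Random(0)
alpha_bars = make_alpha_bars()
# Hypothetical 32x32 RGB image, flattened, with pixel values scaled to [-1, 1].
x0 = [rng.uniform(-1.0, 1.0) for _ in range(32 * 32 * 3)]
xt, noise = forward_diffuse(x0, 500, alpha_bars, rng)
# The true noise stands in for the network's prediction in this sketch.
x0_hat = estimate_x0(xt, 500, noise, alpha_bars)
```

With the exact noise, `x0_hat` matches `x0` up to floating-point error. In a trained system the reconstruction quality depends on how well the network predicts the noise, and sampling normally proceeds over many reverse steps rather than one.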
The proposed model was initially tested on the CIFAR-10 dataset, where it achieved the desired results, including accurate image reconstruction through reverse diffusion. Subsequently, the model was tested on the UAV dataset to evaluate its performance in scenarios closer to real-world applications, such as the higher-resolution data typically expected from CCTV footage. While the model demonstrated potential, the UAV dataset's higher pixel resolution posed challenges: images had to be downscaled to 32x32 pixels, which limited performance. This work underscores the need for further optimisation to handle high-resolution data effectively, paving the way for applications in real-world systems.
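To make the resolution limitation concrete, the sketch below downscales a grayscale image to 32x32 by averaging non-overlapping pixel blocks. The abstract does not say which resampling method the thesis used, so block averaging is an assumption chosen for simplicity, and the function name and the synthetic 64x64 input are hypothetical.

```python
def downscale_area(image, out_h=32, out_w=32):
    """Downscale a 2-D grayscale image (list of rows) by block averaging.

    Assumes the input height and width are integer multiples of the target size.
    """
    in_h, in_w = len(image), len(image[0])
    if in_h % out_h or in_w % out_w:
        raise ValueError("input size must be a multiple of the output size")
    bh, bw = in_h // out_h, in_w // out_w  # block height and width
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for dr in range(bh):
                for dc in range(bw):
                    total += image[r * bh + dr][c * bw + dc]
            row.append(total / (bh * bw))
        out.append(row)
    return out


# Hypothetical 64x64 input whose pixel value equals its row index.
source = [[float(r)] * 64 for r in range(64)]
small = downscale_area(source)
pixels_per_output = (64 * 64) // (32 * 32)
```

Here every output pixel summarises `pixels_per_output` input pixels, so detail finer than one block is lost. For a frame much larger than 64x64 each block covers far more pixels, which is consistent with the reduced performance reported on the UAV data.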
Copyright holder
Author
Copyright notice
All rights reserved
