A generative AI approach for blind spot awareness: Real-Time image compression and reconstruction using DDPM model

Authors

Peiris, Thanthrihewage Praveen Maleesha

Degree

Master of Applied Technologies (Computing)

Grantor

Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology

Date

2025

Supervisors

Liu, William
Song, Lei

Type

Masters Thesis

Keyword

traffic safety
traffic accidents
real-time data processing
computer vision
neural networks

Citation

Peiris, T.P.M. (2025). A generative AI approach for blind spot awareness: Real-Time image compression and reconstruction using DDPM model (Unpublished document submitted in partial fulfilment of the requirements for the degree of Master of Applied Technologies (Computing)). Unitec, Te Pūkenga - New Zealand Institute of Skills and Technology. https://hdl.handle.net/10652/6952

Abstract

RESEARCH QUESTIONS
• How can a generative AI model be used to compress visual data from CCTV footage to reduce its size while maintaining crucial image quality for blind spot detection?
• How does the use of AI-based image compression affect network latency when transmitting CCTV images for real-time monitoring?
• In what ways does real-time, low-latency transmission of compressed CCTV images improve the ability of drivers to detect potential hazards in blind spots?
• What privacy and security considerations must be addressed when transmitting compressed visual data from CCTV systems for blind spot detection?

ABSTRACT
With the increased number of vehicles and pedestrians on the road, blind spots have become a significant factor in road traffic accidents: they reduce visibility and so increase the chance of collision. This thesis explores the application of generative AI to detect and mitigate blind spots, enhancing road safety for drivers and pedestrians. The primary objective is to develop a generative AI-based model that compresses images, such as individual frames extracted from CCTV video streams, so that they can be sent over networks with lower latency and reconstructed at the driver's end in real time, supporting the identification of hidden hazards and the prediction of potential risks. Using deep learning techniques such as Denoising Diffusion Probabilistic Models (DDPM) with a U-Net architecture, the system is trained on datasets such as CIFAR-10 and an Unmanned Aerial Vehicle (UAV) dataset, encompassing various vehicle types, environments, and traffic conditions. The proposed model was initially tested on the CIFAR-10 dataset, achieving the desired results, including accurate image reconstruction through reverse diffusion.

Subsequently, the model was tested on the UAV dataset to evaluate its performance in scenarios closer to real-world applications, such as the higher-resolution data typically expected from CCTV footage. While the model demonstrated potential, the UAV dataset's higher pixel resolution posed challenges: images had to be downscaled to 32x32 pixels, which limited performance. This work underscores the need for further optimisation to handle high-resolution data effectively, paving the way for applications in real-world systems.
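
As an illustration of the diffusion mechanics the abstract refers to, the following is a minimal sketch of the standard DDPM forward (noising) step and its algebraic inversion. It is not the thesis's implementation. The linear noise schedule, the 1000-step horizon, the random stand-in for a 32x32 image, and all function names are assumptions made for this example. The true noise is reused where a trained U-Net would supply a noise prediction, so the sketch shows only the closed-form identity, not the iterative reverse sampler.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form; returns (x_t, noise)."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * p + math.sqrt(1.0 - a) * e for p, e in zip(x0, noise)]
    return xt, noise


def estimate_x0(xt, predicted_noise, t, alpha_bars):
    """Invert the closed form given a noise estimate.

    In a trained DDPM the estimate would come from the U-Net; here the
    caller passes it in explicitly.
    """
    a = alpha_bars[t]
    return [
        (v - math.sqrt(1.0 - a) * e) / math.sqrt(a)
        for v, e in zip(xt, predicted_noise)
    ]


rng = random.Random(0)
betas = linear_beta_schedule(1000)          # assumed T = 1000
alpha_bars = cumulative_alphas(betas)

# Hypothetical stand-in for a flattened 32x32 greyscale image scaled to [-1, 1].
x0 = [rng.uniform(-1.0, 1.0) for _ in range(32 * 32)]

xt, noise = forward_diffuse(x0, 500, alpha_bars, rng)
x0_hat = estimate_x0(xt, noise, 500, alpha_bars)  # true noise stands in for the U-Net
max_error = max(abs(a - b) for a, b in zip(x0, x0_hat))
```

With a perfect noise estimate the original pixels are recovered up to floating-point error, so in practice reconstruction quality depends on how well the trained network predicts the noise.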

Copyright holder

Author

Copyright notice

All rights reserved