Humanizing AI chatbots: The role of speech emotion recognition with deep learning
Authors
Barmada, Bashar
Kannangara, M.
Ramirez-Prado, Guillermo
Varastehpour, Soheil
Shakiba, Masoud
Date
2025
Type
Conference Contribution - Paper in Published Proceedings
Keyword
chatbots
human-computer interaction
emotion recognition
artificial emotional intelligence
speech processing systems
artificial intelligence
natural language processing (computer science)
Citation
Barmada, B., Kannangara, M., Ramirez-Prado, G., Pour, S., & Shakiba, M. (2025). Humanizing AI chatbots: The role of speech emotion recognition with deep learning. In Khalid S. Soliman (Ed.), 45th IBIMA Computer Science Conference (pp. 1-9).
https://hdl.handle.net/10652/7105
Abstract
This research integrates Speech Emotion Recognition (SER) with AI chatbots to create a system that is more emotionally intelligent and responsive. Using deep learning techniques such as Convolutional Neural Networks (CNNs), the study improves the accuracy and robustness of SER models in detecting emotions from speech. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) serves as the primary dataset, with data augmentation techniques (noise injection, speed variation, and pitch shifting) applied to improve model performance. Three features are extracted for analysis: Mel-Frequency Cepstral Coefficients (MFCC), Mel Spectrogram, and Zero Crossing Rate (ZCR). Regularization techniques, including Batch Normalization and L2 Regularization, are used to prevent overfitting. Eight emotion classes are evaluated: neutral, calm, happy, sad, angry, fear, disgust, and surprise. Experimental results show significant improvements, with the best test accuracy reaching 87.5%, outperforming previous studies. Visualized training history demonstrates the model’s learning behavior and generalization capabilities. The findings highlight the potential of SER-enhanced chatbots in applications such as customer service and mental health support, where they can enable empathetic interactions.
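Two of the augmentation steps and one of the features named in the abstract can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the default `noise_level` and `rate` values, and the naive linear-interpolation resampling are all assumptions made for illustration. Pitch shifting, MFCC, and Mel Spectrogram extraction are omitted because they need an FFT-based pipeline that does not fit in a short example.

```python
import math
import random


def add_noise(signal, noise_level=0.005, seed=None):
    """Noise injection: add zero-mean Gaussian noise scaled to the peak amplitude.

    noise_level is a hypothetical default, not a value taken from the paper.
    """
    rng = random.Random(seed)
    peak = max((abs(s) for s in signal), default=0.0)
    return [s + rng.gauss(0.0, noise_level * peak) for s in signal]


def change_speed(signal, rate=1.1):
    """Speed variation by linear-interpolation resampling.

    rate > 1 shortens the clip, rate < 1 lengthens it. This naive approach
    also changes the pitch; it is only a stand-in for a proper time-stretch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if len(signal) < 2:
        return list(signal)
    last = len(signal) - 1
    new_len = max(1, int(len(signal) / rate))
    out = []
    for i in range(new_len):
        pos = i * rate
        lo = min(int(pos), last)   # sample to the left of pos
        hi = min(lo + 1, last)     # sample to the right, clamped at the end
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


def zero_crossing_rate(frame):
    """ZCR: fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


if __name__ == "__main__":
    sample_rate = 16000  # assumed sampling rate for the synthetic tone
    tone = [math.sin(2 * math.pi * 440 * n / sample_rate)
            for n in range(sample_rate)]
    noisy = add_noise(tone, seed=0)
    faster = change_speed(tone, rate=2.0)
    print(len(noisy), len(faster), zero_crossing_rate(tone))
```

A 440 Hz tone crosses zero about 880 times per second, so at the assumed 16 kHz sampling rate its ZCR comes out near 0.055. In a pipeline like the one the abstract describes, such augmented copies and per-frame features would feed the CNN alongside the spectral features.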
Publisher
IBIMA Publishing
Copyright holder
Authors
Copyright notice
All rights reserved
