Humanizing AI chatbots: The role of speech emotion recognition with deep learning

Authors

Barmada, Bashar
Kannangara, M.
Ramirez-Prado, Guillermo
Varastehpour, Soheil
Shakiba, Masoud

Date

2025

Type

Conference Contribution - Paper in Published Proceedings

Keyword

chatbots
human-computer interaction
emotion recognition
artificial emotional intelligence
speech processing systems
artificial intelligence
natural language processing (computer science)

Citation

Barmada, B., Kannangara, M., Ramirez-Prado, G., Pour, S., & Shakiba, M. (2025). Humanizing AI chatbots: The role of speech emotion recognition with deep learning. In Khalid S. Soliman (Ed.), 45th IBIMA Computer Science Conference (pp. 1-9). https://hdl.handle.net/10652/7105

Abstract

This research focuses on integrating Speech Emotion Recognition (SER) with AI chatbots to create a system that is more emotionally intelligent and responsive. Using advanced deep learning techniques such as Convolutional Neural Networks (CNNs), the study enhances the accuracy and robustness of SER models in detecting emotions from speech. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) serves as the primary dataset, with data augmentation techniques such as noise injection, speed variation, and pitch shifting applied to improve model performance. Key features such as Mel-Frequency Cepstral Coefficients (MFCC), Mel Spectrogram, and Zero Crossing Rate (ZCR) are extracted to improve the analysis. The study uses regularization techniques, including Batch Normalization and L2 Regularization, to prevent overfitting. Eight emotion classes are evaluated: neutral, calm, happy, sad, angry, fear, disgust, and surprise. Experimental results show significant improvements, with the best test accuracy reaching 87.5%, outperforming previous studies. The visualized training history demonstrates the model’s learning behavior and generalization capabilities. The findings highlight the potential of SER-enhanced chatbots in applications such as customer service and mental health support by enabling empathetic interactions.
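
As a rough illustration of two of the augmentation steps and one of the features named in the abstract, the following is a minimal pure-Python sketch. It is not the authors' implementation: the function names, the default parameter values, and the synthetic sine-wave input are all assumptions made for the example. Pitch shifting and MFCC or Mel spectrogram extraction are left out because they would normally rely on an FFT-based audio library.

```python
import math
import random


def add_noise(signal, noise_level=0.005, seed=0):
    """Noise injection: add zero-mean Gaussian noise scaled to the peak amplitude.

    noise_level and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    peak = max(abs(s) for s in signal) or 1.0
    return [s + rng.gauss(0.0, noise_level * peak) for s in signal]


def change_speed(signal, rate=1.1):
    """Speed variation by linear-interpolation resampling.

    rate > 1 shortens the clip, rate < 1 lengthens it. This naive approach
    also shifts the pitch; it is a simplification chosen for brevity.
    """
    n_out = max(1, int(len(signal) / rate))
    last = len(signal) - 1
    out = []
    for i in range(n_out):
        pos = i * rate
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


def zero_crossing_rate(frame):
    """ZCR: fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
SAMPLE_RATE = 16000
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

noisy = add_noise(tone)
faster = change_speed(tone, rate=2.0)
zcr = zero_crossing_rate(tone)
```

For a pure tone, the ZCR is close to twice the frequency divided by the sample rate, which gives a quick sanity check on the feature before applying it to real speech.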

Publisher

IBIMA Publishing

Copyright holder

Authors

Copyright notice

All rights reserved
