dc.contributor.author | Sharifzadeh, Hamid | |
dc.contributor.author | HajiRassouliha, Amir | |
dc.contributor.author | McLoughlin, I.V. | |
dc.contributor.author | Ardekani, Iman | |
dc.contributor.author | Allen, Jacqueline E. | |
dc.date.accessioned | 2016-04-29T22:33:29Z | |
dc.date.available | 2016-04-29T22:33:29Z | |
dc.date.issued | 2015-12 | |
dc.identifier.uri | https://hdl.handle.net/10652/3352 | |
dc.description.abstract | Computational speech reconstruction algorithms have the ultimate aim of returning natural-sounding speech to aphonic and dysphonic individuals. These algorithms can also be used by unimpaired speakers for communicating sensitive or private information. When the glottis loses function due to disease or surgery, aphonic and dysphonic patients retain the power of vocal tract modulation to some degree, but they are unable to speak anything more than hoarse whispers without prosthetic aid. While whispering can be seen as a natural and secondary aspect of speech communication for most people, it becomes the primary mechanism of communication for those who have impaired voice production mechanisms, such as laryngectomees.
In this paper, by considering the current limitations of speech reconstruction methods, a novel algorithm for converting whispers to normal speech is proposed and the efficiency of the algorithm is discussed. The proposed algorithm relies upon twin mapping models and makes use of artificially generated whispers (called whisperised speech) to regenerate natural phonated speech from whispers. Through a training-based approach, the mapping models exploit whisperised speech to overcome the frame-to-frame time alignment problem in the speech reconstruction process. | en_NZ |
dc.language.iso | en | en_NZ |
dc.publisher | IEEE Communications Society | en_NZ |
dc.relation.uri | http://ece.adu.ac.ae/ISSPIT2015/index.html | en_NZ |
dc.rights | All rights reserved | en_NZ |
dc.subject | speech reconstruction | en_NZ |
dc.subject | impaired speech | en_NZ |
dc.subject | aphonic patients | en_NZ |
dc.subject | dysphonic patients | en_NZ |
dc.subject | voice production | en_NZ |
dc.subject | computational speech reconstruction algorithms | en_NZ |
dc.title | Phonated speech reconstruction using twin mapping models | en_NZ |
dc.type | Conference Contribution - Paper in Published Proceedings | en_NZ |
dc.rights.holder | IEEE Communications Society | en_NZ |
dc.subject.marsden | 200402 Computational Linguistics | en_NZ |
dc.identifier.bibliographicCitation | Sharifzadeh, H. R., HajiRassouliha, A., McLoughlin, I. V., Ardekani, I. T., & Allen, J. E. (2015, December). Phonated Speech Reconstruction Using Twin Mapping Models. In IEEE (Ed.), Proceedings of the 15th IEEE International Symposium on Signal Processing and Information Technology (pp. 1-6). | en_NZ |
unitec.institution | Unitec Institute of Technology | en_NZ |
unitec.institution | University of Kent (Kent, United Kingdom) | en_NZ |
unitec.institution | North Shore Hospital (Auckland, N.Z.) | en_NZ |
unitec.publication.title | Proceedings 15th IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2015) | en_NZ |
unitec.conference.title | 15th IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2015) | en_NZ |
unitec.conference.org | IEEE Signal Processing Society | en_NZ |
unitec.conference.org | IEEE Computer Society | en_NZ |
unitec.conference.location | Abu Dhabi (United Arab Emirates) | en_NZ |
unitec.conference.sdate | 2015-12-07 | |
unitec.conference.edate | 2015-12-10 | |
unitec.peerreviewed | yes | en_NZ |
dc.contributor.affiliation | Unitec Institute of Technology | en_NZ |
dc.contributor.affiliation | University of Kent | en_NZ |
dc.contributor.affiliation | North Shore Hospital (Auckland, N.Z.) | en_NZ |
unitec.identifier.roms | 58296 | en_NZ |
unitec.identifier.roms | 58529 | |
unitec.institution.studyarea | Computing | |