In this paper, we review publicly available emotional speech datasets and their usability for state-of-the-art speech synthesis. Usability is conditioned by several characteristics of these datasets: the quality of the recordings, the quantity of data, and the emotional content captured in the data. We then present a dataset that was recorded in response to the needs observed in this area. It contains data from male and female actors in English and from a male actor in French. The database covers five emotion classes, so it could be suitable for building synthesis and voice transformation systems with the potential to control the emotional dimension.
Tits, N., El Haddad, K., & Dutoit, T. (2020). Emotional speech datasets for English speech synthesis purpose: A review. In Advances in Intelligent Systems and Computing (Vol. 1037, pp. 61–66). Springer Verlag. https://doi.org/10.1007/978-3-030-29516-5_6