This paper describes an application of Gaussian process regression (GPR) to parametric speech synthesis. Because GPR is a nonparametric Bayesian regression method, it predicts synthetic speech parameters directly from exemplars of the training speech data, without compressing the acoustic features of the training data into an overly small number of model parameters. However, GPR is inherently expensive in both computation and memory. To alleviate this problem, we incorporate local and global sparse Gaussian process approximations into the statistical speech synthesis framework and investigate, through experiments, the trade-off between computational cost and speech synthesis performance. We also examine how to choose the pseudo data set used for the sparse GP approximation.
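The cost trade-off described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the squared-exponential kernel, the hyperparameter values, the toy sine data, the block partitioning rule, and the function names (`rbf_kernel`, `gp_predict_exact`, `gp_predict_sparse`, `gp_predict_local`) are all assumptions made for illustration. The global variant uses the standard subset-of-regressors predictive mean with pseudo inputs, and the local variant simply fits an independent GP per block of training data.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict_exact(x_train, y_train, x_test, noise_var=1e-2):
    """Exact GP predictive mean; cost grows as O(n^3) in the training size n."""
    k_nn = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    chol = np.linalg.cholesky(k_nn)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return rbf_kernel(x_test, x_train) @ alpha

def gp_predict_sparse(x_train, y_train, x_test, x_pseudo, noise_var=1e-2, jitter=1e-6):
    """Global sparse mean (subset of regressors) with m pseudo inputs; O(n m^2)."""
    k_mm = rbf_kernel(x_pseudo, x_pseudo)
    k_mn = rbf_kernel(x_pseudo, x_train)
    # mean = K_sm (noise_var * K_mm + K_mn K_nm)^{-1} K_mn y
    a = noise_var * k_mm + k_mn @ k_mn.T + jitter * np.eye(len(x_pseudo))
    return rbf_kernel(x_test, x_pseudo) @ np.linalg.solve(a, k_mn @ y_train)

def gp_predict_local(x_train, y_train, x_test, n_blocks=4, noise_var=1e-2):
    """Local approximation: one independent exact GP per block of training data."""
    # Assumption: blocks are formed by sorting on the first input dimension.
    blocks = np.array_split(np.argsort(x_train[:, 0]), n_blocks)
    centers = np.stack([x_train[b].mean(axis=0) for b in blocks])
    dists = np.linalg.norm(x_test[:, None, :] - centers[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    mean = np.empty(len(x_test))
    for k, b in enumerate(blocks):
        mask = nearest == k
        if mask.any():
            mean[mask] = gp_predict_exact(x_train[b], y_train[b], x_test[mask], noise_var)
    return mean

# Toy 1-D data standing in for (context feature, acoustic feature) pairs.
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 6.0, size=(200, 1)), axis=0)
y_train = np.sin(x_train[:, 0]) + 0.05 * rng.standard_normal(200)
x_test = np.linspace(0.5, 5.5, 50)[:, None]
x_pseudo = np.linspace(0.0, 6.0, 10)[:, None]  # hypothetical pseudo data set

mean_exact = gp_predict_exact(x_train, y_train, x_test)
mean_sparse = gp_predict_sparse(x_train, y_train, x_test, x_pseudo)
mean_local = gp_predict_local(x_train, y_train, x_test)
```

In this sketch the pseudo inputs are placed on a regular grid purely for convenience; how the pseudo data set should be chosen is one of the questions the paper itself examines.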
CITATION
Koriyama, T., Nose, T., & Kobayashi, T. (2014). Parametric speech synthesis using local and global sparse Gaussian processes. In IEEE International Workshop on Machine Learning for Signal Processing, MLSP. IEEE Computer Society. https://doi.org/10.1109/MLSP.2014.6958921