Convolutional neural network for refinement of speaker adaptation transformation

Abstract

This work proposes a refinement of shift-MLLR (shift Maximum Likelihood Linear Regression) adaptation of an acoustic model for the case where only a limited amount of adaptation data is available, which can lead to ill-conditioned transformation matrices. We try to suppress the influence of badly estimated transformation parameters using an Artificial Neural Network (ANN), in particular a Convolutional Neural Network (CNN) with a bottleneck layer at the end. The badly estimated shift-MLLR transformation is propagated through an ANN that has been suitably trained beforehand, and the output of the network is used as the new, refined transformation. To train the ANN, badly conditioned shift-MLLR transformations are used as its inputs and well-conditioned transformations as its target outputs.
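
The training setup described in the abstract (badly estimated transformations as inputs, well-estimated ones as targets) can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses a CNN with a bottleneck layer, whereas here a plain linear least-squares map stands in for the network. The data are synthetic, and the sizes, the noise model, and the names `good`, `bad`, and `refine` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 200 training speakers, 12-dimensional shift vectors.
n_speakers, dim, latent_dim = 200, 12, 3

# Synthetic "well-estimated" shifts, assumed to lie in a low-dimensional subspace.
basis = rng.normal(size=(latent_dim, dim))
good = rng.normal(size=(n_speakers, latent_dim)) @ basis

# Synthetic "badly estimated" shifts: the same vectors plus estimation noise.
bad = good + rng.normal(scale=1.0, size=good.shape)

# Linear stand-in for the network: least-squares map (with bias) from bad to good.
X = np.hstack([bad, np.ones((n_speakers, 1))])
W, *_ = np.linalg.lstsq(X, good, rcond=None)


def refine(shift):
    """Map badly estimated shift vector(s) to refined one(s)."""
    shift = np.atleast_2d(shift)
    return np.hstack([shift, np.ones((shift.shape[0], 1))]) @ W


# Compare the training error of the raw and the refined shifts.
err_before = np.mean((bad - good) ** 2)
err_after = np.mean((refine(bad) - good) ** 2)
print(f"training MSE before: {err_before:.3f}, after: {err_after:.3f}")
```

On the training pairs the refined error cannot exceed the raw error, because the identity map is one of the candidates the least-squares fit chooses from. Whether the refinement helps on unseen speakers depends on the model and the data, which this sketch does not address.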

Citation (APA)

Zajíc, Z., Zelinka, J., Vaněk, J., & Müller, L. (2014). Convolutional neural network for refinement of speaker adaptation transformation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8773, pp. 161–168). Springer Verlag. https://doi.org/10.1007/978-3-319-11581-8_20
