An analysis of the RNN-based spoken term detection training

Abstract

This paper studies the training process of the recurrent neural networks used in the spoken term detection (STD) task. The method used in the paper employs two jointly trained Siamese networks using unsupervised data. The grapheme representation of a searched term and the phoneme realization of a putative hit are projected into the pronunciation embedding space using these networks. The score is estimated as the relative distance of these embeddings. The paper studies the influence of different loss functions, the amount of unsupervised data, and the meta-parameters on the performance of the STD system.
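The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture, the distance measure, or how the distance is turned into a score. The toy Elman RNN, the random untrained weights, the cosine distance, the vocabulary sizes, and the function names `make_encoder`, `embed`, and `score` are all assumptions made for illustration only.

```python
import numpy as np

def make_encoder(vocab_size, hidden_size, seed):
    """Return random (untrained) weights for a toy Elman RNN encoder (hypothetical)."""
    rng = np.random.default_rng(seed)
    return {
        "E": rng.normal(scale=0.1, size=(vocab_size, hidden_size)),   # symbol embeddings
        "W": rng.normal(scale=0.1, size=(hidden_size, hidden_size)),  # recurrent weights
    }

def embed(encoder, symbol_ids):
    """Run the RNN over a symbol sequence and return a unit-length embedding."""
    h = np.zeros(encoder["W"].shape[0])
    for i in symbol_ids:
        h = np.tanh(encoder["E"][i] + encoder["W"] @ h)
    return h / (np.linalg.norm(h) + 1e-12)

def score(term_vec, hit_vec):
    """Map cosine distance in [0, 2] to a score in [0, 1] (an assumed scoring rule)."""
    cosine_distance = 1.0 - float(term_vec @ hit_vec)
    return 1.0 - cosine_distance / 2.0

# Two separate encoders that share one embedding space: one reads graphemes
# of the searched term, the other reads phonemes of a putative hit.
grapheme_encoder = make_encoder(vocab_size=30, hidden_size=16, seed=0)
phoneme_encoder = make_encoder(vocab_size=45, hidden_size=16, seed=1)

term = embed(grapheme_encoder, [3, 1, 20])   # made-up grapheme ids
hit = embed(phoneme_encoder, [11, 0, 37])    # made-up phoneme ids
s = score(term, hit)
```

With untrained weights the score carries no meaning. In the setting the abstract describes, the two networks would be trained jointly so that a term and its matching pronunciation land close together in the embedding space.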

Cite

Švec, J., Šmídl, L., & Psutka, J. V. (2017). An analysis of the RNN-based spoken term detection training. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 119–129). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_11
