An analysis of the RNN-based spoken term detection training

Jan Švec; Luboš Šmídl; Josef V. Psutka

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017), Vol. 10458 LNAI, pp. 119–129

DOI: 10.1007/978-3-319-66429-3_11

Abstract

This paper studies the training process of the recurrent neural networks used in the spoken term detection (STD) task. The method used in the paper employs two jointly trained Siamese networks using unsupervised data. These networks project the grapheme representation of a searched term and the phoneme realization of a putative hit into a shared pronunciation embedding space. The score is estimated as the relative distance between these embeddings. The paper studies the influence of different loss functions, the amount of unsupervised data, and the meta-parameters on the performance of the STD system.
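
To make the scoring idea concrete, the following is a minimal sketch of how a score could be derived from the distance between two embeddings. It is not the authors' implementation: the abstract does not say which distance or which loss functions are used, so the cosine distance, the mapping to a score in [0, 1], and the contrastive loss below are all assumptions, and the function names are hypothetical. The embeddings are plain lists of floats standing in for the outputs of the two networks.

```python
import math


def cosine_distance(u, v):
    """Cosine distance in [0, 2]; 0 means the vectors point the same way.

    Assumption: the abstract only says "relative distance", so cosine
    distance is used here purely for illustration.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def hit_score(term_embedding, hit_embedding):
    """Map the distance to a score in [0, 1]; higher means a closer match.

    term_embedding: stand-in for the embedding of the term's graphemes.
    hit_embedding: stand-in for the embedding of the hit's phonemes.
    """
    return 1.0 - cosine_distance(term_embedding, hit_embedding) / 2.0


def contrastive_loss(distance, is_match, margin=1.0):
    """A common Siamese-network loss, shown as one possible choice.

    Matching pairs are pulled together; non-matching pairs are pushed
    apart until they are at least `margin` away from each other.
    """
    if is_match:
        return distance ** 2
    return max(0.0, margin - distance) ** 2
```

In such a setup, a putative hit would be accepted when `hit_score` exceeds a tuned threshold, and a loss of this kind would be averaged over matching and non-matching pairs during training.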

Citation (APA)

Švec, J., Šmídl, L., & Psutka, J. V. (2017). An analysis of the RNN-based spoken term detection training. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 119–129). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_11
