Toward a better understanding of deep neural network based acoustic modelling: An empirical investigation

Abstract

Recently, deep neural networks (DNNs) have outperformed traditional acoustic models on a variety of speech recognition benchmarks. However, although a tremendous breadth and depth of related work has been established, system differences across research groups make it hard to assess from the literature alone how much a particular architectural variant improves performance when building DNN acoustic models. Our work aims to uncover which variations among baseline systems are most relevant for automatic speech recognition (ASR) performance via a series of systematic tests on the limits of the major architectural choices. By holding all other components fixed, we are able to explore the design and training decisions without being confounded by other influencing factors. Our experimental results suggest that a relatively simple DNN architecture and optimization technique produces strong results. These findings, along with previous work, not only help build a better understanding of why DNN acoustic models perform well and how they might be improved, but also help establish a set of best practices for new speech corpora and language understanding task variants.

Cite

Wang, X., Wang, L., Chen, J., & Wu, L. (2016). Toward a better understanding of deep neural network based acoustic modelling: An empirical investigation. In 30th AAAI Conference on Artificial Intelligence, AAAI 2016 (pp. 2173–2179). AAAI Press. https://doi.org/10.1609/aaai.v30i1.10256
