A survey of recent DNN architectures on the TIMIT phone recognition task

Josef Michálek; Jan Vaněk

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11107 LNAI 436-444

DOI: 10.1007/978-3-030-00794-2_47

Abstract

In this survey paper, we evaluate several recent deep neural network (DNN) architectures on the TIMIT phone recognition task. We chose the TIMIT corpus for its popularity and broad availability in the community. It also simulates a low-resource scenario, which is useful for minor languages. We prefer the phone recognition task because it is much more sensitive to acoustic model quality than a large vocabulary continuous speech recognition (LVCSR) task. In recent years, many published DNN papers have reported results on TIMIT. However, the reported phone error rates (PERs) were often much higher than the PER of a simple feed-forward (FF) DNN. That was the main motivation for this paper: to provide baseline DNNs with the lowest possible PERs, together with open-source scripts that make the baseline results easy to replicate in future papers. To our knowledge, the best PER achieved in this survey is better than the best published PER to date.

Citation (APA)

Michálek, J., & Vaněk, J. (2018). A survey of recent DNN architectures on the TIMIT phone recognition task. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11107 LNAI, pp. 436–444). Springer Verlag. https://doi.org/10.1007/978-3-030-00794-2_47
