Combining continuous word representation and prosodic features for ASR error prediction

Sahar Ghannay; Yannick Estève; Nathalie Camelin; Camille Dutrey; Fabian Santiago; Martine Adda-Decker

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9449 84-95

DOI: 10.1007/978-3-319-25789-1_9

Abstract

Recent advances in continuous word representation have been successfully applied to several natural language processing tasks. This paper focuses on error prediction in Automatic Speech Recognition (ASR) outputs and investigates the use of continuous word representations (word embeddings) within a neural network architecture. The main contribution concerns the combination of word embeddings: several combination approaches are proposed in order to take advantage of their complementarity. The use of prosodic features, in addition to classical syntactic ones, is also evaluated. Experiments are carried out on automatic transcriptions generated by the LIUM ASR system applied to the ETAPE corpus. They show that the proposed neural architecture, using an effective combination of continuous word representations together with prosodic features as additional features, significantly outperforms a state-of-the-art approach based on Conditional Random Fields. Finally, the proposed system produces a well-calibrated confidence measure, evaluated in terms of Normalized Cross Entropy.
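
The abstract evaluates the confidence measure in terms of Normalized Cross Entropy (NCE) but does not spell out the computation. The following is a minimal sketch of the commonly used definition, NCE = (H_max - H_conf) / H_max, where H_max is the entropy of a baseline that assigns every word the overall proportion of correct words as its confidence. It is not taken from the paper: the function name, argument names, and the clipping constant `eps` are assumptions made for illustration.

```python
import math


def normalized_cross_entropy(confidences, is_correct, eps=1e-12):
    """Illustrative NCE of per-word confidence scores.

    confidences: predicted probability that each word is correct (0..1).
    is_correct:  booleans, True when the recognized word is correct.
    eps:         assumed clipping constant that keeps log2 finite.
    """
    if not confidences or len(confidences) != len(is_correct):
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(is_correct)
    n_correct = sum(1 for ok in is_correct if ok)
    if n_correct == 0 or n_correct == n:
        raise ValueError("NCE is undefined when every word has the same label")

    # Baseline entropy: every word receives the prior p_c as its confidence.
    p_c = n_correct / n
    h_max = -(n_correct * math.log2(p_c)
              + (n - n_correct) * math.log2(1.0 - p_c))

    # Entropy of the evaluated confidence scores.
    h_conf = 0.0
    for conf, ok in zip(confidences, is_correct):
        conf = min(max(conf, eps), 1.0 - eps)
        h_conf -= math.log2(conf if ok else 1.0 - conf)

    return (h_max - h_conf) / h_max
```

Under this definition, a score of 0 means the confidences carry no more information than the prior, values approaching 1 indicate confidences that closely match the actual correctness of each word, and negative values indicate confidences that are worse than the prior.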

Citation (APA)

Ghannay, S., Estève, Y., Camelin, N., Dutrey, C., Santiago, F., & Adda-Decker, M. (2015). Combining continuous word representation and prosodic features for ASR error prediction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9449, pp. 84–95). Springer Verlag. https://doi.org/10.1007/978-3-319-25789-1_9
