Until very recently, the generation of punctuation marks for automatic speech recognition (ASR) output has mostly been done by looking at the syntactic structure of the recognized utterances. Prosodic cues such as pauses, speech rate, and pitch intonation, which influence the placement of punctuation marks in speech transcripts, have seldom been used. We propose a method that uses recurrent neural networks, taking both prosodic and lexical information into account, to predict punctuation marks for raw ASR output. Our experiments show that an attention mechanism over parallel sequences of prosodic cues aligned with the transcribed speech improves the accuracy of punctuation generation.
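The abstract does not specify the exact attention formulation, so the following is only a minimal NumPy sketch of one plausible reading: dot-product attention in which a lexical hidden state scores a parallel sequence of prosodic frame vectors and pools them into a context vector. The dimensions, the scoring function, and the function names (`attend`, `softmax`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(lexical_state, prosodic_states):
    # Dot-product attention (an assumption, not necessarily the paper's choice):
    # score each prosodic frame against the current lexical hidden state,
    # then pool the frames into one context vector with the softmax weights.
    scores = prosodic_states @ lexical_state      # (T,)
    weights = softmax(scores)                     # (T,), sums to 1
    context = weights @ prosodic_states           # (d,)
    return context, weights

rng = np.random.default_rng(0)
d, T = 8, 5                                       # hidden size, prosodic frames per word (illustrative)
lexical_state = rng.normal(size=d)                # e.g. an RNN state over the word sequence
prosodic_states = rng.normal(size=(T, d))         # encoded pause/rate/pitch features
context, weights = attend(lexical_state, prosodic_states)
```

The resulting `context` vector could then be concatenated with the lexical state before the punctuation-class softmax; that fusion step is likewise an assumption about how such parallel streams are typically combined.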
CITATION STYLE
Öktem, A., Farrús, M., & Wanner, L. (2017). Attentional parallel RNNs for generating punctuation in transcribed speech. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10583 LNAI, pp. 131–142). Springer Verlag. https://doi.org/10.1007/978-3-319-68456-7_11