Semantic role labeling of speech transcripts without sentence boundaries

Abstract

Speech data is an extremely rich and important source of information. However, we lack suitable methods for the semantic annotation of speech data. For instance, semantic role labeling (SRL) of speech that has been transcribed by an automated speech recognition (ASR) system is still an unsolved problem. SRL of ASR data is difficult because sentence boundaries and punctuation are absent, and because the transcripts contain grammar errors, wrongly transcribed words, and word deletions and insertions. In this paper we propose a novel approach to SRL of ASR data based on the following ideas: (1) train the SRL system on data segmented into frames, where each frame consists of a predicate and its semantic roles, without considering sentence boundaries; (2) label each frame with PropBank semantic roles; and, to assist the above, (3) train a part-of-speech (POS) tagger to work on noisy and error-prone ASR data. Experiments with the OntoNotes corpus show improvements over state-of-the-art SRL applied to ASR data.
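The frame segmentation in step (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Frame` class, the `build_frames` function, the annotation format (predicate index mapped to `(start, end, role)` spans) and the example role labels are all hypothetical and hand-made for illustration. It only assumes what the abstract states, namely that a frame groups a predicate with its semantic roles and ignores sentence boundaries.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical annotation format: (start, end_exclusive, role_label)
Span = Tuple[int, int, str]


@dataclass
class Frame:
    predicate_index: int  # position of the predicate in the token stream
    start: int            # index of the first token covered by the frame
    tokens: List[str]     # tokens from the first to the last role of the frame
    labels: List[str]     # one label per token; "O" means no role


def build_frames(tokens: List[str], annotations: Dict[int, List[Span]]) -> List[Frame]:
    """Cut an unsegmented token stream into one frame per predicate.

    A frame spans from the leftmost to the rightmost token that is either
    the predicate or part of one of its role spans. No sentence boundary
    information is used.
    """
    frames = []
    for pred_idx in sorted(annotations):
        spans = annotations[pred_idx]
        start = min([pred_idx] + [s for s, _, _ in spans])
        end = max([pred_idx + 1] + [e for _, e, _ in spans])
        labels = ["O"] * (end - start)
        labels[pred_idx - start] = "V"  # mark the predicate itself
        for s, e, role in spans:
            for i in range(s, e):
                labels[i - start] = role
        frames.append(Frame(pred_idx, start, tokens[start:end], labels))
    return frames


# An ASR-like stream: lowercase, no punctuation, no sentence breaks.
tokens = "so yesterday john bought a car he drove it home".split()

# Hand-made role annotations for the predicates "bought" (3) and "drove" (7).
annotations = {
    3: [(1, 2, "ARGM-TMP"), (2, 3, "ARG0"), (4, 6, "ARG1")],
    7: [(6, 7, "ARG0"), (8, 9, "ARG1"), (9, 10, "ARGM-DIR")],
}

frames = build_frames(tokens, annotations)
for frame in frames:
    print(list(zip(frame.tokens, frame.labels)))
```

In this sketch the two frames cover "yesterday john bought a car" and "he drove it home", while the filler word "so" belongs to neither. Frames of this kind could serve as training units in place of sentences; how the paper actually derives and delimits its frames is not specified in the abstract.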

Cite

Shrestha, N., & Moens, M. F. (2018). Semantic role labeling of speech transcripts without sentence boundaries. In Lecture Notes in Computer Science (Vol. 11107 LNAI, pp. 379–387). Springer Verlag. https://doi.org/10.1007/978-3-030-00794-2_41
