Semantic role labeling of speech transcripts without sentence boundaries

Niraj Shrestha; Marie Francine Moens

Conference Proceedings

Lecture Notes in Computer Science (2018) 11107 LNAI 379-387

DOI: 10.1007/978-3-030-00794-2_41

Abstract

Speech data is an extremely rich and important source of information. However, we lack suitable methods for the semantic annotation of speech data. For instance, semantic role labeling (SRL) of speech that has been transcribed by an automated speech recognition (ASR) system is still an unsolved problem. SRL of ASR data is difficult because the transcripts lack sentence boundaries and punctuation and contain grammar errors, wrongly transcribed words, and word deletions and insertions. In this paper we propose a novel approach to SRL of ASR data based on the following ideas: (1) train the SRL system on data segmented into frames, where each frame consists of a predicate and its semantic roles, without considering sentence boundaries; (2) label the data with the semantics of PropBank roles; and, to support both steps, (3) train a part-of-speech (POS) tagger that works on noisy and error-prone ASR data. Experiments with the OntoNotes corpus show improvements over state-of-the-art SRL applied to ASR data.
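
The frame-based segmentation in step (1) can be illustrated with a small sketch. This is not the authors' implementation: the `Frame` class, the helper names, the choice of the smallest contiguous token window as the frame extent, and the BIO label encoding are all assumptions made for illustration. The example transcript and its role spans are invented, using PropBank-style labels.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    """Hypothetical container: one predicate plus its role spans."""
    predicate: int                     # token index of the predicate in the stream
    roles: List[Tuple[str, int, int]]  # (label, start, end), end exclusive


def frame_window(frame: Frame) -> Tuple[int, int]:
    """Smallest [start, end) token window covering the predicate and its roles."""
    start = min([frame.predicate] + [s for _, s, _ in frame.roles])
    end = max([frame.predicate + 1] + [e for _, _, e in frame.roles])
    return start, end


def frame_to_bio(tokens: List[str], frame: Frame) -> List[Tuple[str, str]]:
    """Cut the frame out of an unsegmented token stream and tag it in BIO style."""
    start, end = frame_window(frame)
    labels = ["O"] * (end - start)
    for label, s, e in frame.roles:
        labels[s - start] = "B-" + label
        for i in range(s + 1, e):
            labels[i - start] = "I-" + label
    labels[frame.predicate - start] = "B-V"  # mark the predicate last
    return list(zip(tokens[start:end], labels))


# Invented ASR-like stream: lowercase, no punctuation, no sentence boundaries.
tokens = "i think john gave mary a book yesterday and she liked it".split()

# Assumed annotation for the predicate "gave" (token index 3).
gave = Frame(
    predicate=3,
    roles=[("ARG0", 2, 3), ("ARG2", 4, 5), ("ARG1", 5, 7), ("ARGM-TMP", 7, 8)],
)

example = frame_to_bio(tokens, gave)
for token, label in example:
    print(token, label)
```

In this sketch a training instance is bounded by the predicate and its arguments rather than by a sentence, so the same extraction works on a transcript that carries no boundary markers at all.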

Cite

Shrestha, N., & Moens, M. F. (2018). Semantic role labeling of speech transcripts without sentence boundaries. In Lecture Notes in Computer Science (Vol. 11107 LNAI, pp. 379–387). Springer Verlag. https://doi.org/10.1007/978-3-030-00794-2_41
