Self-attentive neural syntactic parsers using contextualized word embeddings (e.g., ELMo or BERT) currently produce state-of-the-art results in joint parsing and disfluency detection in speech transcripts. Since the contextualized word embeddings are pre-trained on a large amount of unlabeled data, using additional unlabeled data to train a neural model might seem redundant. However, we show that self-training, a semi-supervised technique for incorporating unlabeled data, sets a new state-of-the-art for the self-attentive parser on disfluency detection, demonstrating that self-training provides benefits orthogonal to the pre-trained contextualized word representations. We also show that ensembling self-trained parsers provides further gains for disfluency detection.
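For readers unfamiliar with the technique, the sketch below illustrates generic self-training and a simple ensemble combination. It is an assumption-laden illustration, not the authors' implementation: `train_fn`, `predict_fn`, the confidence threshold, and per-token majority voting are placeholders standing in for training and decoding with the self-attentive parser described in the paper.

```python
# Minimal sketch of self-training and ensembling for a token-level disfluency
# labeller. All names here (train_fn, predict_fn, threshold, majority vote)
# are illustrative assumptions, not the paper's actual procedure.
from collections import Counter
from typing import Callable, List, Tuple

Sentence = List[str]   # tokenised utterance
Labels = List[str]     # per-token disfluency tags

def self_train(
    labelled: List[Tuple[Sentence, Labels]],
    unlabelled: List[Sentence],
    train_fn: Callable[[List[Tuple[Sentence, Labels]]], object],
    predict_fn: Callable[[object, Sentence], Tuple[Labels, float]],
    rounds: int = 1,
    threshold: float = 0.9,
) -> object:
    """Pseudo-label unlabelled sentences with the current model, then retrain."""
    model = train_fn(labelled)                       # fit on gold-labelled data
    for _ in range(rounds):
        pseudo = []
        for sent in unlabelled:
            labels, confidence = predict_fn(model, sent)
            if confidence >= threshold:              # keep only confident pseudo-labels
                pseudo.append((sent, labels))
        model = train_fn(labelled + pseudo)          # retrain on gold + pseudo-labelled data
    return model

def ensemble_predict(
    models: List[object],
    sent: Sentence,
    predict_fn: Callable[[object, Sentence], Tuple[Labels, float]],
) -> Labels:
    """Combine several self-trained models by per-token majority vote
    (one simple ensembling choice; the paper's combination may differ)."""
    votes = [predict_fn(m, sent)[0] for m in models]
    return [Counter(column).most_common(1)[0][0] for column in zip(*votes)]
```

The key point the abstract makes is that the pseudo-labelling loop above adds value even though the underlying word representations were already pre-trained on large unlabeled corpora.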
Citation
Lou, P. J., & Johnson, M. (2020). Improving disfluency detection by self-training a self-attentive model. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 3754–3763). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.346