Disfluency correction using unsupervised and semi-supervised learning

Nikhil Saini; Drumil Trivedi; Shreya Khare; Tejas I. Dhamecha; Preethi Jyothi; Samarth Bharadwaj; Pushpak Bhattacharyya

Conference Proceedings (Open Access)

EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (2021) 3421-3427

DOI: 10.18653/v1/2021.eacl-main.299

5 Citations

58 Readers

Abstract

Spoken language differs from written language in style and structure. Disfluencies in transcriptions produced by speech recognition systems generally hamper the performance of downstream NLP tasks, so a disfluency correction system that converts disfluent text to fluent text is of great value. This paper introduces a disfluency correction model that translates disfluent text to fluent text, drawing inspiration from recent encoder-decoder unsupervised style-transfer models for text. We also show considerable performance gains from using a small sample of 500 parallel disfluent-fluent sentences in a semi-supervised way. Our unsupervised approach achieves a BLEU score of 79.39 on the Switchboard corpus test set, which improves to 85.28 with semi-supervision. Both scores are comparable to those of two competitive fully-supervised models.
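
To make the reported metric concrete, the following is a minimal sketch of how a BLEU score could be computed between system output and fluent references. It is not the authors' evaluation script: the whitespace tokenization, the unsmoothed 4-gram corpus-level BLEU variant, the function names `ngrams` and `corpus_bleu`, and the example sentences are all assumptions made for illustration, and the sentences are not taken from Switchboard.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus-level BLEU on a 0-100 scale.

    Assumes one reference per hypothesis and whitespace tokenization
    (illustrative choices, not the paper's setup).
    """
    matches = [0] * max_n   # clipped n-gram matches per order
    totals = [0] * max_n    # hypothesis n-gram counts per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_precision)


# Hypothetical example pair, not drawn from the Switchboard corpus.
disfluent = "i want a flight to boston uh i mean to denver"
fluent_reference = "i want a flight to denver"

uncorrected_score = corpus_bleu([disfluent], [fluent_reference])
perfect_score = corpus_bleu([fluent_reference], [fluent_reference])
print(f"uncorrected input: {uncorrected_score:.2f}")
print(f"perfect correction: {perfect_score:.2f}")
```

In this sketch, leaving the disfluent input uncorrected scores well below a perfect correction, because the filler and the repaired phrase add n-grams that do not appear in the fluent reference. Scores from a sketch like this are not directly comparable to the figures in the abstract, since BLEU values depend on tokenization and on the exact variant used.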

Cite

Saini, N., Trivedi, D., Khare, S., Dhamecha, T. I., Jyothi, P., Bharadwaj, S., & Bhattacharyya, P. (2021). Disfluency correction using unsupervised and semi-supervised learning. In EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 3421–3427). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.eacl-main.299
