Abstract
We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken language understanding tasks. We compare this method against a range of augmentation techniques, encompassing generative models such as VAEs and performance-boosting techniques such as synonym replacement and back-translation. We show that our method performs strongly on domain and intent classification tasks for a voice assistant and in a user study focused on utterance naturalness and semantic similarity.
Cite
Yerukola, A., Bretan, M., & Jin, H. (2021). Data augmentation for voice-assistant NLU using BERT-based interchangeable rephrase. In EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 1852–1860). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.eacl-main.159