Virtual assistants can be quite literal at times. If a user says "tell Bob I love him", most virtual assistants will extract the message "I love him" and send it to the user’s contact named Bob, rather than properly converting the message to "I love you". We designed a system that takes a voice message from one user, converts the point of view of the message, and then delivers the result to its target user. We developed a rule-based model, which integrates a linear text classification model, part-of-speech tagging, and constituency parsing with rule-based transformation methods. We also investigated Neural Machine Translation (NMT) approaches, including traditional recurrent networks, CopyNet, and T5. We explored five metrics to gauge both naturalness and faithfulness automatically, and we chose BLEU plus METEOR for faithfulness and, for naturalness, relative perplexity computed with a separately trained language model (GPT). Transformer-CopyNet and T5 performed similarly on faithfulness metrics, with T5 scoring 63.8 for BLEU and 83.0 for METEOR. CopyNet was the most natural, with a relative perplexity of 1.59. CopyNet also has 37 times fewer parameters than T5. We have publicly released our dataset, which is composed of 46,565 crowd-sourced samples.
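To make the task concrete, the following is a minimal toy sketch of point-of-view conversion for the "tell Bob I love him" example. It is not the rule-based model described above, which relies on a text classifier, part-of-speech tagging, and constituency parsing. The function name `convert_point_of_view`, the regular expression, and the word tables are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern: "tell <contact> [that] <message>". Only this one
# carrier phrase is handled in this sketch.
COMMAND = re.compile(r"^tell\s+(\w+)\s+(?:that\s+)?(.+)$", re.IGNORECASE)

# Toy assumption: every third-person pronoun refers to the recipient.
# "her" is mapped as an object pronoun; possessive "her" ("her book")
# would be converted wrongly, which is one reason POS tagging is needed.
PRONOUNS = {
    "he": "you", "she": "you",
    "him": "you", "her": "you",
    "his": "your",
    "himself": "yourself", "herself": "yourself",
}

# Verb agreement fixes applied directly after a converted subject pronoun.
AGREEMENT = {"is": "are", "was": "were", "has": "have", "does": "do"}


def convert_point_of_view(utterance):
    """Return (contact, converted_message), or None if the pattern fails."""
    match = COMMAND.match(utterance.strip())
    if match is None:
        return None
    contact, message = match.group(1), match.group(2)

    converted = []
    after_subject = False  # True right after "he"/"she" became "you"
    for token in message.split():
        lower = token.lower()
        if lower in PRONOUNS:
            converted.append(PRONOUNS[lower])
            after_subject = lower in ("he", "she")
        elif after_subject and lower in AGREEMENT:
            converted.append(AGREEMENT[lower])
            after_subject = False
        else:
            converted.append(token)
            after_subject = False
    return contact, " ".join(converted)


if __name__ == "__main__":
    print(convert_point_of_view("tell Bob I love him"))
    print(convert_point_of_view("tell Bob that he is late"))
```

A word-level table like this breaks as soon as a pronoun refers to a third party or a word is ambiguous between parts of speech, which illustrates why the approaches above use syntactic analysis or learned sequence-to-sequence models.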
Lee, I. G., Zu, V., Buddi, S. S., Liang, D., Kulkarni, P., & FitzGerald, J. G. M. (2020). Converting the point of view of messages spoken to virtual assistants. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 154–163). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.findings-emnlp.15