A key bottleneck for developing dialog models is the lack of adequate training data. Due to privacy issues, dialog data is even scarcer in the health domain. We propose a novel method for creating dialog corpora, which we apply to create doctor-patient interaction data. We use this data to train both a generation model and a hybrid classification/retrieval model, and find that the generation model consistently outperforms the hybrid model. We show that our data creation method has several advantages: not only does it allow for the semi-automatic creation of large quantities of training data, it also provides a natural way of guiding learning and a novel method for assessing the quality of human-machine interactions.
CITATION STYLE
Liednikova, A., Jolivet, P., Durand-Salmon, A., & Gardent, C. (2020). Learning Health-Bots from Training Data that was Automatically Created using Paraphrase Detection and Expert Knowledge. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 638–648). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.55