Using paraphrasing and memory-augmented models to combat data sparsity in question interpretation with a virtual patient dialogue system

Lifeng Jin; David King; Amad Hussein; Michael White; Douglas Danforth

Conference Proceedings, Open Access

Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2018 at the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018 (2018), pp. 13–23

DOI: 10.18653/v1/w18-0502

Abstract

When interpreting questions in a virtual patient dialogue system, one must inevitably tackle the challenge of a long tail of relatively infrequently asked questions. To make progress on this challenge, we investigate the use of paraphrasing for data augmentation and neural memory-based classification, finding that the two methods work best in combination. In particular, we find that the neural memory-based approach not only outperforms a straight CNN classifier on low-frequency questions, but also takes better advantage of the augmented data created by paraphrasing, together yielding a nearly 10% absolute improvement in accuracy on the least frequently asked questions.
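As a rough illustration of paraphrasing for data augmentation on rarely asked questions, the sketch below generates variants of a question by swapping words for synonyms and adds them only for low-frequency labels. It is not the authors' method: the synonym table `SYNONYMS`, the labels, the example questions, and the `max_count` threshold are all hypothetical, and the abstract does not say which paraphrase resources or thresholds were actually used.

```python
from collections import Counter
from itertools import product

# Hypothetical toy synonym table (an assumption for illustration only).
SYNONYMS = {
    "pain": ["ache", "discomfort"],
    "medication": ["medicine", "drugs"],
}

def paraphrases(question):
    """Return all variants of `question` made by swapping words for listed synonyms."""
    options = [[word] + SYNONYMS.get(word, []) for word in question.split()]
    variants = {" ".join(choice) for choice in product(*options)}
    variants.discard(question)  # keep only genuinely new wordings
    return sorted(variants)

def augment(dataset, max_count=1):
    """Add paraphrases for examples whose label occurs at most `max_count` times."""
    counts = Counter(label for _, label in dataset)
    extra = [
        (variant, label)
        for text, label in dataset
        if counts[label] <= max_count
        for variant in paraphrases(text)
    ]
    return dataset + extra

# Hypothetical labeled questions; labels are invented for this sketch.
data = [
    ("where is the pain", "pain_location"),
    ("where does it hurt", "pain_location"),
    ("do you take any medication", "current_meds"),
]
augmented = augment(data)
```

Here only the `current_meds` question is paraphrased, since it is the only label that appears once; a classifier would then be trained on `augmented` rather than `data`.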

Cite

Jin, L., King, D., Hussein, A., White, M., & Danforth, D. (2018). Using paraphrasing and memory-augmented models to combat data sparsity in question interpretation with a virtual patient dialogue system. In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2018 at the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018 (pp. 13–23). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w18-0502
