Using paraphrasing and memory-augmented models to combat data sparsity in question interpretation with a virtual patient dialogue system

Abstract

When interpreting questions in a virtual patient dialogue system, one must inevitably tackle the challenge of a long tail of relatively infrequently asked questions. To make progress on this challenge, we investigate the use of paraphrasing for data augmentation and neural memory-based classification, finding that the two methods work best in combination. In particular, we find that the neural memory-based approach not only outperforms a straight CNN classifier on low-frequency questions, but also takes better advantage of the augmented data created by paraphrasing. Together, the two methods yield a nearly 10% absolute improvement in accuracy on the least frequently asked questions.
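The combination described in the abstract can be illustrated with a toy sketch. This is not the authors' system: the synonym table, the question labels, and the function names are all hypothetical, and a bag-of-words nearest-neighbour lookup stands in for the neural memory-based classifier. It only shows the general idea of generating paraphrases for rarely seen question classes and then classifying by similarity to stored examples.

```python
import math
from collections import Counter

# Hypothetical synonym table; the paraphrase resources used in the paper
# are not described on this page.
SYNONYMS = {"pain": ["ache"], "medicine": ["medication"]}


def paraphrase(question):
    """Return variants of `question` with one word swapped for a synonym."""
    words = question.lower().split()
    variants = []
    for i, word in enumerate(words):
        for synonym in SYNONYMS.get(word, []):
            variants.append(" ".join(words[:i] + [synonym] + words[i + 1:]))
    return variants


def augment(examples, max_count=2):
    """Add paraphrases only for labels with at most `max_count` examples."""
    counts = Counter(label for _, label in examples)
    extra = [
        (variant, label)
        for question, label in examples
        if counts[label] <= max_count
        for variant in paraphrase(question)
    ]
    return examples + extra


def bow(text):
    """Bag-of-words vector as a token -> count mapping."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(question, memory):
    """Return the label of the stored example most similar to `question`.

    A simple stand-in for a learned memory lookup, not a neural model.
    """
    query = bow(question)
    return max(memory, key=lambda example: cosine(query, bow(example[0])))[1]


# Hypothetical training questions and labels.
train = [
    ("what brings you in today", "chief_complaint"),
    ("do you take any medicine", "medications"),
    ("where is the pain", "pain_location"),
]
memory = augment(train)
```

In this toy setup, `classify("medication", memory)` finds the paraphrased example "do you take any medication", whereas the unaugmented `train` list contains no example that shares a word with the query.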

Cite

Jin, L., King, D., Hussein, A., White, M., & Danforth, D. (2018). Using paraphrasing and memory-augmented models to combat data sparsity in question interpretation with a virtual patient dialogue system. In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2018 at the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018 (pp. 13–23). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w18-0502
