Auto-dialabel: Labeling dialogue data with unsupervised learning

Abstract

The lack of labeled data is one of the main challenges in building a task-oriented dialogue system. Existing dialogue datasets usually rely on human labeling, which is expensive and yields datasets that are limited in size and coverage. In this paper, we instead propose a framework, auto-dialabel, that automatically clusters dialogue intents and slots. In this framework, we collect a set of context features, leverage an autoencoder for feature assembly, and adapt a dynamic hierarchical clustering method for intent and slot labeling. Experimental results show that our framework can greatly reduce human labeling cost, achieve good intent clustering accuracy (84.1%), and provide reasonable and instructive slot labeling results.
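
The clustering step mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the context features, the autoencoder, or the details of the dynamic hierarchical clustering, so everything below is an assumption made for illustration: bag-of-words vectors stand in for the autoencoder-assembled features, and a plain average-linkage agglomerative procedure with a distance threshold stands in for the dynamic method, so that the number of intent clusters is not fixed in advance. The names `bow_vector`, `cosine_distance`, `agglomerative_clusters`, and the threshold value are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import sqrt


def bow_vector(utterance, vocab):
    """Bag-of-words counts over a fixed vocabulary.

    Assumption: a stand-in for the learned features described in the abstract.
    """
    counts = Counter(utterance.lower().split())
    return [float(counts[w]) for w in vocab]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def agglomerative_clusters(vectors, threshold):
    """Average-linkage agglomerative clustering.

    Merging stops once the closest pair of clusters is farther apart than
    `threshold`, so the number of clusters is decided by the data.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy utterances (invented for this sketch, not from the paper's data).
utterances = [
    "book a flight to boston",
    "book a flight to denver",
    "play some jazz music",
    "play some rock music",
]
vocab = sorted({w for u in utterances for w in u.lower().split()})
vectors = [bow_vector(u, vocab) for u in utterances]
clusters = agglomerative_clusters(vectors, threshold=0.5)

# Convert cluster membership into one intent label per utterance.
labels = [0] * len(utterances)
for cluster_id, members in enumerate(clusters):
    for i in members:
        labels[i] = cluster_id
```

In this toy setting the two flight-booking utterances end up sharing one label and the two music utterances another; a human annotator would then only need to name each cluster rather than label every utterance, which is the kind of cost saving the abstract describes.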

Citation

Shi, C., Chen, Q., Sha, L., Li, S., Sun, X., Wang, H., & Zhang, L. (2018). Auto-dialabel: Labeling dialogue data with unsupervised learning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 (pp. 684–689). Association for Computational Linguistics. https://doi.org/10.18653/v1/d18-1072
