Extending the classifier algorithms in machine learning to improve the performance in spoken language understanding systems under deficient training data

Sheetal Jagdale; Milind Shah

Journal Article (Open Access)

Advances in Science, Technology and Engineering Systems (2020) 5(6) 464-471

DOI: 10.25046/aj050655

2 Citations

5 Readers

Abstract

One of the open-domain challenges for a Spoken Dialogue System (SDS) is to maintain a natural conversation in rarely visited domains, i.e. domains with little data. Spoken Language Understanding (SLU) is the SDS component that converts a user utterance into a semantic form that a computer can understand. Scaled down to SLU, the open-domain challenge means that SLU should convert a user utterance into semantic form even when little data is available for the most rarely visited domain. The SLU systems reported in the literature incorporate classifiers for identifying the domain of the user utterance, understanding the intent of the user, and filling slot-value pairs. Thus, to address open-domain challenges, the classifiers in SLU must be robust to scarce training data. This paper presents investigations to improve the performance of SLU in converting user utterances into semantic form even when little training data is available. Eleven machine learning classification algorithms were evaluated under deficient data. The evaluation metrics used are accuracy, F-score, and inter cross-entropy (ICE). Comprehensive experiments were carried out on two publicly available datasets, DSTC2 and DSTC3. The accuracies for Support Vector Machine (SVM), Stochastic Gradient Descent (SGD), and Decision Tree are 0.940, 0.960, and 0.955 for the DSTC2 dataset and 0.916, 0.900, and 0.901 for the DSTC3 dataset, respectively. The F-scores for SVM, SGD, and Decision Tree are 0.855, 0.868, and 0.849 for the DSTC2 dataset and 0.725, 0.715, and 0.700 for the DSTC3 dataset, respectively. The ICE values for SVM and SGD are 1.191 and 1.100 for the DSTC2 dataset and 3.180 and 2.999 for the DSTC3 dataset, respectively. SLU based on SVM and SGD performed best among all the classifiers tested. The worst performance on all three evaluation metrics was shown by SLU incorporating the Automatic Relevance Determination (ARD) and Relevance Vector Machine (RVC) classifiers.
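As a rough illustration of the three kinds of metric named in the abstract, the sketch below computes accuracy, a macro-averaged F-score, and a mean cross-entropy for a handful of predicted dialogue-act labels. It is not the authors' evaluation code: the abstract does not give the exact definitions used (in particular how ICE is computed or how the F-score is averaged), so the formulas here are generic textbook versions, and the labels and probabilities are hypothetical toy values rather than DSTC2 or DSTC3 data.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of utterances whose predicted label matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f_score(y_true, y_pred):
    """Unweighted mean of per-label F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_cross_entropy(y_true, y_prob, eps=1e-12):
    """Average negative log-probability assigned to the reference label."""
    total = 0.0
    for label, dist in zip(y_true, y_prob):
        # Clamp to eps so a zero probability does not raise a math domain error.
        total += -math.log(max(dist.get(label, 0.0), eps))
    return total / len(y_true)


# Hypothetical toy predictions for four utterances (not taken from the paper).
y_true = ["inform", "request", "inform", "bye"]
y_pred = ["inform", "inform", "inform", "bye"]
y_prob = [
    {"inform": 0.8, "request": 0.1, "bye": 0.1},
    {"inform": 0.6, "request": 0.3, "bye": 0.1},
    {"inform": 0.7, "request": 0.2, "bye": 0.1},
    {"inform": 0.1, "request": 0.1, "bye": 0.8},
]

acc = accuracy(y_true, y_pred)
f_score = macro_f_score(y_true, y_pred)
cross_entropy = mean_cross_entropy(y_true, y_prob)
print(f"accuracy={acc:.3f} f_score={f_score:.3f} cross_entropy={cross_entropy:.3f}")
```

Higher is better for accuracy and F-score, while lower is better for a cross-entropy measure, which is consistent with the abstract reporting SGD as having both the highest accuracy and the lowest ICE on DSTC2.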

Cite

Jagdale, S., & Shah, M. (2020). Extending the classifier algorithms in machine learning to improve the performance in spoken language understanding systems under deficient training data. Advances in Science, Technology and Engineering Systems, 5(6), 464–471. https://doi.org/10.25046/aj050655
