MTL-SLT: Multi-Task Learning for Spoken Language Tasks

Abstract

Language understanding in speech-based systems has attracted extensive interest from both academic and industrial communities in recent years, driven by the growing demand for voice-based applications. Prior work either treats automatic speech recognition (ASR) and natural language processing (NLP) as independent research problems, or jointly models speech and NLP for a single dataset or a single NLP task. To facilitate the development of spoken language research, we introduce MTL-SLT, a multi-task learning framework for spoken language tasks. MTL-SLT takes speech as input and outputs transcriptions, intents, named entities, summaries, and answers to text queries, supporting spoken language understanding, spoken summarization, and spoken question answering. The proposed framework benefits from three key aspects: 1) pre-trained sub-networks of an ASR model and a language model; 2) a multi-task learning objective that exploits shared knowledge across tasks; 3) end-to-end training of ASR and downstream NLP tasks based on a sequence loss. We obtain state-of-the-art results on spoken language understanding tasks such as SLURP and ATIS. Spoken summarization results are reported on a new dataset: Spoken-Gigaword.
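
The abstract describes the framework only at a high level. As an illustration of the kind of multi-task objective it refers to, the sketch below combines a shared, pre-trained ASR encoder and language model with task-specific heads and a weighted sum of per-task losses. The module names, head layout, and loss weights are illustrative assumptions based on the abstract, not the authors' implementation.

```python
# Hypothetical sketch of a weighted multi-task objective, loosely following the
# abstract: shared pre-trained ASR and LM sub-networks feed task-specific heads,
# and the per-task losses are summed with weights. All names and shapes here are
# assumptions for illustration.
import torch
import torch.nn as nn


class MTLSpokenLanguageModel(nn.Module):
    def __init__(self, asr_encoder: nn.Module, language_model: nn.Module,
                 hidden_dim: int, num_intents: int, num_entity_tags: int,
                 vocab_size: int):
        super().__init__()
        self.asr_encoder = asr_encoder        # pre-trained ASR sub-network (assumed interface)
        self.language_model = language_model  # pre-trained LM sub-network (assumed interface)
        # Task-specific heads: intent classification, entity tagging, and
        # token generation (transcription / summarization / answer decoding).
        self.intent_head = nn.Linear(hidden_dim, num_intents)
        self.entity_head = nn.Linear(hidden_dim, num_entity_tags)
        self.generation_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, speech_features: torch.Tensor) -> dict:
        acoustic = self.asr_encoder(speech_features)    # (batch, time, hidden)
        contextual = self.language_model(acoustic)      # shared contextual representation
        return {
            "intent_logits": self.intent_head(contextual.mean(dim=1)),
            "entity_logits": self.entity_head(contextual),
            "token_logits": self.generation_head(contextual),
        }


def multitask_loss(outputs: dict, targets: dict, weights: dict) -> torch.Tensor:
    """Weighted sum of per-task losses; the weighting scheme is an assumption."""
    ce = nn.CrossEntropyLoss()
    loss = weights["intent"] * ce(outputs["intent_logits"], targets["intent"])
    loss = loss + weights["entity"] * ce(
        outputs["entity_logits"].flatten(0, 1), targets["entity_tags"].flatten())
    loss = loss + weights["asr"] * ce(
        outputs["token_logits"].flatten(0, 1), targets["transcript_tokens"].flatten())
    return loss
```

In such a setup, a training step would compute the combined loss on batches drawn from the different task datasets and back-propagate through both the heads and the shared sub-networks, which is the end-to-end property the abstract points to.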

Citation (APA)

Huang, Z., Rao, M., Raju, A., Zhang, Z., Bui, B., & Lee, C. (2022). MTL-SLT: Multi-Task Learning for Spoken Language Tasks. In Proceedings of the 4th Workshop on NLP for Conversational AI (pp. 120–130). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.nlp4convai-1.11
