Enabling spoken dialogue systems for low-resourced languages—End-to-end dialect recognition for North Sami

Abstract

In this paper, we tackle the challenge of identifying dialects with deep learning for under-resourced languages. Recent advances in spoken dialogue technology have been strongly influenced by the availability of large corpora, whereas our goal is to build a spoken interactive application for North Sami, which is classified as one of the less-resourced languages spoken in Northern Europe. North Sami has several varieties and dialects, which are influenced by the majority languages of the areas in which it is spoken: Finnish and Norwegian. To provide reliable and accurate speech components for an interactive system, it is important to recognize whether a speaker has a Finnish or a Norwegian accent. Conventional approaches compute universal statistical models, which require a large amount of data to form reliable statistics, and they are therefore vulnerable when data are scarce and only a limited number of utterances and speakers is available. In this paper, we discuss dialect and accent recognition in an under-resourced context and focus on training an attentive network that leverages unlabeled data in a semi-supervised scenario for robust feature learning. We validate our approach on two DigiSami datasets: a conversational corpus and a read-speech corpus.

Cite

Trong, T. N., Jokinen, K., & Hautamäki, V. (2019). Enabling spoken dialogue systems for low-resourced languages—End-to-end dialect recognition for north sami. In Lecture Notes in Electrical Engineering (Vol. 579, pp. 221–235). Springer. https://doi.org/10.1007/978-981-13-9443-0_19
