Morphological verb-aware Tibetan language model

This article is free to access.

Abstract

The Tibetan language model (TLM) is central to Tibetan natural language processing. In this paper, we first observe that, unlike widely used languages, Tibetan contains many morphological verbs that rarely appear in natural sentences but play a key role in accurate text prediction. Existing methods usually ignore this property, which makes traditional training strategies less effective for building accurate and robust TLMs. Because morphological verbs influence the tense and semantics of a sentence, a TLM needs to account for them explicitly. Hence, we propose a morphological verb-aware TLM that combines offline learning via a character frequency reweighting strategy with online tuning of discriminative weights conditioned on morphological verbs. Compared with state-of-the-art methods, our method reduces both the perplexity and the character error on text prediction and automatic speech recognition (ASR) tasks.
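The abstract does not give the reweighting formula, so the following is only a minimal sketch of one common way to up-weight rare symbols in a language-model loss, and of how perplexity is computed. Inverse-frequency weighting is an assumption made here for illustration and is not claimed to be the authors' strategy. The function names (`inverse_frequency_weights`, `weighted_nll`, `perplexity`), the `smoothing` parameter, and the toy ASCII corpus are all hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(corpus, smoothing=1.0):
    """Assumed scheme: weight each character by 1 / (count + smoothing).

    Weights are rescaled so that their count-weighted mean over the
    corpus equals 1, which keeps the overall loss scale unchanged.
    """
    counts = Counter(ch for line in corpus for ch in line)
    total = sum(counts.values())
    raw = {ch: 1.0 / (n + smoothing) for ch, n in counts.items()}
    mean = sum(raw[ch] * n for ch, n in counts.items()) / total
    return {ch: w / mean for ch, w in raw.items()}


def weighted_nll(dists, targets, weights):
    """Weighted average negative log-likelihood of the target characters.

    dists   : list of dicts mapping character -> predicted probability
    targets : the reference characters, one per position
    weights : per-character weights; unseen characters default to 1.0
    """
    loss = 0.0
    weight_sum = 0.0
    for dist, target in zip(dists, targets):
        w = weights.get(target, 1.0)
        loss += -w * math.log(dist[target])
        weight_sum += w
    return loss / weight_sum


def perplexity(dists, targets):
    """Standard (unweighted) perplexity: exp of the mean negative log-likelihood."""
    nll = -sum(math.log(d[t]) for d, t in zip(dists, targets)) / len(targets)
    return math.exp(nll)


# Toy corpus with ASCII placeholders standing in for Tibetan characters.
toy_corpus = ["aaab"]
toy_weights = inverse_frequency_weights(toy_corpus)

# A model that predicts the frequent symbol well and the rare one poorly.
toy_dists = [{"a": 0.9, "b": 0.1}] * 4
toy_targets = "aaab"
plain_loss = weighted_nll(toy_dists, toy_targets, {})
reweighted_loss = weighted_nll(toy_dists, toy_targets, toy_weights)
```

In this toy setting the rare symbol `b` receives a larger weight than `a`, so the reweighted loss penalizes the poor prediction on `b` more heavily than the plain loss does. That is the general motivation for frequency reweighting when rare items matter for prediction.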

Citation

Khysru, K., Jin, D., & Dang, J. (2019). Morphological verb-aware Tibetan language model. IEEE Access, 7, 72896–72904. https://doi.org/10.1109/ACCESS.2019.2919328
