Exploiting the Tibetan Radicals in Recurrent Neural Network for Low-Resource Language Models


Abstract

By virtue of its strength in handling sequence data and preserving long-distance information, the recurrent neural network language model (RNNLM) has prevailed in a range of tasks in recent years. However, large quantities of data are required for language modelling with good performance, which makes modeling low-resource languages difficult. To address this issue, Tibetan, a minority language, is taken as an instance, and its radicals (the components of Tibetan characters) are explored for constructing a language model. Motivated by the inherent structure of Tibetan, a novel construction of Tibetan character embeddings is introduced into the RNNLM. The fusion of the individual radical embeddings is realized in three ways: uniform weights (TRU), different weights (TRD), and radical combination (TRC). This structure, especially when combined with the radicals, extends the capability to capture long-term context dependencies and alleviates the low-resource problem to some extent. The experimental results show that the proposed structure outperforms the standard RNNLM, yielding 7.4%, 12.7% and 13.5% relative perplexity reductions with TRU, TRD and TRC respectively.
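The three fusion schemes named in the abstract can be sketched as follows. This is a minimal NumPy illustration only: the function names, the embedding dimensions, and the exact forms chosen here (averaging for TRU, a normalized weighted sum for TRD, and concatenation followed by a linear projection for TRC) are assumptions, since the abstract does not specify the precise formulations used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 8      # per-radical embedding size (hypothetical)
N_RADICALS = 3   # radicals composing one Tibetan character (varies in practice)

# Hypothetical embeddings of the radicals making up one character
radical_embs = rng.standard_normal((N_RADICALS, EMB_DIM))

def tru(radicals):
    """TRU: uniform weights -- a simple average of the radical embeddings."""
    return radicals.mean(axis=0)

def trd(radicals, weights):
    """TRD: different weights -- a weighted sum; here random weights
    stand in for weights that would be learned during training."""
    w = weights / weights.sum()            # normalize so weights sum to 1
    return (w[:, None] * radicals).sum(axis=0)

def trc(radicals, proj):
    """TRC: radical combination -- concatenate the radical embeddings,
    then project back down to the character embedding size."""
    return radicals.reshape(-1) @ proj     # (N*D,) @ (N*D, D) -> (D,)

weights = rng.random(N_RADICALS)
proj = rng.standard_normal((N_RADICALS * EMB_DIM, EMB_DIM))

char_emb_u = tru(radical_embs)
char_emb_d = trd(radical_embs, weights)
char_emb_c = trc(radical_embs, proj)

# All three schemes yield a character embedding of the same size,
# which can then be fed to the RNNLM input layer in place of a
# whole-character embedding.
assert char_emb_u.shape == char_emb_d.shape == char_emb_c.shape == (EMB_DIM,)
```

In a trained model, the TRD weights and the TRC projection would be parameters learned jointly with the rest of the RNNLM rather than fixed random values as shown here.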

CITATION STYLE

APA

Shen, T., Wang, L., Chen, X., Khysru, K., & Dang, J. (2017). Exploiting the Tibetan Radicals in Recurrent Neural Network for Low-Resource Language Models. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10635 LNCS, pp. 266–275). Springer Verlag. https://doi.org/10.1007/978-3-319-70096-0_28
