Deep learning architecture for part-of-speech tagging with word and suffix embeddings


Abstract

This paper presents a recurrent neural network (RNN) for part-of-speech (POS) tagging. The RNN variant used is a Bidirectional Long Short-Term Memory architecture, which addresses two crucial problems: the vanishing gradients phenomenon, which is architecture-specific, and the dependence of POS labels on sequential information both preceding and following them, which is task-specific. The approach is attractive compared to other machine learning approaches in that it does not require hand-crafted features or purpose-built resources such as a morphological dictionary. The study presents preliminary results on the BulTreeBank corpus, with a tagset of 153 labels. One of its main contributions is the training of distributed word representations (word embeddings) on a large corpus of Bulgarian text. Another is complementing the word embedding input vectors with distributed morphological representations (suffix embeddings), which are shown to significantly improve the accuracy of the system.
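The idea of complementing word embeddings with suffix embeddings can be illustrated with a minimal sketch of how one input vector per token might be assembled before it reaches the bidirectional network. The abstract does not state how the two representations are combined, so concatenation is an assumption here, as are the suffix length, the vector sizes, the zero vector for unknown entries, and all names (`make_table`, `suffix_of`, `build_input`). The random tables are only stand-ins for trained embeddings.

```python
import random

WORD_DIM = 4     # assumed toy size, not taken from the paper
SUFFIX_DIM = 2   # assumed toy size, not taken from the paper
SUFFIX_LEN = 3   # assumed suffix length, not taken from the paper


def make_table(keys, dim, seed):
    """Build a toy embedding table of random vectors (stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {key: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for key in keys}


def suffix_of(word, length=SUFFIX_LEN):
    """Return the lowercased last `length` characters (the whole word if it is shorter)."""
    return word[-length:].lower()


def build_input(tokens, word_table, suffix_table):
    """Concatenate each token's word vector with its suffix vector.

    Unknown words or suffixes fall back to a zero vector (an assumption).
    """
    unk_word = [0.0] * WORD_DIM
    unk_suffix = [0.0] * SUFFIX_DIM
    vectors = []
    for token in tokens:
        word_vec = word_table.get(token.lower(), unk_word)
        suffix_vec = suffix_table.get(suffix_of(token), unk_suffix)
        vectors.append(word_vec + suffix_vec)  # list concatenation
    return vectors
```

In this sketch a token that is missing from the word table still receives a non-zero suffix component when its ending is known, which is one plausible reading of why morphological information helps with rare or unseen word forms. Each resulting vector has `WORD_DIM + SUFFIX_DIM` entries and would serve as one time step of the sequence fed to the tagger.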

Cite

Popov, A. (2016). Deep learning architecture for part-of-speech tagging with word and suffix embeddings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9883 LNAI, pp. 68–77). Springer Verlag. https://doi.org/10.1007/978-3-319-44748-3_7
